docs/lineage/prefect.md
+90-4
@@ -8,13 +8,13 @@ DataHub supports integration of
8
8
9
9
## What is Prefect Datahub Block?
10
10
11
-
Blocks are primitive within Prefect that enable the storage of configuration and provide an interface for interacting with external systems. We integrated [prefect-datahub](https://prefecthq.github.io/prefect-datahub/) block which use [Datahub Rest](../../metadata-ingestion/sink_docs/datahub.md#datahub-rest) emitter to emit metadata events while running prefect flow.
11
+
Blocks are a primitive within Prefect that enable the storage of configuration and provide an interface for interacting with external systems. We integrated the `prefect-datahub` block, which uses the [Datahub Rest](../../metadata-ingestion/sink_docs/datahub.md#datahub-rest) emitter to emit metadata events while running a Prefect flow.
12
12
13
13
## Prerequisites to use Prefect Datahub Block
14
14
15
15
1. You need to use either Prefect Cloud (recommended) or the self-hosted Prefect server.
16
-
2. Refer [Cloud Quickstart](https://docs.prefect.io/2.10.13/cloud/cloud-quickstart/) to setup Prefect Cloud.
4. Make sure the Prefect API URL is set correctly. You can check it by running the command below:
19
19
```shell
20
20
prefect profile inspect
@@ -24,7 +24,93 @@ prefect profile inspect
24
24
25
25
## Setup
26
26
27
-
For setup details please refer [prefect-datahub](https://prefecthq.github.io/prefect-datahub/).
27
+
### Installation
28
+
29
+
Install `prefect-datahub` with `pip`:
30
+
31
+
```shell
32
+
pip install 'prefect-datahub'
33
+
```
34
+
35
+
Requires an installation of Python 3.7+.
36
+
37
+
### Saving configurations to a block
38
+
39
+
This is a one-time activity in which you save the configuration to the [Prefect block document store](https://docs.prefect.io/latest/concepts/blocks/#saving-blocks).
40
+
While saving, you can provide the configurations below. Any value you do not provide when saving the configuration to the block is set to its default.
env | `str` | *PROD* | The environment that all assets produced by this orchestrator belong to. For more details and possible values, see the [FabricType enum reference](https://datahubproject.io/docs/graphql/enums/#fabrictype).
46
+
platform_instance | `str` | *None* | The instance of the platform that all assets produced by this recipe belong to. For more details, see [platform instances](https://datahubproject.io/docs/platform-instances/).
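
The default-filling behavior described above can be sketched in plain Python. This is a minimal illustration, not the block's actual implementation: `BlockConfigSketch` is a hypothetical stand-in, and it models only the two settings whose defaults are listed in the rows above (`env` and `platform_instance`).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlockConfigSketch:
    """Hypothetical stand-in for the block's settings (not the real class)."""

    env: str = "PROD"  # default listed in the table above
    platform_instance: Optional[str] = None  # default listed in the table above


# Nothing provided: both fields fall back to their defaults.
defaults = BlockConfigSketch()

# Only platform_instance provided: env still falls back to "PROD".
partial = BlockConfigSketch(platform_instance="local_prefect")

print(defaults)
print(partial)
```

The same idea applies when saving the real block: omit a setting to accept its default, and pass it explicitly only when the default does not fit your deployment.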
47
+
48
+
```python
49
+
from prefect_datahub.datahub_emitter import DatahubEmitter
50
+
DatahubEmitter(
51
+
datahub_rest_url="http://localhost:8080",
52
+
env="PROD",
53
+
platform_instance="local_prefect"
54
+
).save("BLOCK-NAME-PLACEHOLDER")
55
+
```
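
Because the block is saved once and then reused, it can help to sanity-check the DataHub REST URL before saving it. The helper below is a sketch that uses only the standard library. `is_plausible_rest_url` is a hypothetical name, not part of `prefect-datahub`, and it checks only the shape of the URL, not whether a DataHub server is reachable at it.

```python
from urllib.parse import urlparse


def is_plausible_rest_url(url: str) -> bool:
    """Return True if url has an http(s) scheme and a host."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.hostname)


# A full URL with scheme and host passes the check.
print(is_plausible_rest_url("http://localhost:8080"))

# A bare host:port without a scheme does not.
print(is_plausible_rest_url("localhost:8080"))
```

A check like this could run just before the `.save(...)` call so that a malformed URL is caught at save time rather than during a flow run.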
56
+
57
+
Congrats! You can now load the saved block to use your configurations in your Flow code:
58
+
59
+
```python
60
+
from prefect_datahub.datahub_emitter import DatahubEmitter
61
+
DatahubEmitter.load("BLOCK-NAME-PLACEHOLDER")
62
+
```
63
+
64
+
!!! info "Registering blocks"
65
+
66
+
Register blocks in this module to
67
+
[view and edit them](https://docs.prefect.io/ui/blocks/)
68
+
on Prefect Cloud:
69
+
70
+
```bash
71
+
prefect block register -m prefect_datahub
72
+
```
73
+
74
+
### Load the saved block in prefect workflows
75
+
76
+
After installing `prefect-datahub` and [saving the configuration](#saving-configurations-to-a-block), you can use it within your Prefect workflows to emit metadata events, as shown below.
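
One ordering rule is worth keeping in mind when reading the workflow example in this section: per the note that follows it, tasks are emitted only when the flow itself is emitted. The plain-Python sketch below models that rule. `EmitterSketch` and its `add_task` and `emit_flow` methods are hypothetical stand-ins chosen for illustration; they do not call DataHub and are not the `prefect-datahub` implementation.

```python
from typing import List


class EmitterSketch:
    """Hypothetical stand-in that buffers tasks until the flow is emitted."""

    def __init__(self) -> None:
        self.pending: List[str] = []  # tasks recorded but not yet emitted
        self.emitted: List[str] = []  # events that were actually sent

    def add_task(self, name: str) -> None:
        # Recording a task only buffers it; nothing is sent yet.
        self.pending.append(name)

    def emit_flow(self, flow_name: str) -> None:
        # Emitting the flow sends the flow event, then flushes buffered tasks.
        self.emitted.append(f"flow:{flow_name}")
        self.emitted.extend(f"task:{name}" for name in self.pending)
        self.pending.clear()


emitter = EmitterSketch()
emitter.add_task("transform")
events_before = list(emitter.emitted)  # still empty: the flow was not emitted

emitter.emit_flow("etl")
events_after = list(emitter.emitted)

print(events_before)
print(events_after)
```

In this model, a workflow that records tasks but never emits the flow sends nothing at all, which is the situation the note warns about.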
77
+
78
+
```python
79
+
from prefect import flow, task
80
+
from prefect_datahub.dataset import Dataset
81
+
from prefect_datahub.datahub_emitter import DatahubEmitter
**Note**: To emit tasks, you must also emit the flow. Otherwise, nothing will be emitted.
101
+
102
+
## Concept mapping
103
+
104
+
Prefect concepts are documented [here](https://docs.prefect.io/latest/concepts/), and DataHub concepts are documented [here](https://datahubproject.io/docs/what-is-datahub/datahub-concepts).