docs/advanced/patch.md
+78 -32
@@ -1,69 +1,120 @@
1
1
import Tabs from '@theme/Tabs';
2
2
import TabItem from '@theme/TabItem';
3
3
4
-
# But First, Semantics: Upsert versus Patch
4
+
# Emitting Patch Updates to DataHub
5
5
6
6
## Why Would You Use Patch
7
7
8
-
By default, most of the SDK tutorials and API-s involve applying full upserts at the aspect level. This means that typically, when you want to change one field within an aspect without modifying others, you need to do a read-modify-write to not overwrite existing fields.
9
-
To support these scenarios, DataHub supports PATCH based operations so that targeted changes to single fields or values within arrays of fields are possible without impacting other existing metadata.
8
+
By default, most of the SDK tutorials and APIs involve applying full upserts at the aspect level, i.e. replacing the aspect entirely.
9
+
This means that when you want to change even a single field within an aspect without modifying others, you need to do a read-modify-write to avoid overwriting existing fields.
10
+
To support these scenarios, DataHub supports `PATCH` operations, which make targeted changes to individual fields, or to values within arrays of fields, without impacting other existing metadata.
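The difference between the three approaches can be sketched with plain dictionaries. This is only an illustration of the semantics, not DataHub SDK code: the aspect contents and the helper names `upsert` and `patch_field` are assumptions made up for the example.

```python
import copy

# Hypothetical stored aspect; the field names are illustrative only.
stored = {"description": "Orders table", "customProperties": {"team": "sales"}}


def upsert(current, new_aspect):
    """Full upsert: the new aspect replaces the stored one entirely."""
    return copy.deepcopy(new_aspect)


def patch_field(current, field, value):
    """Targeted change: only the named top-level field is touched."""
    updated = copy.deepcopy(current)
    updated[field] = value
    return updated


# A naive upsert that sends only the changed field drops everything else.
naive = upsert(stored, {"description": "Orders fact table"})

# Read-modify-write keeps the other fields, at the cost of reading first.
read_copy = copy.deepcopy(stored)
read_copy["description"] = "Orders fact table"
rmw = upsert(stored, read_copy)

# With a patch, the client sends only the change and the server applies
# it to whatever is currently stored.
patched = patch_field(stored, "description", "Orders fact table")

print(naive)
print(rmw == patched)
```

The naive upsert loses `customProperties`, while read-modify-write and the patch both preserve it. Unlike read-modify-write, the patch needs no prior read.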
10
11
11
12
:::note
12
13
13
-
Currently, PATCH support is only available for a selected set of aspects, so before pinning your hopes on using PATCH as a way to make modifications to aspect values, confirm whether your aspect supports PATCH semantics. The complete list of Aspects that are supported are maintained [here](https://github.com/datahub-project/datahub/blob/9588440549f3d99965085e97b214a7dabc181ed2/entity-registry/src/main/java/com/linkedin/metadata/models/registry/template/AspectTemplateEngine.java#L24). In the near future, we do have plans to automatically support PATCH semantics for aspects by default.
14
+
Currently, PATCH support is only available for a selected set of aspects, so before relying on PATCH to modify aspect values, confirm that your aspect supports PATCH semantics. The complete list of supported aspects is maintained [here](https://github.com/datahub-project/datahub/blob/9588440549f3d99965085e97b214a7dabc181ed2/entity-registry/src/main/java/com/linkedin/metadata/models/registry/template/AspectTemplateEngine.java#L24).
14
15
15
16
:::
16
17
17
-
## How To Use Patch
18
+
## How To Use Patches
18
19
19
-
Examples for using Patch are sprinkled throughout the API guides.
20
20
Here's how to find the appropriate classes for the language of your choice.
21
21
22
-
23
22
<Tabs>
24
-
<TabItem value="Java" label="Java SDK">
23
+
<TabItem value="Python" label="Python SDK" default>
25
24
26
-
The Java Patch builders are aspect-oriented and located in the [datahub-client](https://github.com/datahub-project/datahub/tree/master/metadata-integration/java/datahub-client/src/main/java/datahub/client/patch) module under the `datahub.client.patch` namespace.
25
+
The Python Patch builders are entity-oriented and located in the [metadata-ingestion](https://github.com/datahub-project/datahub/tree/9588440549f3d99965085e97b214a7dabc181ed2/metadata-ingestion/src/datahub/specific) module, under the `datahub.specific` package.
26
+
Patch builder helper classes exist for
27
27
28
-
Here are a few illustrative examples using the Java Patch builders:
### Add & Remove Structured Properties for Dataset
51
73
52
-
The Python Patch builders are entity-oriented and located in the [metadata-ingestion](https://github.com/datahub-project/datahub/tree/9588440549f3d99965085e97b214a7dabc181ed2/metadata-ingestion/src/datahub/specific) module and located in the `datahub.specific` module.
74
+
To add & remove structured properties for a dataset:
53
75
54
-
Here are a few illustrative examples using the Python Patch builders:
The Java Patch builders are aspect-oriented and located in the [datahub-client](https://github.com/datahub-project/datahub/tree/master/metadata-integration/java/datahub-client/src/main/java/datahub/client/patch) module under the `datahub.client.patch` namespace.
To understand how patching works, it's important to understand a bit about our [models](../what/aspect.md). Entities are composed of Aspects
69
120
which can be reasoned about as JSON representations of the object models. To be able to patch these we utilize [JsonPatch](https://jsonpatch.com/). The components of a JSON Patch are the path, operation, and value.
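As a rough sketch of those three components, a JSON Patch document (RFC 6902) is a list of operation objects. The helper `make_op` and the example paths below are hypothetical. The sketch is restricted to `add` and `remove`, although RFC 6902 defines further operations.

```python
import json


def make_op(op, path, value=None):
    """Build one JSON Patch operation object (RFC 6902 shape)."""
    if op not in {"add", "remove"}:
        raise ValueError(f"unsupported operation in this sketch: {op}")
    operation = {"op": op, "path": path}
    # A remove operation carries no value; an add operation does.
    if op == "add":
        operation["value"] = value
    return operation


# Hypothetical paths into an aspect; real paths depend on the aspect schema.
patch = [
    make_op("add", "/description", "Orders fact table"),
    make_op("remove", "/customProperties/team"),
]

print(json.dumps(patch, indent=2))
```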
@@ -73,9 +124,6 @@ which can be reasoned about as JSON representations of the object models. To be
73
124
The JSON path refers to a value within the schema. This can be a single field or can be an entire object reference depending on what the path is.
74
125
For our patches we are primarily targeting single fields or even single array elements within a field. To be able to target array elements by id, we go through a translation process
75
126
of the schema to transform arrays into maps. This allows a path to reference a particular array element by key rather than by index, for example a specific tag urn being added to a dataset.
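The array-to-map translation can be illustrated as follows. This is a minimal sketch under assumptions, not the actual server-side implementation: the helper names, the `tags` field, and the `tag` key field are chosen for illustration. The escaping follows the JSON Pointer rules (RFC 6901), where `~` becomes `~0` and `/` becomes `~1` inside a path segment.

```python
def array_to_map(items, key_field):
    """Re-key a list of objects by one of their fields."""
    keyed = {}
    for item in items:
        # Objects sharing the same key collapse into a single entry.
        keyed[item[key_field]] = item
    return keyed


def escape_pointer_token(token):
    """Escape a key for use as one segment of a JSON Pointer path."""
    return token.replace("~", "~0").replace("/", "~1")


# Illustrative array of tag associations on a dataset.
tags = [{"tag": "urn:li:tag:pii"}, {"tag": "urn:li:tag:deprecated"}]

keyed_tags = array_to_map(tags, "tag")

# The element is now addressable by its urn rather than by its index.
path = "/tags/" + escape_pointer_token("urn:li:tag:pii")
print(path)
```

Addressing by key means a patch does not need to know where in the array an element currently sits.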
76
-
This is important to note that for some fields in our schema that are arrays which do not necessarily restrict uniqueness, this puts a uniqueness constraint on the key.
77
-
The key for objects stored in arrays is determined manually by examining the schema and a long term goal is to make these keys annotation driven to reduce the amount of code needed to support
78
-
additional aspects to be patched. There is a generic patch endpoint, but it requires any array field keys to be specified at request time, putting a lot of burden on the API user.
79
127
80
128
#### Examples
81
129
@@ -87,8 +135,7 @@ Breakdown:
87
135
* `/upstreams` -> References the upstreams field of the UpstreamLineage aspect; this is an array of Upstream objects where the key is the Urn
88
136
* `/urn:...` -> The dataset to be targeted by the operation
89
137
90
-
91
-
A patch path for targeting a fine grained lineage upstream:
138
+
A patch path for targeting a fine-grained lineage upstream:
@@ -118,7 +165,6 @@ using adds, but generally the most useful use case for patch is to add elements
118
165
119
166
Remove operations require the specified path to be present; otherwise an error is thrown. Apart from that, they operate as one would expect: the specified path is removed from the aspect.
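The remove behavior can be sketched on a map-translated aspect held as nested dictionaries. The function `remove_path`, the choice of `KeyError`, and the sample urns are assumptions for illustration. The error DataHub actually raises is not specified here.

```python
def unescape_pointer_token(token):
    """Reverse JSON Pointer escaping (RFC 6901): ~1 first, then ~0."""
    return token.replace("~1", "/").replace("~0", "~")


def remove_path(aspect, path):
    """Delete the value at `path`, raising KeyError if it is not present."""
    keys = [unescape_pointer_token(t) for t in path.split("/")[1:]]
    parent = aspect
    for key in keys[:-1]:
        if key not in parent:
            raise KeyError(f"path not present: {path}")
        parent = parent[key]
    if keys[-1] not in parent:
        raise KeyError(f"path not present: {path}")
    del parent[keys[-1]]


# Hypothetical lineage aspect, with upstreams already keyed by urn.
aspect = {
    "upstreams": {
        "urn:li:dataset:a": {"type": "TRANSFORMED"},
        "urn:li:dataset:b": {"type": "TRANSFORMED"},
    }
}

remove_path(aspect, "/upstreams/urn:li:dataset:a")
print(sorted(aspect["upstreams"]))
```

Calling `remove_path` a second time with the same path raises, because the entry is no longer there.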
120
167
121
-
122
168
### Value
123
169
124
170
Value is the actual information that will be stored at a path. If the path references an object, then this will include the JSON key-value pairs for that object.