docs/how/restore-indices.md (+169 -10)
@@ -1,19 +1,22 @@
1
1
# Restoring Search and Graph Indices from Local Database
2
2
3
-
If search or graph services go down or you have made changes to them that require reindexing, you can restore them from
4
-
the aspects stored in the local database.
3
+
If search infrastructure (Elasticsearch/OpenSearch) or graph services (Elasticsearch/OpenSearch/Neo4j) become inconsistent,
4
+
you can restore them from the aspects stored in the local database.
5
5
6
-
When a new version of the aspect gets ingested, GMS initiates an MAE event for the aspect which is consumed to update
6
+
When a new version of the aspect gets ingested, GMS initiates an MCL event for the aspect which is consumed to update
7
7
the search and graph indices. As such, we can fetch the latest version of each aspect in the local database and produce
8
-
MAE events corresponding to the aspects to restore the search and graph indices.
8
+
MCL events corresponding to the aspects to restore the search and graph indices.
9
9
10
10
By default, restoring the indices from the local database will not remove any existing documents in
11
11
the search and graph indices that no longer exist in the local database, potentially leading to inconsistencies
12
12
between the search and graph indices and the local database.
13
13
14
14
## Configuration
15
15
16
-
The upgrade jobs take arguments as command line args to the job itself rather than environment variables for job specific configuration. The RestoreIndices job is specified through the `-u RestoreIndices` upgrade ID parameter and then additional parameters are specified like `-a batchSize=1000`.
16
+
The upgrade jobs take arguments as command line args to the job itself rather than environment variables for job specific
17
+
configuration. The RestoreIndices job is specified through the `-u RestoreIndices` upgrade ID parameter and then additional
18
+
parameters are specified like `-a batchSize=1000`.
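
As a minimal sketch of the argument convention described above, the helper below assembles the `-u RestoreIndices` upgrade ID followed by one `-a key=value` pair per setting. The function name `build_restore_args` is hypothetical and not part of DataHub. It only prints the argument list and does not launch the job.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of DataHub): prints the command-line
# arguments for the RestoreIndices upgrade job.
# Usage: build_restore_args key=value [key=value ...]
build_restore_args() {
  # The upgrade ID always comes first.
  local args=(-u RestoreIndices)
  local kv
  # Each job-specific setting is passed as its own "-a key=value" pair.
  for kv in "$@"; do
    args+=(-a "$kv")
  done
  printf '%s\n' "${args[*]}"
}

build_restore_args batchSize=1000
# prints: -u RestoreIndices -a batchSize=1000
```

The printed arguments can then be appended to whichever launcher your deployment uses for the upgrade job.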
19
+
17
20
The following configurations are available:
18
21
19
22
### Time-Based Filtering
@@ -43,7 +46,9 @@ The following configurations are available:
43
46
44
47
These are available in the helm charts as configurations for Kubernetes deployments under the `datahubUpgrade.restoreIndices.args` path which will set them up as args to the pod command.
45
48
46
-
## Quickstart
49
+
## Execution Methods
50
+
51
+
### Quickstart
47
52
48
53
If you're using the quickstart images, you can use the `datahub` cli to restore the indices.
49
54
@@ -57,7 +62,7 @@ Using the `datahub` CLI to restore the indices when using the quickstart images
57
62
58
63
See [this section](../quickstart.md#restore-datahub) for more information.
59
64
60
-
## Docker-compose
65
+
### Docker-compose
61
66
62
67
If you are on a custom docker-compose deployment, run the following command (you need to checkout [the source repository](https://github.com/datahub-project/datahub)) from the root of the repo to send MAE for each aspect in the local database.
63
68
@@ -78,7 +83,7 @@ If you need to clear the search and graph indices before restoring, add `-a clea
78
83
Refer to this [doc](../../docker/datahub-upgrade/README.md#environment-variables) on how to set environment variables
79
84
for your environment.
80
85
81
-
## Kubernetes
86
+
### Kubernetes
82
87
83
88
Run `kubectl get cronjobs` to see if the restoration job template has been deployed. If you see results like below, you
84
89
are good to go.
@@ -120,6 +125,160 @@ datahubUpgrade:
120
125
- "clean"
121
126
```
122
127
123
-
## Through API
128
+
### Through APIs
129
+
130
+
See also the [Best Practices](#best-practices) section below. Note that the APIs can only handle a few thousand
131
+
aspects. In this mode, one of the GMS instances performs the required actions, but the request is subject to timeouts. Use one of the
132
+
approaches above for longer-running restoreIndices jobs.
133
+
134
+
#### OpenAPI
135
+
136
+
There are two primary APIs: one exposes the common parameters for restoreIndices, and the other is designed
137
+
to accept a list of URNs for which all aspects are to be restored.
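
Because the APIs are suited to only a few thousand aspects per request, a long URN list may need to be split into smaller batches before it is submitted. The sketch below assumes a plain text file with one URN per line. The function name `chunk_urns`, the file names, and the batch size of 1000 are all hypothetical choices for illustration, not DataHub defaults.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of DataHub): splits a file of URNs
# (one per line) into batch files of at most SIZE lines each.
# Usage: chunk_urns FILE SIZE PREFIX
# Produces PREFIXaa, PREFIXab, ... via the standard `split` utility.
chunk_urns() {
  local file="$1" size="$2" prefix="$3"
  split -l "$size" "$file" "$prefix"
}

# Example (assumed file name): batches of 1000 URNs each
# chunk_urns urns.txt 1000 batch_
```

Each resulting batch file can then be sent in its own request, which keeps individual calls short enough to reduce the risk of a timeout.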