
Commit 790dd33

michael2893, jpkrohling, tiffany76, opentelemetrybot, and cartermp authored
Add single writer principle note to deployment documentation (#5166)
Co-authored-by: Juraci Paixão Kröhling <[email protected]>
Co-authored-by: Tiffany Hrabusa <[email protected]>
Co-authored-by: opentelemetrybot <[email protected]>
Co-authored-by: Phillip Carter <[email protected]>
1 parent 333d350 commit 790dd33

File tree

1 file changed: +42 -0 lines changed

  • content/en/docs/collector/deployment/gateway


content/en/docs/collector/deployment/gateway/index.md

@@ -251,3 +251,45 @@ Cons:
  https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/tailsamplingprocessor
[spanmetrics-connector]:
  https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/connector/spanmetricsconnector

## Multiple collectors and the single-writer principle

All metric data streams within OTLP must have a
[single writer](/docs/specs/otel/metrics/data-model/#single-writer). When
deploying multiple collectors in a gateway configuration, it's important to
ensure that all metric data streams have a single writer and a globally unique
identity.

### Potential problems

Concurrent access from multiple applications that modify or report on the same
data can lead to data loss or degraded data quality. For example, you might see
inconsistent data from multiple sources on the same resource, where the
different sources can overwrite each other because the resource is not uniquely
identified.

There are patterns in the data that can indicate whether this is happening. For
example, upon visual inspection, unexplained gaps or jumps in the same series
may be a clue that multiple collectors are sending the same samples. You might
also see errors in your backend. For example, with a Prometheus backend:

`Error on ingesting out-of-order samples`

This error could indicate that identical targets exist in two jobs, and the
order of the timestamps is incorrect. For example:

- Metric `M1` received at `T1` with a timestamp of 13:56:04 and a value of `100`
- Metric `M1` received at `T2` with a timestamp of 13:56:24 and a value of `120`
- Metric `M1` received at `T3` with a timestamp of 13:56:04 and a value of `110`

Because `M1` at `T3` arrives after `T2` but carries an earlier timestamp, the
backend rejects it as an out-of-order sample.

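One way this situation can arise, sketched below under assumed names, is when
two gateway collector replicas run the same scrape configuration against the
same target and export to the same backend. The job name, target address, and
exporter endpoint here are placeholders, not part of the documented example.

```yaml
# Hypothetical sketch: if this identical configuration runs on two gateway
# collector replicas, both instances scrape the same target and write the same
# series to the same backend. The series then has two writers, and duplicate or
# out-of-order samples like the ones above can result.
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: app-metrics # placeholder job name
          scrape_interval: 30s
          static_configs:
            - targets: ['app.example.com:8888'] # same target on both replicas

exporters:
  prometheusremotewrite:
    endpoint: https://prometheus.example.com/api/v1/write # placeholder endpoint

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      exporters: [prometheusremotewrite]
```
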
### Best practices

- Use the
  [Kubernetes attributes processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/k8sattributesprocessor)
  to add labels to different Kubernetes resources.
- Use the
  [resource detector processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/resourcedetectionprocessor/README.md)
  to detect resource information from the host and collect resource metadata,
  as shown in the sketch below.
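
The following is a minimal sketch of how the two processors might be combined
in a gateway collector pipeline. It is illustrative rather than a documented
recommendation: the OTLP receiver, the `otlphttp` exporter endpoint, and the
detector list are assumptions to adapt to your environment.

```yaml
# Minimal sketch (assumed component set): enrich metrics with Kubernetes
# attributes and host resource metadata so that each data stream carries a
# globally unique identity.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  k8sattributes: # adds k8s.pod.name, k8s.namespace.name, etc. as resource attributes
  resourcedetection:
    detectors: [env, system] # example detectors; choose what fits your platform
  batch:

exporters:
  otlphttp:
    endpoint: https://backend.example.com # placeholder backend endpoint

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [k8sattributes, resourcedetection, batch]
      exporters: [otlphttp]
```

Note that the Kubernetes attributes processor needs RBAC permissions to read
pod metadata from the Kubernetes API; see its README for the required roles.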

0 commit comments
