Commit 84bb6ec

authored
Small grammatical changes to Sampling document (open-telemetry#4939)
1 parent 0238bdd commit 84bb6ec

1 file changed

+29
-29
lines changed
  • content/en/docs/concepts/sampling

content/en/docs/concepts/sampling/index.md

+29-29
@@ -1,34 +1,34 @@
 ---
 title: Sampling
 description:
-Learn about sampling, and the different sampling options available in
+Learn about sampling and the different sampling options available in
 OpenTelemetry.
 weight: 80
 ---

-With distributed tracing, you observe requests as they move from one service to
-another in a distributed system. Its superbly practical for a number of
+With distributed tracing, you can observe requests as they move from one service
+to another in a distributed system. It's superbly practical for a number of
 reasons, such as understanding your service connections and diagnosing latency
 issues, among many other benefits.

-However, if the majority of all your requests are successful 200s and finish
-without unacceptable latency or errors, do you really need all that data? Here’s
-the thing—you don’t always need a ton of data to find the right insights. _You
-just need the right sampling of data._
+However, if the majority of your requests are successful 200s and finish without
+unacceptable latency or errors, do you really need all that data? Here’s the
+thing—you don’t always need a ton of data to find the right insights. _You just
+need the right sampling of data._

 ![Illustration shows that not all data needs to be traced, and that a sample of data is sufficient.](traces-venn-diagram.svg)

 The idea behind sampling is to control the spans you send to your observability
 backend, resulting in lower ingest costs. Different organizations will have
-their own reasons for not just _why_ they want to sample, but also _what_ they
+their own reasons for not just _why_ they want to sample but also _what_ they
 want to sample. You might want to customize your sampling strategy to:

 - **Manage costs**: If you have a high volume of telemetry, you risk incurring
 heavy charges from a telemetry backend vendor or cloud provider to export and
 store every span.
-- **Focus on interesting traces**: For example, your frontend team may only want
-to see traces with specific user attributes.
-- **Filter out noise**: For example, you may want to filter out health checks.
+- **Focus on interesting traces**: For example, your frontend team might only
+want to see traces with specific user attributes.
+- **Filter out noise**: For example, you might want to filter out health checks.

 ## Terminology

@@ -41,8 +41,8 @@ or span is considered "sampled" or "not sampled":
 - **Not sampled**: A trace or span is not processed or exported. Because it is
 not chosen by the sampler, it is considered "not sampled".

-Sometimes, the definitions of these terms get mixed up. You may find someone
-state that they are "sampling out data" or that data not processed or exported
+Sometimes, the definitions of these terms get mixed up. You might find someone
+states that they are "sampling out data" or that data not processed or exported
 is considered "sampled". These are incorrect statements.

 ## Head Sampling
@@ -53,8 +53,8 @@ inspecting the trace as a whole.

 For example, the most common form of head sampling is
 [Consistent Probability Sampling](/docs/specs/otel/trace/tracestate-probability-sampling/#consistent-probability-sampling).
-It may also be referred to as Deterministic Sampling. In this case, a sampling
-decision is made based on the trace ID and a desired percentage of traces to
+This is also be referred to as Deterministic Sampling. In this case, a sampling
+decision is made based on the trace ID and the desired percentage of traces to
 sample. This ensures that whole traces are sampled - no missing spans - at a
 consistent rate, such as 5% of all traces.

@@ -65,12 +65,12 @@ The upsides to head sampling are:
 - Efficient
 - Can be done at any point in the trace collection pipeline

-The primary downside to head sampling is that it is not possible make a sampling
-decision based on data in the entire trace. This means that head sampling is
-effective as a blunt instrument, but is wholly insufficient for sampling
-strategies that must take whole-system information into account. For example, it
-is not possible to use head sampling to ensure that all traces with an error
-within them are sampled. For this, you need Tail Sampling.
+The primary downside to head sampling is that it is not possible to make a
+sampling decision based on data in the entire trace. This means that head
+sampling is effective as a blunt instrument, but is wholly insufficient for
+sampling strategies that must take whole-system information into account. For
+example, it is not possible to use head sampling to ensure that all traces with
+an error within them are sampled. For this, you need Tail Sampling.

 ## Tail Sampling

@@ -92,7 +92,7 @@ Some examples of how you can use Tail Sampling include:

 As you can see, tail sampling allows for a much higher degree of sophistication.
 For larger systems that must sample telemetry, it is almost always necessary to
-use Tail Sampling to balance data volume with usefulness of that data.
+use Tail Sampling to balance data volume with the usefulness of that data.

 There are three primary downsides to tail sampling today:

@@ -105,19 +105,19 @@ There are three primary downsides to tail sampling today:
 tail sampling must be stateful systems that can accept and store a large
 amount of data. Depending on traffic patterns, this can require dozens or even
 hundreds of nodes that all utilize resources differently. Furthermore, a tail
-sampler may need to "fall back" to less computationally-intensive sampling
+sampler might need to "fall back" to less computationally intensive sampling
 techniques if it is unable to keep up with the volume of data it is receiving.
-Because of these factors, it is critical to monitor tail sampling components
+Because of these factors, it is critical to monitor tail-sampling components
 to ensure that they have the resources they need to make the correct sampling
 decisions.
 - Tail samplers often end up being in the domain of vendor-specific technology
 today. If you're using a paid vendor for Observability, the most effective
-tail sampling options available to you may be limited to what the vendor
+tail sampling options available to you might be limited to what the vendor
 offers.

-Finally, for some systems, tail sampling may be used in conjunction with Head
+Finally, for some systems, tail sampling might be used in conjunction with Head
 Sampling. For example, a set of services that produce an extremely high volume
-of trace data may first use head sampling to only sample a small percentage of
+of trace data might first use head sampling to sample only a small percentage of
 traces, and then later in the telemetry pipeline use tail sampling to make more
 sophisticated sampling decisions before exporting to a backend. This is often
 done in the interest of protecting the telemetry pipeline from being overloaded.
@@ -133,7 +133,7 @@ The OpenTelemetry Collector includes the following sampling processors:

 ### Language SDKs

-For the individual language specific implementations of the OpenTelemetry API &
-SDK you will find support for sampling at the respective documentation pages:
+For the individual language-specific implementations of the OpenTelemetry API &
+SDK, you will find support for sampling in the respective documentation pages:

 {{% sampling-support-list " " %}}
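
The head sampling passage in this diff describes a decision made from the trace ID and a desired percentage of traces, and the tail sampling passage notes that only a whole-trace view can guarantee that traces containing an error are kept. A minimal Python sketch of both ideas follows. The function names `head_sample` and `tail_sample` and the span dictionary shape are hypothetical, and the threshold comparison is a simplification for illustration, not the algorithm defined in the OpenTelemetry specification or any SDK.

```python
# Illustrative only: these helpers are hypothetical, not OpenTelemetry APIs.

ID_SPACE = 1 << 64  # assumption: decide using the low 64 bits of the trace ID


def head_sample(trace_id: int, ratio: float) -> bool:
    """Deterministic head sampling: the same trace ID always gets the same answer.

    Every service that sees this trace ID reaches the same decision, so a
    trace is either kept whole or dropped whole.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0.0 and 1.0")
    threshold = int(ratio * ID_SPACE)
    return (trace_id & (ID_SPACE - 1)) < threshold


def tail_sample(trace_id: int, spans: list[dict], ratio: float) -> bool:
    """Tail sampling: decide after all spans of the trace are available.

    Keeps every trace that contains an error span, and falls back to the
    ratio-based decision for everything else.
    """
    if any(span.get("status") == "ERROR" for span in spans):
        return True
    return head_sample(trace_id, ratio)
```

With `ratio=0.05`, `head_sample` keeps roughly 5% of uniformly distributed trace IDs, while `tail_sample` additionally keeps any trace in which at least one span dictionary has `"status": "ERROR"`. The second function needs the complete list of spans, which is why a tail sampler has to buffer data and hold state.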
