| 1 | +--- |
| 2 | +title: The State of Profiling |
| 3 | +linkTitle: Profiling state |
| 4 | +date: 2024-10-25 |
| 5 | +cSpell:ignore: Baeyens Florian Geisendörfer Kalkanis Lehner Mathieu Rühsen |
| 6 | +author: >- |
| 7 | + [Damien Mathieu](https://github.com/dmathieu) (Elastic), [Pablo |
| 8 | + Baeyens](https://github.com/mx-psi) (Datadog), [Felix |
| 9 | + Geisendörfer](https://github.com/felixge) (Datadog), [Christos |
| 10 | + Kalkanis](https://github.com/christos68k) (Elastic), [Morgan |
| 11 | + McLean](https://github.com/mtwo) (Splunk), [Florian |
| 12 | + Lehner](https://github.com/florianl) (Elastic), [Tim |
| 13 | + Rühsen](https://github.com/rockdaboot) (Elastic) |
| 14 | +issue: https://github.com/open-telemetry/opentelemetry.io/issues/5477 |
| 15 | +sig: Profiling SIG |
| 16 | +--- |
| 17 | + |
| 18 | +A little over six months ago, OpenTelemetry announced |
| 19 | +[support for the profiling signal](/blog/2024/profiling/). While the signal is |
| 20 | +still in development and isn’t yet recommended for production use, the Profiling |
| 21 | +SIG has made substantial progress on many fronts. |
| 22 | + |
| 23 | +This post provides a summary of the progress the Profiling SIG has made over the |
| 24 | +past six months. |
| 25 | + |
| 26 | +## OTLP improvements |
| 27 | + |
| 28 | +Profiles were added as a new signal type to OTLP in |
| 29 | +[v1.3.0](https://github.com/open-telemetry/opentelemetry-proto/releases/tag/v1.3.0), |
| 30 | +though this area is still marked as unstable as we continue to make changes to |
| 31 | +it. |
| 32 | + |
| 33 | +While our original intent was to keep wire compatibility with |
| 34 | +[pprof](https://github.com/google/pprof), that goal proved impractical, so the |
| 35 | +Profiling SIG |
| 36 | +[has decided](https://github.com/open-telemetry/opentelemetry-proto/issues/567#issuecomment-2286565449) |
| 37 | +to refactor the protocol and not aim for strict compatibility with pprof. |
| 38 | +Instead, we will aim for convertibility, similar to what we already do for |
| 39 | +other signals. This shift is still a work in progress, and is causing several |
| 40 | +breaking changes to the profiling section of the protocol. Note that this has no |
| 41 | +impact on the stable sections that make up the majority of the OTLP protocol, |
| 42 | +like metrics, spans, logs, resources, etc. |
| 43 | + |
| 44 | +## eBPF agent improvements |
| 45 | + |
| 46 | +Back in June, the |
| 47 | +[donation of the Elastic Continuous Profiling Agent](/blog/2024/elastic-contributes-continuous-profiling-agent/) |
| 48 | +was finalized. Since then, the |
| 49 | +[opentelemetry-ebpf-profiler](https://github.com/open-telemetry/opentelemetry-ebpf-profiler) |
| 50 | +repository has been buzzing with improvements. |
| 51 | + |
| 52 | +Our next goal for the eBPF agent is for it to run as a Collector receiver. Once |
| 53 | +this is complete, the Collector can be run on every node as an agent, which |
| 54 | +collects profiles for that host and forwards them using OTLP. This architecture |
| 55 | +will allow us to extract some specific parts of the agent that aren’t strictly |
| 56 | +profiling, such as retrieving host metadata and system metrics, and move them to |
| 57 | +processors, making the agent lighter and more modular. |
| 58 | + |
| 59 | +## Collector support |
| 60 | + |
| 61 | +Since |
| 62 | +[v0.112.0](https://github.com/open-telemetry/opentelemetry-collector/releases/tag/v0.112.0), |
| 63 | +the OpenTelemetry Collector can receive, process and export profiling data, |
| 64 | +including ingesting and exporting profiles over OTLP. |
| 65 | + |
| 66 | +You can try it out by enabling the `service.profilesSupport` |
| 67 | +[feature gate](https://github.com/open-telemetry/opentelemetry-collector/blob/main/featuregate/README.md#controlling-gates) |
| 68 | +in your Collector and using a configuration similar to the one below, which |
| 69 | +ingests and exports data using OTLP: |
| 70 | + |
| 71 | +```yaml |
| 72 | +receivers: |
| 73 | +  otlp: |
| 74 | +    protocols: |
| 75 | +      grpc: |
| 76 | +exporters: |
| 77 | +  otlp: |
| 78 | +    endpoint: 'localhost:4317' |
| 79 | +service: |
| 80 | +  pipelines: |
| 81 | +    profiles: |
| 82 | +      receivers: [otlp] |
| 83 | +      exporters: [otlp] |
| 84 | +``` |
| 85 | + |
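As a minimal sketch of the feature-gate step, the Collector's `--feature-gates` command-line flag can turn the gate on at startup. The binary name `otelcol` and the file name `config.yaml` are assumptions for illustration; substitute whatever your Collector distribution and setup use.

```shell
# Assumption: the Collector binary is named `otelcol` and the configuration
# shown above is saved as `config.yaml` in the current directory.
otelcol --config=config.yaml --feature-gates=service.profilesSupport
```

The gate has to be passed explicitly because profiles support is still experimental at this stage.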
| 86 | +While this feature can be used now on the Collector, we do not yet recommend |
| 87 | +doing so in production: it is still under heavy development and is expected to |
| 88 | +have breaking changes, such as the ones mentioned above with OTLP. |
| 89 | + |
| 90 | +However, this support in the Collector means that any receiver, processor or |
| 91 | +exporter of the Collector can now start adding profiles support. We highly |
| 92 | +encourage doing so, both to allow a smoother integration in the future and to |
| 93 | +find potential issues early. If you wish to report a bug or contribute to this |
| 94 | +effort, you can |
| 95 | +[view open issues on the contrib repository](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22+label%3Adata%3Aprofiles). |
| 96 | + |
| 97 | +## Semantic Conventions and Specification |
| 98 | + |
| 99 | +To improve interoperability, the Profiling SIG also worked on |
| 100 | +[OpenTelemetry Semantic Conventions for profiling](/docs/specs/semconv/attributes-registry/profile/). |
| 101 | +There is also ongoing work to introduce a |
| 102 | +[profiling OpenTelemetry specification](https://github.com/open-telemetry/opentelemetry-specification/pull/4197). |
| 103 | +This work will continue and should enable wide adoption across different |
| 104 | +platforms, tools and other OTel signals. |
| 105 | + |
| 106 | +## What’s next? |
| 107 | + |
| 108 | +Support for profiles in OpenTelemetry is moving very quickly, and while we’re |
| 109 | +still far from being able to provide a stable signal, we are happy to report |
| 110 | +that folks can start hacking with it, and integrate it within their modules. |
| 111 | + |
| 112 | +If you’re interested in helping profiling move forward, or face issues when |
| 113 | +integrating with it, the Profiling SIG is always happy to get or provide help. |
| 114 | + |
| 115 | +You can find us on |
| 116 | +[#otel-profiles](https://cloud-native.slack.com/archives/C03J794L0BV) in the |
| 117 | +CNCF Slack. |