Commit 0a9040b

horovits, svrnm, tiffany76, opentelemetrybot, and chalin authored
Blog cncf.io repost by OTel CI/CD SIG (#5718)
Co-authored-by: Severin Neumann <[email protected]> Co-authored-by: Tiffany Hrabusa <[email protected]> Co-authored-by: opentelemetrybot <[email protected]> Co-authored-by: Severin Neumann <[email protected]> Co-authored-by: Patrice Chalin <[email protected]> Co-authored-by: Patrice Chalin <[email protected]>
1 parent f19976e commit 0a9040b

3 files changed

+386
-0
lines changed

+290
@@ -0,0 +1,290 @@
1+
---
2+
title: OpenTelemetry Is Expanding Into CI/CD Observability
3+
linkTitle: OpenTelemetry Is Expanding Into CI/CD Observability
4+
date: 2025-02-24
5+
author: >-
6+
[Dotan Horovits](https://github.com/horovits/) (CNCF Ambassador), [Adriel
7+
Perkins](https://github.com/adrielp) (Liatrio)
8+
canonical_url: https://www.cncf.io/blog/2024/11/04/opentelemetry-is-expanding-into-ci-cd-observability/
9+
issue: 5546
10+
sig: CI/CD Observability
11+
# prettier-ignore
12+
cSpell:ignore: andrzej bäck bäckmark chacin cicd frittoli grassi helmuth horovits jemmic joao kamphaus keptn kowalski liatrio liudmila molkova robb ruech safyan sarahan shkuro skyscanner slsa stencel suereth tekton voss
13+
---
14+
15+
We’ve been talking about the need for a common “language” for reporting and
16+
observing CI/CD pipelines for years, and finally, we see the first “words” of
17+
this language entering the “dictionary” of observability—the
18+
[OpenTelemetry open specification](/docs/specs/otel/). With the recent release
19+
of OpenTelemetry’s [Semantic Conventions](/docs/specs/semconv/), v1.27.0, you
20+
can find
21+
[designated attributes for reporting CI/CD pipelines](/docs/specs/semconv/attributes-registry/cicd/).
22+
23+
This is the result of the hard work of the
24+
[CI/CD Observability Special Interest Group (SIG) within OpenTelemetry](https://github.com/open-telemetry/community/blob/main/projects/ci-cd.md).
25+
As we accomplish this core milestone for the first phase, we thought it’d be a
26+
good time to share it with the world.
27+
28+
## Engineers need observability into their CI/CD pipelines
29+
30+
[CI/CD observability](https://medium.com/@horovits/fcc6c10c4987) is essential
31+
for ensuring that software is released to production efficiently and reliably.
32+
Well-functioning CI/CD pipelines directly impact business outcomes by shortening the
33+
[Lead Time for Changes DORA metric](https://horovits.medium.com/improving-devops-performance-with-dora-metrics-918b9604f8e2)
34+
and enabling fast identification and resolution of broken or flaky processes. By
35+
integrating observability into CI/CD workflows, teams can monitor the health and
36+
performance of their pipelines in real time, gaining insights into bottlenecks
37+
and areas that require improvement.
38+
39+
Leveraging the same well-established tools used for monitoring production
40+
environments, organizations can extend their observability capabilities to
41+
include the release cycle, fostering a holistic approach to software delivery.
42+
Whether you use open source or proprietary tools, there’s no need to reinvent the wheel
43+
when choosing the observability toolchain for CI/CD pipelines.
44+
45+
## The need for standardization
46+
47+
However, the diverse landscape of CI/CD tools creates challenges in achieving
48+
consistent end-to-end observability. With each tool having its own means,
49+
format, and semantic conventions for reporting the pipeline execution status,
50+
fragmentation within the toolchain can hinder seamless monitoring. Migrating
51+
between tools becomes painful, as it requires reimplementing existing
52+
dashboards, reports, and alerts.
53+
54+
Things become even more challenging when you need to monitor multiple tools
55+
involved in the release pipeline in a uniform manner. This is where
56+
[open standards and specifications become critical](https://horovits.medium.com/the-rise-of-open-standards-in-observability-highlights-from-kubecon-13694e732c97).
57+
They create a common uniform language, one which is tool- and vendor-agnostic,
58+
enabling cohesive observability across different tools and allowing teams to
59+
maintain a clear and comprehensive view of their CI/CD pipeline performance.
60+
61+
The need for standardization is relevant for creating the semantic conventions
62+
mentioned above, the language for reporting what goes on in the pipeline.
63+
Standardization is also needed for the means by which this reporting is
64+
propagated through the system, such as upon spawning processes during the
65+
pipeline execution. This led us to promote standardization for using environment
66+
variables for context and baggage propagation between processes, another
67+
important milestone that was recently approved and merged.
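
As a rough illustration of what environment-variable propagation can look like, the following Python sketch passes a W3C Trace Context value from a parent pipeline step to a child process. The variable names `TRACEPARENT` and `BAGGAGE`, the helper functions, and the `pipeline.run` baggage key are assumptions made for this example, not names confirmed by this post. A real pipeline would rely on an OpenTelemetry SDK propagator rather than hand-rolled parsing.

```python
import os
import re
import secrets
import subprocess
import sys
from urllib.parse import quote

# W3C Trace Context layout: version-traceid-parentid-flags (lowercase hex).
TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def new_traceparent(sampled=True):
    """Create a traceparent value for a new root span (illustrative helper)."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(value):
    """Return the parsed context, or None if the value is malformed."""
    match = TRACEPARENT_RE.match(value or "")
    if match is None:
        return None
    _version, trace_id, span_id, flags = match.groups()
    if trace_id == "0" * 32 or span_id == "0" * 16:
        return None  # all-zero identifiers are invalid
    return {
        "trace_id": trace_id,
        "parent_span_id": span_id,
        "sampled": int(flags, 16) & 1 == 1,
    }


def child_env(traceparent, baggage=None):
    """Copy the current environment and add the propagation variables."""
    env = dict(os.environ)
    env["TRACEPARENT"] = traceparent  # assumed variable name
    if baggage:
        # Baggage values are percent-encoded, entries are comma-separated.
        env["BAGGAGE"] = ",".join(
            f"{key}={quote(str(val), safe='')}" for key, val in baggage.items()
        )
    return env


if __name__ == "__main__":
    parent = new_traceparent()
    # The child stands in for any process spawned by a pipeline step.
    child_code = "import os; print(os.environ['TRACEPARENT'])"
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        env=child_env(parent, {"pipeline.run": "42"}),
        capture_output=True,
        text=True,
        check=True,
    )
    inherited = parse_traceparent(result.stdout.strip())
    same_trace = inherited["trace_id"] == parse_traceparent(parent)["trace_id"]
    print("same trace:", same_trace)
```

Because the child only reads its environment, this approach works even when the spawned process is a shell script or a tool written in another language, which is the situation CI/CD runners are usually in.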
68+
69+
## OpenTelemetry: the natural home for CI/CD observability specification
70+
71+
This realization drove us to look for the right way to approach creating a
72+
specification. OpenTelemetry has emerged as the standard for telemetry generation
73+
and collection. The OpenTelemetry specification is tasked with exactly this
74+
problem: creating a common, uniform, and vendor-agnostic specification for
75+
telemetry. And its support from the Cloud Native Computing Foundation (CNCF)
76+
ensures it remains open and vendor-neutral. As long-standing advocates of
77+
OpenTelemetry, it only made sense to extend OpenTelemetry to cover this
78+
important DevOps use case.
79+
80+
We started with an
81+
[OpenTelemetry extension proposal (OTEP #223)](https://github.com/open-telemetry/oteps/pull/223)
82+
a couple of years ago, proposing our idea to extend OpenTelemetry to cover the
83+
CI/CD observability use case. In parallel, we started a Slack channel on the
84+
CNCF Slack to gather fellow enthusiasts behind the idea and start brainstorming
85+
what that should look like. The Slack channel grew and we quickly discovered
86+
that the problem is common across many organizations.
87+
88+
With the feedback from the Technical Oversight Committee and others within the
89+
CNCF, we asked for a mandate to start a dedicated Working
90+
Group for the topic under OpenTelemetry’s Semantic Conventions SIG (SIG SemConv
91+
for short). With their blessing, we
92+
[launched the formal CI/CD Observability SIG](https://github.com/open-telemetry/community/blob/main/projects/ci-cd.md)
93+
to formalize our previous Slack group discussions and goals.
94+
95+
## OpenTelemetry’s CI/CD Observability SIG
96+
97+
Since November of 2023, the SIG has been actively working to develop the
98+
standard for semantics around CI/CD observability in collaboration with experts
99+
from multiple companies and open source projects. At its inception, we decided
100+
to focus on a few key areas for 2024:
101+
102+
- An initial set of common attributes across CI/CD systems.
103+
- Develop prototype(s) to include both holistic and signal-specific attributes.
104+
- Carry forward the proposal to add environment variables as context propagators
105+
to the OpenTelemetry specification (OTEP #258).
106+
- A strategy for bridging OpenTelemetry conventions with
107+
[CDEvents](https://cdevents.dev/docs/) and
108+
[Eiffel](https://eiffel-community.github.io/).
109+
110+
At first, our SIG met during the larger Semantic Conventions Working Group
111+
meetings every Monday. This provided a good opportunity for us to get our
112+
bearings as we researched and discussed how we would accomplish the goals on our
113+
roadmap. This also enabled us to get to know many members of the larger
114+
OpenTelemetry community, solicit feedback on our designs, and get direction on
115+
how to proceed. The OpenTelemetry Semantic Convention Working Group has been
116+
extraordinarily supportive of the CI/CD initiative.
117+
118+
Upon completion and release of its initial milestone (see below), our SIG was
119+
granted its own
120+
[dedicated meeting slot](https://github.com/open-telemetry/community/pull/2293)
121+
on the
122+
[OpenTelemetry calendar](https://github.com/open-telemetry/community#calendar),
123+
every Thursday at 0600 PT. The group gets together here to discuss current and
124+
future work prior to bringing it to the larger Semantic Conventions meetings on
125+
Monday. We greatly look forward to the continued support and participation of
126+
the community as we continue to drive forward this critical area of
127+
standardization.
128+
129+
## CI/CD is part of the latest OpenTelemetry Semantic Conventions
130+
131+
Over the course of months of iteration and feedback, the
132+
[first set of Semantic Conventions was merged](https://github.com/open-telemetry/semantic-conventions/pull/1075)
133+
in for the v1.27.0 release. This change brought forth the first set of
134+
foundational semantics for CI/CD under the `cicd`, `artifact`, `vcs`, `test`,
135+
and `deployment` namespaces. This was a significant milestone for the CI/CD
136+
Observability SIG and the industry as a whole. This creates the foundation on which
137+
all of our group’s other goals can begin to take form and reach implementation.
138+
139+
But what does that actually mean? What value does it provide? Let’s consider
140+
real-world examples for two of the namespaces.
141+
142+
### Tracking release revisions from Version Control Systems (VCS)
143+
144+
[Version Control System (VCS) attributes](/docs/specs/semconv/attributes-registry/vcs/)
145+
cover multiple areas common in a VCS like refs and changes (pull/merge
146+
requests). The `vcs.repository.ref.revision` attribute is a key piece of
147+
metadata. As Version Control Systems like GitHub and GitLab emit events, they
148+
can now have this semantically compliant attribute. That means when integrating
149+
code, releasing it, and deploying it to environments, systems can include this
150+
attribute and trace the code revision across system boundaries more easily. In the event a
151+
deployment fails, you can quickly look at the revision of code and track it back
152+
to the buggy release. This attribute is actually a key piece of metadata for
153+
[DORA metrics](https://dora.dev/guides/dora-metrics-four-keys/) too, as you
154+
calculate Change lead time and Failed deployment recovery time.
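
To make the correlation concrete, here is a minimal Python sketch that joins commit and deployment events on `vcs.repository.ref.revision` and derives a change lead time per revision. The event shape (plain dictionaries with `timestamp` and `attributes`), the `outcome` key, and the sample revision value are hypothetical and exist only for this example; they are not part of the semantic conventions.

```python
from datetime import datetime, timezone

REVISION = "vcs.repository.ref.revision"  # attribute named in the text above


def change_lead_times(commit_events, deployment_events):
    """Map each revision to the time from commit to first successful deployment."""
    committed_at = {
        event["attributes"][REVISION]: event["timestamp"]
        for event in commit_events
    }
    lead_times = {}
    for event in sorted(deployment_events, key=lambda e: e["timestamp"]):
        revision = event["attributes"].get(REVISION)
        # "outcome" is a hypothetical key used only in this sketch.
        succeeded = event["attributes"].get("outcome") == "success"
        if succeeded and revision in committed_at and revision not in lead_times:
            lead_times[revision] = event["timestamp"] - committed_at[revision]
    return lead_times


commits = [
    {
        "timestamp": datetime(2024, 11, 4, 9, 0, tzinfo=timezone.utc),
        "attributes": {REVISION: "9d59409a"},
    },
]
deployments = [
    {
        "timestamp": datetime(2024, 11, 4, 10, 0, tzinfo=timezone.utc),
        "attributes": {REVISION: "9d59409a", "outcome": "failure"},
    },
    {
        "timestamp": datetime(2024, 11, 4, 11, 30, tzinfo=timezone.utc),
        "attributes": {REVISION: "9d59409a", "outcome": "success"},
    },
]

for revision, lead_time in change_lead_times(commits, deployments).items():
    print(revision, lead_time)
```

The join only works because every system in the chain reports the revision under the same attribute name, which is the point of standardizing it.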
155+
156+
### Artifacts for supply chain security, aligned with the SLSA specification
157+
158+
The
159+
[artifact attribute namespace](/docs/specs/semconv/attributes-registry/artifact/)
160+
had multiple attributes for its first implementation. One key set of attributes
161+
within this namespace covers [attestations](https://slsa.dev/attestation-model)
162+
that closely align with the [SLSA](https://slsa.dev/spec/v1.0/about) model. This
163+
is really the first time a direct connection is being made between observability
164+
and software supply chain security. Consider the following
165+
[supply chain threat model](https://slsa.dev/spec/v1.0/threats) defined by SLSA:
166+
{{< figure class="figure" src="SLSA-supply-chain-model.png" attr="SLSA Community Specification License 1.0" attrlink=`https://github.com/slsa-framework/slsa?tab=License-1-ov-file` >}}
167+
168+
These new attributes for artifacts and attestations help observe the sequence of
169+
events modeled in the above diagram in real time. Really, the conventions that
170+
exist today and those that will be added in the future enable interoperability
171+
between core software delivery capabilities like security and platform
172+
engineering using observability semantics.
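
As a small sketch of how such attributes might be used, the Python snippet below records a digest for an artifact and its attestation at build time, then checks at deploy time that the same artifact is being rolled out. The attribute keys (`artifact.filename`, `artifact.hash`, `artifact.attestation.filename`, `artifact.attestation.hash`) are assumptions written from memory and should be verified against the artifact attribute registry; the helper functions and file names are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def artifact_attributes(filename, content, attestation_filename, attestation):
    """Build an attribute set describing an artifact and its attestation.

    The keys are assumed names; check them against the registry before use.
    """
    return {
        "artifact.filename": filename,
        "artifact.hash": sha256_hex(content),
        "artifact.attestation.filename": attestation_filename,
        "artifact.attestation.hash": sha256_hex(attestation),
    }


def is_same_artifact(build_attrs, deploy_attrs):
    """Compare the digests reported by the build and deploy stages."""
    return build_attrs["artifact.hash"] == deploy_attrs["artifact.hash"]


built = artifact_attributes(
    "app.tar.gz", b"release bytes", "app.intoto.jsonl", b"{}"
)
deployed = artifact_attributes(
    "app.tar.gz", b"release bytes", "app.intoto.jsonl", b"{}"
)
tampered = artifact_attributes(
    "app.tar.gz", b"modified bytes", "app.intoto.jsonl", b"{}"
)

print(is_same_artifact(built, deployed))
print(is_same_artifact(built, tampered))
```

Emitting these values as telemetry from each stage lets an observability backend flag a mismatch between what was built and what was deployed without a separate reporting channel.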
173+
174+
## What’s next for the CI/CD Observability Working Group
175+
176+
As already mentioned, the first major milestone we reached was the merge of the
177+
OTEP for extending the semantic conventions with the new attributes, which is
178+
now part of the OpenTelemetry Semantic Conventions latest release.
179+
180+
The second important milestone is
181+
[OTEP #258](https://github.com/open-telemetry/oteps/pull/258) for Environment
182+
Variable Context Propagation, which was just approved and merged. This OTEP sets
183+
the foundation for writing the specification.
184+
185+
Since we’ve made progress on our initial milestones, we’ve updated the
186+
[CI/CD Observability SIG milestones for the remainder of 2024](https://github.com/open-telemetry/community/blob/main/projects/ci-cd.md).
187+
Our goal is to finish out as many of the defined milestones as possible by the
188+
end of the year. Notably, we’re focused on:
189+
190+
- Adding
191+
[metric conventions for version control systems](https://github.com/open-telemetry/semantic-conventions/pull/1383).
192+
- Building tracing prototypes in CI/CD systems (for example, ArgoCD, GitHub,
193+
GitLab, Jenkins).
194+
- Getting [OTEP #258](https://github.com/open-telemetry/oteps/pull/258) ready
195+
for implementation for the addition to the specification.
196+
- Adding attributes to the registry covering more domains, such as:
197+
- [Software outage incidents](https://github.com/open-telemetry/semantic-conventions/issues/1185)
198+
- [System attributes around CI/CD runners](https://github.com/open-telemetry/semantic-conventions/issues/1184)
199+
- Beginning work on trace and event (log) signal specifics to build the bridge
200+
for interoperability between other specifications.
201+
- Adopting the changes from the
202+
[Entity and Resource OTEP](https://github.com/open-telemetry/oteps/pull/264).
203+
- [Enabling vendor-specific extension(s)](https://github.com/open-telemetry/semantic-conventions/issues/1193).
204+
- Open source community outreach strategy for semantic adoption.
205+
206+
All that has been mentioned thus far is just the beginning! We have lots of work
207+
defined on our
208+
[CICD Project Board](https://github.com/orgs/open-telemetry/projects/79), and we
209+
have work in progress! We’ll continue to iterate on the above milestones that
210+
we’ve set out for the remainder of 2024. Here are a couple of things to look out for:
211+
212+
- Version Control System metrics—leading indicators for DORA
213+
- Traces from GitHub Actions and Audit Logs
214+
- Special thanks to the following people who are making this component
215+
possible:
216+
- Tyler Helmuth – Honeycomb
217+
- Andrzej Stencel – Elastic
218+
- Curtis Robert – Splunk
219+
- Justin Voss
220+
- Kristof Kowalski – ANZ Bank
221+
- Mike Sarahan – Nvidia
222+
- A corresponding version of the GitHub Receiver Component but implemented in
223+
GitLab
224+
225+
And much more!
226+
227+
## It takes a village to extend OpenTelemetry
228+
229+
Whoa, that’s a lot to do! Most certainly this SIG will continue beyond 2024 and
230+
through 2025. Standards are hard, but essential. And, we have some amazing folks
231+
that are part of the SIG and contributing to these standards! Who, you may ask?
232+
233+
First, we’d like to acknowledge key members of OpenTelemetry leadership
234+
committees who have heavily enabled the work we’ve done thus far, and will
235+
continue to do so.
236+
237+
From the OpenTelemetry Technical Committee we have two core sponsors, Carlos
238+
Alberto from Lightstep and Josh Suereth from Google. Both Carlos and Josh have
239+
been so supportive of the CICD work, really guiding us through the process and
240+
details we need to be successful.
241+
242+
From the OpenTelemetry Governance Committee we’ve had Trask Stalnaker from
243+
Microsoft act as an exceptional ally, and Daniel Blanco from Skyscanner who now
244+
acts as our current Liaison. Both Trask and Daniel have been instrumental in
245+
supporting the SIG and enabling us to have our own meeting in the OpenTelemetry
246+
community.
247+
248+
In addition to those folks, we’ve had significant feedback, support, and
249+
contributions from the following key folks:
250+
251+
- Yuri Shkuro – Creator of Jaeger, Co-Founder of OpenTelemetry
252+
- Andrea Frittoli – Tekton CD Maintainer, CDEvents Co-creator, IBM
253+
- Emil Bäckmark – CDEvents and Eiffel Maintainer, Ericsson
254+
- Magnus Bäck – Eiffel, Axis Communications
255+
- Liudmila Molkova – Microsoft
256+
- Christopher Kamphaus – Jemmic, Jenkins
257+
- Giordano Ricci – Grafana Labs
258+
- Giovanni Liva – Dynatrace, Keptn
259+
- Ivan Calvo – Elastic, Jenkins
260+
- Armin Ruech – Dynatrace
261+
- Michael Safyan – Google
262+
- Robb Kidd – Honeycomb
263+
- Pablo Chacin – Grafana Labs
264+
- Alexandra Konrad – Elastic
265+
- Alexander Wert – Elastic
266+
- Joao Grassi – Dynatrace
267+
- DJ Gregor – Discover
268+
269+
That was a lot of names to name! We greatly appreciate everyone who has
270+
supported this initiative and helped bring it to fruition! It takes significant
271+
thinking ability and time to build industry-wide standards. Hard problems are
272+
hard, but these folks have risen to the challenge to make the world of
273+
observability and CICD systems a better, more interoperable place!
274+
275+
## Join the Working Group discourse and make an impact
276+
277+
Want to learn more? Want to get involved in shaping CI/CD Observability?
278+
279+
We invite developers and practitioners to participate in the discussions,
280+
contribute ideas, and help shape the future of CI/CD observability and the
281+
OpenTelemetry semantic conventions. Discussion takes place in the
282+
[CNCF Slack](https://slack.cncf.io/) workspace under the `#cicd-o11y` channel,
283+
and you can chime in on any of the GitHub issues mentioned throughout this
284+
article and join the CICD SIG
285+
[weekly calls](https://calendar.google.com/calendar?cid=Z29vZ2xlLmNvbV9iNzllM2U5MGo3YmJzYTJuMnA1YW41bGY2MEBncm91cC5jYWxlbmRhci5nb29nbGUuY29t)
286+
every Thursday at 0600 PT.
287+
288+
_A version of this article also [appears on the CNCF blog][]._
289+
290+
[appears on the CNCF blog]: <{{% param canonical_url %}}>
