Commit e050670

Remove telegraf and use only fluent-bit for telemetry (#1030)
[comment]: # (Note that your PR title should follow the conventional commit format: https://conventionalcommits.org/en/v1.0.0/#summary)

# PR Description
- Upgrade `fluent-bit`
  - Linux: `2.1.10` -> `3.2.2` (latest)
    - `>=3.2` is necessary for using the `metrics_selector` and `labels` processors to filter Prometheus metrics
  - Windows: `2.1.10` -> `3.0.7` (latest)
    - `3.0` is the latest Windows version
- Remove `telegraf` for Linux and Windows
- Changes to `fluent-bit`
  - Built-in Plugins:
    - Use the `prometheus_scrape` input plugin to scrape the Prometheus metrics previously collected by telegraf
    - Use the `metrics_selector` and `labels` processors to filter certain metrics and drop unnecessary labels before sending to App Insights
  - Conf:
    - Use the new YAML config format, which is required for the `metrics_selector` and `labels` processors
  - Custom Output Plugin:
    1. Collect CPU and memory usage for otelcollector and metricsextension, which were previously collected by telegraf
       - This runs as a goroutine, separate from the fluent-bit pipeline
       - Use the same underlying golang package as telegraf: `github.com/shirou/gopsutil/v4/process`
       - Collect at the same frequency as telegraf and aggregate to p50 and p95
       - Send the extra env vars as customDimensions, as telegraf did
    2. Decode the Prometheus metrics msgpack from fluent-bit and send it to App Insights in the format we want
       - Add one line to the fluent-bit `proxy_plugin` file so that the Prometheus metrics are allowed to flow to our golang output plugin:
         - `out->event_type = FLB_OUTPUT_LOGS | FLB_OUTPUT_METRICS;`
       - `fluent-bit` has the `proxy_plugin` files to allow golang output plugins to be built on top of the C code. However, they do not specify which types the output plugin accepts (out of `logs`, `metrics`, and `traces`), so routing defaults to allowing only the `logs` type to reach the output plugin.
  - Build Pipeline:
    - Build `fluent-bit` with the line added above in exactly the same way Mariner builds the package.
    - Only build `fluent-bit` with the plugins that we actually use so that our CVE surface area is very low.
- Main image bug fixes:
  - Use the daemonset config file for fluent-bit in the daemonset. Previously, the replicaset config file was used for both the replicaset and the daemonset, so the daemonset logs weren't being collected
  - Fix telemetry sent for `network-observability` and `acstor` that was missed

[comment]: # (The below checklist is for PRs adding new features. If a box is not checked, add a reason why it's not needed.)

# New Feature Checklist
- [x] Link to the one-pager about the feature: https://msazure.visualstudio.com/InfrastructureInsights/_wiki/wikis/InfrastructureInsights.wiki/741581/TelegrafRemoval
- [x] Attach results of scale and perf testing:
  <img width="1753" alt="image" src="https://github.com/user-attachments/assets/de8a4d9a-6f0f-40ba-b55d-491dc68e503b" />

# Telemetry Values Comparison
- ReplicaSet
  <img width="1712" alt="image" src="https://github.com/user-attachments/assets/4ab3d9b3-51ab-4775-a5a6-872d3d0291c9" />
- DaemonSet
  <img width="1706" alt="image" src="https://github.com/user-attachments/assets/cd2feee3-1bfd-411c-9050-99a3681f00c8" />
- All extra [env var telemetry is transferred over](https://dataexplorer.azure.com/dashboards/94da59c1-df12-4134-96bb-82c6b32e6199?p-_cluster=v-%2Fsubscriptions%2Fb9842c7c-1a38-4385-8f39-a51314758bcf%2FresourceGroups%2Fgrace-win%2Fproviders%2FMicrosoft.ContainerService%2FmanagedClusters%2Fgrace-win&p-_startTime=7days&p-_endTime=now&p-Interval=v-5m&p-AKSClusterID=v-675b8ceeceb9d100010e6fe2#9d5aa5eb-cc6f-46c9-81f6-22c8fe5357e4)
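
The description says the output plugin samples CPU and memory usage and aggregates the samples to p50 and p95. The aggregation step could look roughly like the sketch below. It is not the plugin's code: the `percentile` helper and the sample values are hypothetical, the nearest-rank method is an assumption (the PR does not say which percentile definition is used), and the real plugin obtains its samples from `github.com/shirou/gopsutil/v4/process` rather than a hard-coded slice.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the p-th percentile (0-100) of samples using the
// nearest-rank method. It returns 0 for an empty slice.
// Hypothetical helper for illustration; not taken from the PR.
func percentile(samples []float64, p float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)

	// Nearest rank: the smallest rank whose share of samples is >= p percent.
	rank := int(math.Ceil(p / 100 * float64(len(sorted))))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

func main() {
	// Made-up CPU usage samples (percent) collected over one interval.
	cpu := []float64{12.5, 3.0, 7.25, 40.0, 9.0}
	fmt.Printf("p50=%.2f p95=%.2f\n", percentile(cpu, 50), percentile(cpu, 95))
}
```

Sending only the two aggregates per interval keeps the telemetry volume small while still exposing both typical and near-peak usage.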
1 parent 902617e commit e050670

27 files changed, +1304 -1348 lines changed

.pipelines/azure-pipeline-build.yml

+158 -34
Large diffs are not rendered by default.

.trivyignore

+19 -39
@@ -1,45 +1,25 @@
 # CRITICAL
-# telegraf
-CVE-2024-41110
 # kube-state-metrics
-CVE-2024-24790
+CVE-2024-45337 # golang.org/x/crypto
+CVE-2024-24790 # stdlib
 # HIGH
-# telegraf
-GHSA-7jwh-3vrq-q3m8
-CVE-2024-27289
-CVE-2024-27304
+# otelcollector
+CVE-2024-45338 # golang.org/x/net
+# promconfigvalidator
+CVE-2024-45338 # golang.org/x/net
+# configurationreader
+CVE-2024-45338 # golang.org/x/net
+# targetallocator
+CVE-2024-45338 # golang.org/x/net
 # kube-state-metrics
-CVE-2024-34156
-# node-exporter
-CVE-2023-39325
-CVE-2023-29403
-CVE-2023-39325
-CVE-2023-45283
+CVE-2024-45338 # golang.org/x/net
+CVE-2024-34156 # stdlib
 # MEDIUM
-# telegraf
-CVE-2024-35255
-CVE-2024-28110
-CVE-2024-24557
-CVE-2024-29018
-CVE-2023-45288
-CVE-2024-24786
 # kube-state-metrics
-CVE-2024-24789
-CVE-2024-24791
-CVE-2024-34155
-CVE-2024-34158
-# node-exporter
-CVE-2023-48795
-CVE-2023-3978
-CVE-2023-44487
-CVE-2023-29406
-CVE-2023-29409
-CVE-2023-39318
-CVE-2023-39319
-CVE-2023-39326
-CVE-2023-45284
-CVE-2023-45289
-CVE-2023-45290
-CVE-2024-24783
-CVE-2024-24784
-CVE-2024-24785
+CVE-2023-45288 # stdlib
+CVE-2024-24789 # stdlib
+CVE-2024-24791 # stdlib
+CVE-2024-34155 # stdlib
+CVE-2024-34158 # stdlib
+CVE-2024-45336 # stdlib
+CVE-2024-45341 # stdlib

internal/referenceapp/golang/linux/Dockerfile

+3 -4
@@ -1,5 +1,6 @@
 
-FROM mcr.microsoft.com/cbl-mariner/base/core:2.0 as builder
+ARG GOLANG_VERSION
+FROM mcr.microsoft.com/oss/go/microsoft/golang:${GOLANG_VERSION} as builder
 
 # Set necessary environmet variables needed for our image
 ENV GO111MODULE=on \
@@ -10,12 +11,10 @@ ENV GO111MODULE=on \
 # Move to working directory /build
 WORKDIR /build
 
-ARG GOLANG_VERSION
-
 # Copy and download dependency using go mod
 COPY go.mod .
 COPY go.sum .
-RUN tdnf install -y golang-${GOLANG_VERSION} ca-certificates
+#RUN tdnf install -y golang-${GOLANG_VERSION} ca-certificates
 RUN go mod download
 
 # COPY client-cert.pem /etc/prometheus/certs/

otelcollector/build/linux/Dockerfile

+43 -8
@@ -68,6 +68,39 @@ RUN apt-get update && apt-get install gcc-aarch64-linux-gnu -y
 ARG TARGETOS TARGETARCH
 RUN if [ "$TARGETARCH" = "arm64" ] ; then CC=aarch64-linux-gnu-gcc CGO_ENABLED=1 GOOS=$TARGETOS GOARCH=$TARGETARCH go build -buildmode=exe -ldflags '-linkmode external -extldflags=-Wl,-z,now' -o main.exe ./main.go ; else CGO_ENABLED=1 GOOS=$TARGETOS GOARCH=$TARGETARCH go build -buildmode=exe -ldflags '-linkmode external -extldflags=-Wl,-z,now' -o main.exe ./main.go ; fi
 
+ARG TARGETARCH
+FROM mcr.microsoft.com/cbl-mariner/base/core:2.0 AS fluent-bit-binary-builder
+WORKDIR /
+# Install with the same exact dependencies and code source that Mariner uses
+RUN tdnf install wget tar ca-certificates bison cmake cyrus-sasl-devel doxygen flex gcc-c++ \
+gnutls-devel graphviz libpq-devel libyaml-devel luajit-devel make openssl-devel pkgconfig \
+systemd-devel systemd-rpm-macros zlib-devel build-essential -y
+ARG FLUENT_BIT_VERSION
+RUN wget https://github.com/fluent/fluent-bit/archive/refs/tags/v${FLUENT_BIT_VERSION}.tar.gz
+RUN tar -xvf v${FLUENT_BIT_VERSION}.tar.gz
+# Add a file with settings to build only the plugins we use
+COPY ./fluent-bit/plugins_options.cmake /fluent-bit-${FLUENT_BIT_VERSION}/cmake/plugins_options.cmake
+# Make a change that allows Fluent-Bit metrics to flow to our Go output plugin
+RUN sed -i '/out->type = FLB_OUTPUT_PLUGIN_PROXY;/a \ \ \ \ out->event_type = FLB_OUTPUT_LOGS | FLB_OUTPUT_METRICS;' /fluent-bit-${FLUENT_BIT_VERSION}/src/flb_plugin_proxy.c
+WORKDIR /fluent-bit-${FLUENT_BIT_VERSION}/build
+# Run cmake with the same flags that Mariner uses
+RUN cmake \
+-DCMAKE_BUILD_TYPE=RelWithDebInfo \
+-DFLB_EXAMPLES=Off \
+-DFLB_OUT_SLACK=Off \
+-DFLB_IN_SYSTEMD=On \
+-DFLB_OUT_TD=Off \
+-DFLB_OUT_ES=Off \
+-DFLB_SHARED_LIB=On \
+-DFLB_RELEASE=On \
+-DFLB_DEBUG=Off \
+-DFLB_TLS=On \
+-DFLB_JEMALLOC=On \
+-DFLB_PREFER_SYSTEM_LIBS=On \
+-DFLB_PROXY_GO=On ../
+RUN make
+RUN make install
+
 FROM mcr.microsoft.com/cbl-mariner/base/core:2.0 as builder
 LABEL description="Azure Monitor Prometheus metrics collector"
 LABEL maintainer="[email protected]"
@@ -103,9 +136,11 @@ COPY --from=main-builder --chmod=777 /main/main.exe $tmpdir/main
 
 COPY ./scripts/*.sh $tmpdir/
 COPY ./metricextension/me.config ./metricextension/me_internal.config ./metricextension/me_ds.config ./metricextension/me_ds_internal.config /usr/sbin/
-COPY ./telegraf/ $tmpdir/telegraf/
-COPY ./fluent-bit/fluent-bit.conf ./fluent-bit/fluent-bit-daemonset.conf ./fluent-bit/fluent-bit-parsers.conf $tmpdir/fluent-bit/
+COPY ./fluent-bit/fluent-bit.yaml ./fluent-bit/fluent-bit-daemonset.yaml ./fluent-bit/fluent-bit-parsers.conf $tmpdir/fluent-bit/
 COPY --from=fluent-bit-builder /src/out_appinsights.so $tmpdir/fluent-bit/bin/
+COPY --from=fluent-bit-binary-builder /usr/local/bin/fluent-bit /usr/local/bin/fluent-bit
+COPY --from=fluent-bit-binary-builder /usr/local/etc/fluent-bit /usr/local/etc/fluent-bit
+COPY --from=fluent-bit-binary-builder /usr/local/lib/fluent-bit /usr/local/etc/fluent-bit
 COPY ./react /static/react
 COPY ./LICENSE $tmpdir/microsoft
 COPY ./NOTICE $tmpdir/microsoft
@@ -123,7 +158,7 @@ RUN chmod 775 $tmpdir/*.sh;
 RUN sync;
 RUN $tmpdir/setup.sh ${TARGETARCH}
 # If wanting to run without distroless, uncomment this line and comment everything after
-# CMD [ "/opt/main.sh" ]
+# ENTRYPOINT ["./opt/main"]
 
 FROM mcr.microsoft.com/cbl-mariner/distroless/base:2.0
 # Below is for ContainerInsightsPrometheusCollector-Prod AppInsights Resource
@@ -169,8 +204,6 @@ COPY --from=builder /usr/sbin/MetricsExtension /usr/sbin/MetricsExtension
 COPY --from=builder /usr/bin/inotifywait /usr/bin/inotifywait
 COPY --from=builder /usr/bin/bash /usr/bin/bash
 COPY --from=builder /usr/sbin/busybox /usr/sbin/busybox
-COPY --from=builder /usr/bin/fluent-bit /usr/bin/fluent-bit
-COPY --from=builder /usr/bin/telegraf /usr/bin/telegraf
 COPY --from=builder /usr/sbin/crond /usr/sbin/crond
 COPY --from=builder /usr/bin/vim /usr/bin/vim
 COPY --from=builder /usr/share/vim /usr/share/vim
@@ -183,6 +216,10 @@ COPY --from=builder /bin/sh /bin/sh
 COPY --from=builder /usr/bin/p11-kit /usr/bin
 COPY --from=builder /usr/bin/trust /usr/bin
 
+COPY --from=fluent-bit-binary-builder /usr/local/bin/fluent-bit /usr/local/bin/fluent-bit
+COPY --from=fluent-bit-binary-builder /usr/local/etc/fluent-bit /usr/local/etc/fluent-bit
+COPY --from=fluent-bit-binary-builder /usr/local/lib/fluent-bit /usr/local/etc/fluent-bit
+
 # bash dependencies
 COPY --from=builder /lib/libreadline.so.8 /lib/
 COPY --from=builder /usr/lib/libncursesw.so.6 /usr/lib/libtinfo.so.6 /usr/lib/
@@ -198,9 +235,7 @@ COPY --from=builder /lib/libboost_filesystem.so.1.76.0 /lib/libcpprest.so.2.10
 COPY --from=builder /lib64/libuuid.so.1 /lib64
 # fluent-bit dependencies
 # libssl.so.1.1 & libcrypto.so.1.1 are already available with openssl in distroless and copying them over causes FIPS HMAC verification failures
-COPY --from=builder /lib/libyaml-0.so.2 /lib/libsystemd.so.0 /lib/libcurl.so.4 /lib/libm.so.6 /lib/libz.so.1 /lib/libzstd.so.1 /lib/libsasl2.so.3 /lib/libgcc_s.so.1 /lib/libc.so.6 /lib/liblzma.so.5 /lib/liblz4.so.1 /lib/libcap.so.2 /lib/libgcrypt.so.20 /lib/libnghttp2.so.14 /lib/libssh2.so.1 /lib/libgssapi_krb5.so.2 /lib/libresolv.so.2 /lib/libgpg-error.so.0 /usr/lib/libkrb5.so.3 /usr/lib/libk5crypto.so.3 /usr/lib/libcom_err.so.2 /usr/lib/libkrb5support.so.0 /lib/
-# telegraf dependencies
-COPY --from=builder /lib/libc.so.6 /lib/
+COPY --from=fluent-bit-binary-builder /lib/libluajit-5.1.so.2 /lib/libyaml-0.so.2 /lib/libsystemd.so.0 /lib/libgcc_s.so.1 /lib/libc.so.6 /lib/liblzma.so.5 /lib/libzstd.so.1 /lib/liblz4.so.1 /lib/libcap.so.2 /lib/libgcrypt.so.20 /lib/libgpg-error.so.0 /lib/
 # mdsd dependencies
 COPY --from=builder /usr/lib/libdl.so.2 /usr/lib/librt.so.1 /usr/lib/libpthread.so.0 /usr/lib/libm.so.6 /usr/lib/libstdc++.so.6 /usr/lib/libgcc_s.so.1 /usr/lib/
 # logrotate dependencies
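
The `sed ... /a` step in the `fluent-bit-binary-builder` stage appends one line after the `out->type = FLB_OUTPUT_PLUGIN_PROXY;` assignment in `flb_plugin_proxy.c`. As a rough illustration of that text transformation only, the Go sketch below applies the same "append after matching line" rule to a string. The `insertAfter` helper and the placeholder second line of the excerpt are hypothetical; the build itself uses `sed`, not Go, and the real C file contains more code around the matched line.

```go
package main

import (
	"fmt"
	"strings"
)

// insertAfter returns src with newLine added after every line that contains
// marker, similar in effect to sed's `/marker/a newLine`.
// Hypothetical helper for illustration; not part of the commit.
func insertAfter(src, marker, newLine string) string {
	lines := strings.Split(src, "\n")
	out := make([]string, 0, len(lines)+1)
	for _, line := range lines {
		out = append(out, line)
		if strings.Contains(line, marker) {
			out = append(out, newLine)
		}
	}
	return strings.Join(out, "\n")
}

func main() {
	// Assumed two-line excerpt; the second line is only a placeholder.
	src := "    out->type = FLB_OUTPUT_PLUGIN_PROXY;\n    /* rest of the function */"
	patched := insertAfter(
		src,
		"out->type = FLB_OUTPUT_PLUGIN_PROXY;",
		"    out->event_type = FLB_OUTPUT_LOGS | FLB_OUTPUT_METRICS;",
	)
	fmt.Println(patched)
}
```

With the extra assignment in place, the proxy output plugin declares that it accepts both logs and metrics, which is what lets the scraped Prometheus metrics reach the Go plugin.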

otelcollector/build/windows/Dockerfile

-1
@@ -30,7 +30,6 @@ COPY ./configmapparser/default-prom-configs/*.yml $tmpdir/microsoft/otelcollecto
 COPY ./opentelemetry-collector-builder/otelcollector.exe ./opentelemetry-collector-builder/collector-config-default.yml ./opentelemetry-collector-builder/collector-config-template.yml $tmpdir/microsoft/otelcollector/
 COPY ./prom-config-validator-builder/promconfigvalidator.exe $tmpdir/
 COPY ./metricextension/me.config ./metricextension/me_internal.config ./metricextension/me_ds.config ./metricextension/me_ds_win.config ./metricextension/me_ds_internal.config ./metricextension/me_ds_internal_win.config $tmpdir/metricextension/
-COPY ./telegraf/telegraf-prometheus-collector-windows.conf $tmpdir/telegraf/
 COPY ./fluent-bit/fluent-bit-windows.conf $tmpdir/fluent-bit/
 COPY ./fluent-bit/fluent-bit-parsers.conf $tmpdir/fluent-bit/
 COPY ./fluent-bit/src/out_appinsights.so $tmpdir/fluent-bit/bin/

otelcollector/build/windows/scripts/setup.ps1

+1 -1
@@ -32,7 +32,7 @@ Write-Host ('Installing Fluent Bit');
 try {
 # Keep version in sync with linux in setup.sh file
 # $fluentBitUri = 'https://github.com/microsoft/OMS-docker/releases/download/winakslogagent/td-agent-bit-1.4.0-win64.zip'
-$fluentBitUri = 'https://releases.fluentbit.io/2.1/fluent-bit-2.1.10-win64.zip'
+$fluentBitUri = 'https://releases.fluentbit.io/3.0/fluent-bit-3.0.7-win64.zip'
 Invoke-WebRequest -Uri $fluentBitUri -OutFile /installation/fluent-bit.zip
 Expand-Archive -Path /installation/fluent-bit.zip -Destination /installation/fluent-bit
 Move-Item -Path /installation/fluent-bit/*/bin/* -Destination /opt/fluent-bit/bin/ -ErrorAction SilentlyContinue