[profiling] Reduce copying and allocation in exporter #926

danielsn · 2025-03-13T20:59:59Z

What does this PR do?

Passes the EncodedProfile as a Rust object, rather than forcing it through the C-FFI straw.

Motivation

Previously we had to allocate a new Vec and copy the pprof bytes into it, even though they were already in memory. Passing the Rust object directly saves that allocation and copy (which can be several MB).
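A minimal sketch of the difference, using hypothetical stand-in names (`EncodedProfileSketch`, `take_by_copy`, `take_by_move`) rather than libdatadog's actual API: borrowing the bytes forces a fresh allocation and copy, while passing the owning object by value just moves the existing allocation.

```rust
/// Hypothetical stand-in for an encoded pprof; not libdatadog's real type.
pub struct EncodedProfileSketch {
    pub buffer: Vec<u8>,
}

/// Assumed "before" shape: only a borrowed slice is available, so the callee
/// has to allocate a new Vec and copy every byte into it.
pub fn take_by_copy(bytes: &[u8]) -> Vec<u8> {
    bytes.to_vec()
}

/// Assumed "after" shape: the owning object is passed by value, so the
/// existing heap allocation is moved and no bytes are copied.
pub fn take_by_move(profile: EncodedProfileSketch) -> Vec<u8> {
    profile.buffer
}
```

In the moved case the returned Vec points at the same heap buffer the profile already owned, which is where the saving on multi-MB profiles would come from.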

Additional Notes

Anything else we should know when reviewing?

How to test the change?

Existing tests

pr-commenter · 2025-03-13T21:10:16Z

Benchmarks

Comparison

Benchmark execution time: 2025-03-14 21:26:48

Comparing candidate commit f85951f in PR branch dsn/exporter_avoid_copy with baseline commit b39c6ee in branch main.

Found 0 performance improvements and 1 performance regressions! Performance is the same for 51 metrics, 2 unstable metrics.

scenario:redis/obfuscate_redis_string

🟥 execution_time [+5.047µs; +5.563µs] or [+15.125%; +16.671%]
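Since the baseline table is omitted further down, the implied baseline mean can be recovered from the two forms of the delta. A small sketch of that arithmetic, assuming the percentages are relative to the baseline mean (the function name is illustrative):

```rust
/// Recover the implied baseline value from an absolute delta and the
/// matching relative delta (in percent), assuming pct = delta / baseline * 100.
pub fn implied_baseline(delta_abs: f64, delta_pct: f64) -> f64 {
    delta_abs / (delta_pct / 100.0)
}
```

Both bounds (`implied_baseline(5.047, 15.125)` and `implied_baseline(5.563, 16.671)`) give roughly 33.37µs. The candidate mean of 38.672µs reported for this scenario is about 5.30µs above that, which sits inside the reported interval.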

Candidate

Candidate benchmark details

Group 1

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`f85951f`	1741986835	dsn/exporter_avoid_copy

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
ip_address/quantize_peer_ip_address_benchmark	execution_time	4.914µs	4.991µs ± 0.041µs	4.979µs ± 0.031µs	5.023µs	5.060µs	5.063µs	5.066µs	1.76%	0.361	-1.163	0.81%	0.003µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
ip_address/quantize_peer_ip_address_benchmark	execution_time	[4.985µs; 4.997µs] or [-0.113%; +0.113%]	None	None	None
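The 95% CI column looks consistent with a normal-approximation interval, mean ± 1.96 · sd / √n. That is an assumption about how the report is computed, not something it states. A sketch under that assumption (`ci_95` is an illustrative name):

```rust
/// Normal-approximation 95% confidence interval for a mean:
/// mean ± 1.96 * sd / sqrt(n). Assumed formula, not taken from the bot.
pub fn ci_95(mean: f64, sd: f64, n: u32) -> (f64, f64) {
    let sem = sd / f64::from(n).sqrt();
    let half_width = 1.96 * sem;
    (mean - half_width, mean + half_width)
}
```

With the Group 1 values (mean 4.991µs, sd 0.041µs, 200 samples) this lands on the reported bounds of 4.985µs and 4.997µs to within rounding.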

Group 2

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`f85951f`	1741986835	dsn/exporter_avoid_copy

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
benching deserializing traces from msgpack to their internal representation	execution_time	54.418ms	54.626ms ± 0.202ms	54.584ms ± 0.063ms	54.647ms	54.980ms	55.398ms	56.240ms	3.03%	4.088	23.746	0.37%	0.014ms	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
benching deserializing traces from msgpack to their internal representation	execution_time	[54.598ms; 54.654ms] or [-0.051%; +0.051%]	None	None	None

Group 3

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`f85951f`	1741986835	dsn/exporter_avoid_copy

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
two way interface	execution_time	17.703µs	26.105µs ± 10.995µs	18.061µs ± 0.298µs	34.847µs	44.856µs	46.611µs	90.947µs	403.55%	1.681	5.461	42.01%	0.777µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
two way interface	execution_time	[24.581µs; 27.629µs] or [-5.837%; +5.837%]	None	None	None

Group 4

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`f85951f`	1741986835	dsn/exporter_avoid_copy

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
credit_card/is_card_number/	execution_time	3.897µs	3.914µs ± 0.004µs	3.914µs ± 0.001µs	3.915µs	3.918µs	3.919µs	3.948µs	0.87%	5.091	43.911	0.10%	0.000µs	1	200
credit_card/is_card_number/	throughput	253305407.634op/s	255492424.699op/s ± 260748.388op/s	255513138.319op/s ± 83960.657op/s	255595616.536op/s	255700207.676op/s	255858742.550op/s	256607878.765op/s	0.43%	-5.030	43.424	0.10%	18437.695op/s	1	200
credit_card/is_card_number/ 3782-8224-6310-005	execution_time	81.677µs	83.020µs ± 0.432µs	83.043µs ± 0.308µs	83.334µs	83.716µs	83.817µs	83.934µs	1.07%	-0.284	-0.198	0.52%	0.031µs	1	200
credit_card/is_card_number/ 3782-8224-6310-005	throughput	11914132.386op/s	12045609.723op/s ± 62800.446op/s	12041890.676op/s ± 44668.419op/s	12088704.316op/s	12143457.886op/s	12199167.814op/s	12243337.651op/s	1.67%	0.311	-0.164	0.52%	4440.662op/s	1	200
credit_card/is_card_number/ 378282246310005	execution_time	78.523µs	79.496µs ± 0.393µs	79.443µs ± 0.255µs	79.754µs	80.128µs	80.498µs	80.675µs	1.55%	0.308	-0.072	0.49%	0.028µs	1	200
credit_card/is_card_number/ 378282246310005	throughput	12395401.503op/s	12579569.899op/s ± 62023.246op/s	12587637.685op/s ± 40290.117op/s	12624894.714op/s	12674507.861op/s	12715754.224op/s	12735041.102op/s	1.17%	-0.281	-0.092	0.49%	4385.706op/s	1	200
credit_card/is_card_number/37828224631	execution_time	3.896µs	3.914µs ± 0.003µs	3.914µs ± 0.001µs	3.915µs	3.918µs	3.920µs	3.921µs	0.18%	-1.271	7.382	0.07%	0.000µs	1	200
credit_card/is_card_number/37828224631	throughput	255049327.838op/s	255501095.976op/s ± 185681.949op/s	255508444.647op/s ± 80934.974op/s	255565699.557op/s	255824843.829op/s	255960373.402op/s	256680421.924op/s	0.46%	1.288	7.491	0.07%	13129.696op/s	1	200
credit_card/is_card_number/378282246310005	execution_time	75.368µs	76.557µs ± 0.432µs	76.564µs ± 0.326µs	76.869µs	77.231µs	77.344µs	77.486µs	1.20%	-0.277	-0.352	0.56%	0.031µs	1	200
credit_card/is_card_number/378282246310005	throughput	12905568.576op/s	13062507.074op/s ± 73810.616op/s	13061043.496op/s ± 55612.892op/s	13118247.491op/s	13194095.342op/s	13243937.606op/s	13268295.054op/s	1.59%	0.303	-0.326	0.56%	5219.199op/s	1	200
credit_card/is_card_number/37828224631000521389798	execution_time	51.337µs	51.436µs ± 0.037µs	51.435µs ± 0.022µs	51.457µs	51.511µs	51.548µs	51.557µs	0.24%	0.522	1.037	0.07%	0.003µs	1	200
credit_card/is_card_number/37828224631000521389798	throughput	19396087.401op/s	19441482.863op/s ± 14102.447op/s	19441903.703op/s ± 8480.490op/s	19450891.848op/s	19461385.897op/s	19472039.284op/s	19479028.878op/s	0.19%	-0.516	1.029	0.07%	997.194op/s	1	200
credit_card/is_card_number/x371413321323331	execution_time	6.026µs	6.038µs ± 0.006µs	6.038µs ± 0.003µs	6.041µs	6.046µs	6.049µs	6.098µs	0.99%	4.137	39.085	0.10%	0.000µs	1	200
credit_card/is_card_number/x371413321323331	throughput	163978238.820op/s	165604649.786op/s ± 172219.853op/s	165605928.693op/s ± 75097.614op/s	165680790.096op/s	165879988.022op/s	165924761.050op/s	165939196.311op/s	0.20%	-4.063	38.211	0.10%	12177.783op/s	1	200
credit_card/is_card_number_no_luhn/	execution_time	3.893µs	3.914µs ± 0.003µs	3.914µs ± 0.001µs	3.915µs	3.919µs	3.920µs	3.922µs	0.22%	-1.085	8.625	0.08%	0.000µs	1	200
credit_card/is_card_number_no_luhn/	throughput	254953666.959op/s	255518685.533op/s ± 202574.991op/s	255521491.022op/s ± 96491.703op/s	255623185.028op/s	255747801.883op/s	255987246.065op/s	256847069.577op/s	0.52%	1.108	8.769	0.08%	14324.215op/s	1	200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	execution_time	66.027µs	66.328µs ± 0.129µs	66.313µs ± 0.090µs	66.418µs	66.545µs	66.652µs	66.680µs	0.55%	0.318	-0.187	0.19%	0.009µs	1	200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	throughput	14997094.437op/s	15076685.822op/s ± 29382.090op/s	15080061.896op/s ± 20352.638op/s	15097878.371op/s	15121341.134op/s	15135484.295op/s	15145228.195op/s	0.43%	-0.308	-0.196	0.19%	2077.628op/s	1	200
credit_card/is_card_number_no_luhn/ 378282246310005	execution_time	59.399µs	59.571µs ± 0.070µs	59.562µs ± 0.044µs	59.610µs	59.697µs	59.784µs	59.848µs	0.48%	0.781	1.415	0.12%	0.005µs	1	200
credit_card/is_card_number_no_luhn/ 378282246310005	throughput	16708951.254op/s	16786713.860op/s ± 19665.883op/s	16789128.200op/s ± 12370.565op/s	16800631.704op/s	16812601.974op/s	16825131.787op/s	16835345.809op/s	0.28%	-0.771	1.391	0.12%	1390.588op/s	1	200
credit_card/is_card_number_no_luhn/37828224631	execution_time	3.892µs	3.914µs ± 0.003µs	3.913µs ± 0.002µs	3.915µs	3.919µs	3.920µs	3.931µs	0.46%	-0.366	11.726	0.08%	0.000µs	1	200
credit_card/is_card_number_no_luhn/37828224631	throughput	254373512.671op/s	255508371.755op/s ± 213920.685op/s	255532922.608op/s ± 99771.699op/s	255617228.601op/s	255757381.632op/s	255916940.591op/s	256912282.246op/s	0.54%	0.401	11.832	0.08%	15126.477op/s	1	200
credit_card/is_card_number_no_luhn/378282246310005	execution_time	56.191µs	56.465µs ± 0.132µs	56.456µs ± 0.079µs	56.529µs	56.727µs	56.843µs	56.868µs	0.73%	0.612	0.307	0.23%	0.009µs	1	200
credit_card/is_card_number_no_luhn/378282246310005	throughput	17584439.391op/s	17710181.154op/s ± 41231.821op/s	17712802.530op/s ± 24706.911op/s	17738992.060op/s	17767595.909op/s	17783850.804op/s	17796373.099op/s	0.47%	-0.598	0.283	0.23%	2915.530op/s	1	200
credit_card/is_card_number_no_luhn/37828224631000521389798	execution_time	51.350µs	51.437µs ± 0.031µs	51.435µs ± 0.016µs	51.451µs	51.484µs	51.510µs	51.669µs	0.46%	2.174	15.384	0.06%	0.002µs	1	200
credit_card/is_card_number_no_luhn/37828224631000521389798	throughput	19353895.507op/s	19441449.132op/s ± 11637.633op/s	19442063.623op/s ± 6172.951op/s	19448208.449op/s	19455305.523op/s	19469944.889op/s	19474227.807op/s	0.17%	-2.152	15.190	0.06%	822.905op/s	1	200
credit_card/is_card_number_no_luhn/x371413321323331	execution_time	6.027µs	6.038µs ± 0.004µs	6.038µs ± 0.002µs	6.040µs	6.045µs	6.047µs	6.051µs	0.21%	-0.020	0.267	0.07%	0.000µs	1	200
credit_card/is_card_number_no_luhn/x371413321323331	throughput	165270500.204op/s	165619331.760op/s ± 111641.387op/s	165619236.674op/s ± 65228.631op/s	165684747.748op/s	165820279.824op/s	165863287.632op/s	165908340.651op/s	0.17%	0.025	0.266	0.07%	7894.238op/s	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
credit_card/is_card_number/	execution_time	[3.913µs; 3.915µs] or [-0.014%; +0.014%]	None	None	None
credit_card/is_card_number/	throughput	[255456287.480op/s; 255528561.917op/s] or [-0.014%; +0.014%]	None	None	None
credit_card/is_card_number/ 3782-8224-6310-005	execution_time	[82.960µs; 83.080µs] or [-0.072%; +0.072%]	None	None	None
credit_card/is_card_number/ 3782-8224-6310-005	throughput	[12036906.186op/s; 12054313.261op/s] or [-0.072%; +0.072%]	None	None	None
credit_card/is_card_number/ 378282246310005	execution_time	[79.441µs; 79.550µs] or [-0.068%; +0.068%]	None	None	None
credit_card/is_card_number/ 378282246310005	throughput	[12570974.073op/s; 12588165.724op/s] or [-0.068%; +0.068%]	None	None	None
credit_card/is_card_number/37828224631	execution_time	[3.913µs; 3.914µs] or [-0.010%; +0.010%]	None	None	None
credit_card/is_card_number/37828224631	throughput	[255475362.244op/s; 255526829.708op/s] or [-0.010%; +0.010%]	None	None	None
credit_card/is_card_number/378282246310005	execution_time	[76.498µs; 76.617µs] or [-0.078%; +0.078%]	None	None	None
credit_card/is_card_number/378282246310005	throughput	[13052277.633op/s; 13072736.516op/s] or [-0.078%; +0.078%]	None	None	None
credit_card/is_card_number/37828224631000521389798	execution_time	[51.431µs; 51.442µs] or [-0.010%; +0.010%]	None	None	None
credit_card/is_card_number/37828224631000521389798	throughput	[19439528.400op/s; 19443437.327op/s] or [-0.010%; +0.010%]	None	None	None
credit_card/is_card_number/x371413321323331	execution_time	[6.038µs; 6.039µs] or [-0.014%; +0.014%]	None	None	None
credit_card/is_card_number/x371413321323331	throughput	[165580781.771op/s; 165628517.802op/s] or [-0.014%; +0.014%]	None	None	None
credit_card/is_card_number_no_luhn/	execution_time	[3.913µs; 3.914µs] or [-0.011%; +0.011%]	None	None	None
credit_card/is_card_number_no_luhn/	throughput	[255490610.588op/s; 255546760.479op/s] or [-0.011%; +0.011%]	None	None	None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	execution_time	[66.310µs; 66.346µs] or [-0.027%; +0.027%]	None	None	None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	throughput	[15072613.747op/s; 15080757.897op/s] or [-0.027%; +0.027%]	None	None	None
credit_card/is_card_number_no_luhn/ 378282246310005	execution_time	[59.561µs; 59.581µs] or [-0.016%; +0.016%]	None	None	None
credit_card/is_card_number_no_luhn/ 378282246310005	throughput	[16783988.358op/s; 16789439.362op/s] or [-0.016%; +0.016%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631	execution_time	[3.913µs; 3.914µs] or [-0.012%; +0.012%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631	throughput	[255478724.406op/s; 255538019.105op/s] or [-0.012%; +0.012%]	None	None	None
credit_card/is_card_number_no_luhn/378282246310005	execution_time	[56.447µs; 56.483µs] or [-0.032%; +0.032%]	None	None	None
credit_card/is_card_number_no_luhn/378282246310005	throughput	[17704466.820op/s; 17715895.488op/s] or [-0.032%; +0.032%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631000521389798	execution_time	[51.432µs; 51.441µs] or [-0.008%; +0.008%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631000521389798	throughput	[19439836.268op/s; 19443061.996op/s] or [-0.008%; +0.008%]	None	None	None
credit_card/is_card_number_no_luhn/x371413321323331	execution_time	[6.037µs; 6.039µs] or [-0.009%; +0.009%]	None	None	None
credit_card/is_card_number_no_luhn/x371413321323331	throughput	[165603859.338op/s; 165634804.183op/s] or [-0.009%; +0.009%]	None	None	None

Group 5

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`f85951f`	1741986835	dsn/exporter_avoid_copy

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
normalization/normalize_trace/test_trace	execution_time	246.127ns	253.037ns ± 9.518ns	249.356ns ± 1.574ns	251.902ns	277.392ns	283.119ns	285.315ns	14.42%	2.172	3.289	3.75%	0.673ns	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
normalization/normalize_trace/test_trace	execution_time	[251.718ns; 254.356ns] or [-0.521%; +0.521%]	None	None	None

Group 6

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`f85951f`	1741986835	dsn/exporter_avoid_copy

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
benching string interning on wordpress profile	execution_time	148.744µs	149.641µs ± 0.427µs	149.564µs ± 0.178µs	149.788µs	150.311µs	151.276µs	152.812µs	2.17%	2.963	16.894	0.28%	0.030µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
benching string interning on wordpress profile	execution_time	[149.582µs; 149.700µs] or [-0.040%; +0.040%]	None	None	None

Group 7

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`f85951f`	1741986835	dsn/exporter_avoid_copy

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
redis/obfuscate_redis_string	execution_time	37.961µs	38.672µs ± 1.168µs	38.133µs ± 0.066µs	38.268µs	41.233µs	41.272µs	41.986µs	10.10%	1.707	0.975	3.01%	0.083µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
redis/obfuscate_redis_string	execution_time	[38.510µs; 38.834µs] or [-0.419%; +0.419%]	None	None	None

Group 8

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`f85951f`	1741986835	dsn/exporter_avoid_copy

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	execution_time	504.741µs	505.804µs ± 0.591µs	505.774µs ± 0.226µs	505.988µs	506.404µs	506.618µs	512.424µs	1.31%	7.026	76.886	0.12%	0.042µs	1	200
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	throughput	1951510.599op/s	1977054.096op/s ± 2291.575op/s	1977165.930op/s ± 883.047op/s	1978146.373op/s	1979253.280op/s	1980289.778op/s	1981215.654op/s	0.20%	-6.923	75.387	0.12%	162.039op/s	1	200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	execution_time	452.275µs	453.256µs ± 0.417µs	453.270µs ± 0.312µs	453.548µs	453.928µs	454.327µs	454.427µs	0.26%	0.182	-0.196	0.09%	0.029µs	1	200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	throughput	2200573.712op/s	2206262.881op/s ± 2029.601op/s	2206191.058op/s ± 1520.768op/s	2207761.629op/s	2209491.171op/s	2210391.266op/s	2211045.388op/s	0.22%	-0.177	-0.200	0.09%	143.514op/s	1	200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	execution_time	175.645µs	176.639µs ± 0.351µs	176.629µs ± 0.218µs	176.854µs	177.239µs	177.350µs	177.380µs	0.43%	-0.123	-0.133	0.20%	0.025µs	1	200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	throughput	5637610.286op/s	5661295.396op/s ± 11248.009op/s	5661581.325op/s ± 6999.433op/s	5668399.331op/s	5680023.579op/s	5688703.462op/s	5693308.155op/s	0.56%	0.134	-0.124	0.20%	795.354op/s	1	200
normalization/normalize_service/normalize_service/[empty string]	execution_time	37.534µs	37.634µs ± 0.043µs	37.632µs ± 0.026µs	37.660µs	37.706µs	37.744µs	37.771µs	0.37%	0.291	0.376	0.11%	0.003µs	1	200
normalization/normalize_service/normalize_service/[empty string]	throughput	26475260.793op/s	26571661.763op/s ± 30110.329op/s	26573160.044op/s ± 18285.812op/s	26590349.275op/s	26617366.569op/s	26640081.394op/s	26642803.802op/s	0.26%	-0.284	0.370	0.11%	2129.122op/s	1	200
normalization/normalize_service/normalize_service/test_ASCII	execution_time	48.083µs	48.264µs ± 0.239µs	48.105µs ± 0.014µs	48.531µs	48.641µs	48.708µs	49.503µs	2.91%	1.298	2.028	0.49%	0.017µs	1	200
normalization/normalize_service/normalize_service/test_ASCII	throughput	20200661.383op/s	20719891.151op/s ± 101756.147op/s	20787753.044op/s ± 5838.513op/s	20792068.123op/s	20794711.014op/s	20795766.647op/s	20797266.645op/s	0.05%	-1.264	1.782	0.49%	7195.246op/s	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	execution_time	[505.722µs; 505.886µs] or [-0.016%; +0.016%]	None	None	None
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	throughput	[1976736.506op/s; 1977371.686op/s] or [-0.016%; +0.016%]	None	None	None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	execution_time	[453.198µs; 453.313µs] or [-0.013%; +0.013%]	None	None	None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	throughput	[2205981.598op/s; 2206544.164op/s] or [-0.013%; +0.013%]	None	None	None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	execution_time	[176.590µs; 176.687µs] or [-0.028%; +0.028%]	None	None	None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	throughput	[5659736.530op/s; 5662854.262op/s] or [-0.028%; +0.028%]	None	None	None
normalization/normalize_service/normalize_service/[empty string]	execution_time	[37.628µs; 37.640µs] or [-0.016%; +0.016%]	None	None	None
normalization/normalize_service/normalize_service/[empty string]	throughput	[26567488.761op/s; 26575834.765op/s] or [-0.016%; +0.016%]	None	None	None
normalization/normalize_service/normalize_service/test_ASCII	execution_time	[48.231µs; 48.297µs] or [-0.068%; +0.068%]	None	None	None
normalization/normalize_service/normalize_service/test_ASCII	throughput	[20705788.728op/s; 20733993.574op/s] or [-0.068%; +0.068%]	None	None	None

Group 9

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`f85951f`	1741986835	dsn/exporter_avoid_copy

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
tags/replace_trace_tags	execution_time	2.330µs	2.390µs ± 0.022µs	2.383µs ± 0.012µs	2.409µs	2.431µs	2.436µs	2.438µs	2.33%	-0.193	0.179	0.93%	0.002µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
tags/replace_trace_tags	execution_time	[2.387µs; 2.393µs] or [-0.130%; +0.130%]	None	None	None

Group 10

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`f85951f`	1741986835	dsn/exporter_avoid_copy

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
concentrator/add_spans_to_concentrator	execution_time	6.095ms	6.106ms ± 0.011ms	6.105ms ± 0.003ms	6.109ms	6.114ms	6.119ms	6.226ms	1.97%	7.934	82.447	0.17%	0.001ms	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
concentrator/add_spans_to_concentrator	execution_time	[6.105ms; 6.108ms] or [-0.024%; +0.024%]	None	None	None

Group 11

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`f85951f`	1741986835	dsn/exporter_avoid_copy

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
write only interface	execution_time	1.178µs	3.182µs ± 1.416µs	2.993µs ± 0.030µs	3.014µs	3.624µs	13.872µs	14.834µs	395.70%	7.386	55.549	44.39%	0.100µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
write only interface	execution_time	[2.986µs; 3.378µs] or [-6.167%; +6.167%]	None	None	None

Group 12

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`f85951f`	1741986835	dsn/exporter_avoid_copy

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	execution_time	208.583µs	209.038µs ± 0.211µs	209.022µs ± 0.119µs	209.148µs	209.354µs	209.525µs	210.689µs	0.80%	2.451	17.416	0.10%	0.015µs	1	200
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	throughput	4746332.898op/s	4783829.662op/s ± 4819.299op/s	4784180.628op/s ± 2720.814op/s	4786626.962op/s	4790060.879op/s	4793073.600op/s	4794256.481op/s	0.21%	-2.411	17.033	0.10%	340.776op/s	1	200
normalization/normalize_name/normalize_name/bad-name	execution_time	18.233µs	18.322µs ± 0.042µs	18.313µs ± 0.023µs	18.347µs	18.392µs	18.423µs	18.566µs	1.38%	1.230	4.912	0.23%	0.003µs	1	200
normalization/normalize_name/normalize_name/bad-name	throughput	53860947.328op/s	54580630.867op/s ± 124270.401op/s	54606243.114op/s ± 68944.921op/s	54654320.608op/s	54749431.520op/s	54840349.585op/s	54845533.776op/s	0.44%	-1.193	4.691	0.23%	8787.244op/s	1	200
normalization/normalize_name/normalize_name/good	execution_time	10.689µs	10.750µs ± 0.035µs	10.750µs ± 0.019µs	10.769µs	10.804µs	10.826µs	10.951µs	1.87%	1.044	4.633	0.32%	0.002µs	1	200
normalization/normalize_name/normalize_name/good	throughput	91316246.425op/s	93021537.773op/s ± 300813.650op/s	93024720.248op/s ± 160751.504op/s	93181898.428op/s	93482910.812op/s	93539269.966op/s	93549947.036op/s	0.56%	-0.991	4.326	0.32%	21270.737op/s	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	execution_time	[209.008µs; 209.067µs] or [-0.014%; +0.014%]	None	None	None
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	throughput	[4783161.754op/s; 4784497.571op/s] or [-0.014%; +0.014%]	None	None	None
normalization/normalize_name/normalize_name/bad-name	execution_time	[18.316µs; 18.327µs] or [-0.032%; +0.032%]	None	None	None
normalization/normalize_name/normalize_name/bad-name	throughput	[54563408.184op/s; 54597853.549op/s] or [-0.032%; +0.032%]	None	None	None
normalization/normalize_name/normalize_name/good	execution_time	[10.745µs; 10.755µs] or [-0.045%; +0.045%]	None	None	None
normalization/normalize_name/normalize_name/good	throughput	[92979847.894op/s; 93063227.651op/s] or [-0.045%; +0.045%]	None	None	None

Group 13

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`f85951f`	1741986835	dsn/exporter_avoid_copy

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
sql/obfuscate_sql_string	execution_time	66.337µs	66.583µs ± 0.222µs	66.523µs ± 0.073µs	66.619µs	66.909µs	67.233µs	68.638µs	3.18%	4.742	36.677	0.33%	0.016µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
sql/obfuscate_sql_string	execution_time	[66.552µs; 66.614µs] or [-0.046%; +0.046%]	None	None	None

Baseline

Omitted due to size.

codecov-commenter · 2025-03-13T22:08:17Z

Codecov Report

Attention: Patch coverage is 80.64516% with 18 lines in your changes missing coverage. Please review.

Project coverage is 72.79%. Comparing base (b39c6ee) to head (f85951f).

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #926   +/-   ##
=======================================
  Coverage   72.79%   72.79%           
=======================================
  Files         334      334           
  Lines       50916    50882   -34     
=======================================
- Hits        37062    37041   -21     
+ Misses      13854    13841   -13

Components	Coverage Δ
crashtracker	`42.88% <ø> (+0.02%)`	⬆️
crashtracker-ffi	`6.25% <ø> (ø)`
datadog-alloc	`98.73% <ø> (ø)`
data-pipeline	`91.81% <ø> (ø)`
data-pipeline-ffi	`90.28% <ø> (ø)`
ddcommon	`81.93% <ø> (+0.55%)`	⬆️
ddcommon-ffi	`67.57% <ø> (+1.46%)`	⬆️
ddtelemetry	`61.87% <ø> (ø)`
ddtelemetry-ffi	`22.46% <ø> (ø)`
dogstatsd	`89.70% <ø> (ø)`
dogstatsd-client	`82.57% <ø> (ø)`
ipc	`82.51% <ø> (+0.10%)`	⬆️
profiling	`81.74% <80.64%> (-0.21%)`	⬇️
profiling-ffi	`68.86% <56.09%> (-1.82%)`	⬇️
serverless	`0.00% <ø> (ø)`
sidecar	`40.97% <ø> (ø)`
sidecar-ffi	`1.17% <ø> (ø)`
spawn-worker	`54.37% <ø> (ø)`
tinybytes	`91.24% <ø> (ø)`
trace-mini-agent	`74.66% <ø> (ø)`
trace-normalization	`98.24% <ø> (ø)`
trace-obfuscation	`96.00% <ø> (ø)`
trace-protobuf	`78.13% <ø> (ø)`
trace-utils	`93.11% <ø> (ø)`


ivoanjo

Gave it a pass!

Overall, I like this PR less for the copying/allocation reduction and more because I think it's very useful to have a direct link between the profile and the exporter. In the past, for instance, we've had to expose the ProfiledEndpointsStats because it was not encoded in the pprof. With this change, we can trivially report more things (metrics? other info?) that also don't get encoded in the pprof.

ivoanjo · 2025-03-14T09:05:44Z

profiling-ffi/src/profiles.rs

-pub struct EncodedProfile {
-    start: Timespec,
-    end: Timespec,
-    buffer: ddcommon_ffi::Vec<u8>,
-    endpoints_stats: Box<ProfiledEndpointsStats>,
-}


Hmmm removing this leaves us in a weird in-between state -- I did find it useful to have the start/end here because in some cases I relied on libdatadog's defaults.

Providing start/end time is currently optional on a number of our APIs, so if we remove them here, I think it's worth also making them non-optional on the other APIs -- particularly in ddog_prof_Profile_serialize (make non-optional) and possibly in ddog_prof_Profile_reset/ddog_prof_Profile_new (remove them?).

I'd (mildly) rather do that on another PR so we can figure out the API we want, and when to use the current time vs requiring a time be passed in.
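To make the trade-off concrete, here is a rough sketch of the "optional with a default" behaviour being discussed. It assumes the default for a missing end time is the current time, and it uses `std::time::SystemTime` plus a hypothetical helper name rather than the real FFI types:

```rust
use std::time::SystemTime;

/// Hypothetical helper: use the caller-supplied end time if there is one,
/// otherwise fall back to "now" (assumed default).
pub fn resolve_end_time(end: Option<SystemTime>) -> SystemTime {
    end.unwrap_or_else(SystemTime::now)
}
```

Making the parameter non-optional would remove this fallback and push the choice of timestamp onto every caller.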

ivoanjo · 2025-03-14T09:09:23Z

profiling/src/exporter/mod.rs

    pub fn build(
        &self,
-        start: DateTime<Utc>,
-        end: DateTime<Utc>,
+        profile: EncodedProfile,
        files_to_compress_and_export: &[File],
        files_to_export_unmodified: &[File],
        additional_tags: Option<&Vec<Tag>>,
-        endpoint_counts: Option<&ProfiledEndpointsStats>,
        internal_metadata: Option<serde_json::Value>,
        info: Option<serde_json::Value>,
    ) -> anyhow::Result<Request> {


Consider making profile optional? That enables experimentation, e.g. if I don't want to send a profile, or want to post-process the profile before sending it. (It's not a very strong use case, but making it optional seems quite simple, and otherwise it'll be annoying to have to create "dummy" profiles just to get around this.)

3786ad1 (#926)

It makes the code a bit uglier, since we have to handle the Option in some places. LMK if you think it's worth it.
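For reference, a simplified sketch of what an optional profile parameter could look like. All names here (`ProfileBytes`, `RequestSketch`, `build_sketch`, the `profile.pprof` part name) are hypothetical and much smaller than the real `build` signature:

```rust
/// Hypothetical stand-in for an already-encoded pprof.
pub struct ProfileBytes {
    pub buffer: Vec<u8>,
}

/// Hypothetical stand-in for a request: just a list of named file parts.
pub struct RequestSketch {
    pub parts: Vec<(String, Vec<u8>)>,
}

/// Build a request from an optional profile plus any other files.
pub fn build_sketch(
    profile: Option<ProfileBytes>,
    other_files: Vec<(String, Vec<u8>)>,
) -> RequestSketch {
    let mut parts = Vec::with_capacity(other_files.len() + 1);
    // The Option is handled in one place: attach the profile only if present,
    // moving its buffer rather than copying it.
    if let Some(profile) = profile {
        parts.push(("profile.pprof".to_string(), profile.buffer));
    }
    parts.extend(other_files);
    RequestSketch { parts }
}
```

Callers that want to skip the profile pass `None` instead of constructing a dummy one; the cost is the extra `if let` wherever the profile is consumed.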


r1viollet · 2025-03-14T20:46:43Z

Artifact Size Benchmark Report

aarch64-alpine-linux-musl

Artifact	Baseline	Commit	Change
/aarch64-alpine-linux-musl/lib/libdatadog_profiling.a	84.34 MB	84.37 MB	+.02% (+22.95 KB) 🔍
/aarch64-alpine-linux-musl/lib/libdatadog_profiling.so	8.60 MB	8.60 MB	+0% (+224 B) 👌
/aarch64-alpine-linux-musl/lib/libdatadog_profiling.so.debug	26.69 MB	26.70 MB	+.04% (+12.29 KB) 🔍

aarch64-apple-darwin

Artifact	Baseline	Commit	Change
/aarch64-apple-darwin/lib/libdatadog_profiling.a	47.51 MB	47.54 MB	+.04% (+23.34 KB) 🔍
/aarch64-apple-darwin/lib/libdatadog_profiling.dylib	8.89 MB	8.89 MB	+0% (+224 B) 👌

aarch64-unknown-linux-gnu

Artifact	Baseline	Commit	Change
/aarch64-unknown-linux-gnu/lib/libdatadog_profiling.so	8.54 MB	8.54 MB	+0% (+224 B) 👌
/aarch64-unknown-linux-gnu/lib/libdatadog_profiling.so.debug	25.35 MB	25.37 MB	+.04% (+11.65 KB) 🔍
/aarch64-unknown-linux-gnu/lib/libdatadog_profiling.a	79.00 MB	79.02 MB	+.02% (+20.95 KB) 🔍

i686-alpine-linux-musl

Artifact	Baseline	Commit	Change
/i686-alpine-linux-musl/lib/libdatadog_profiling.a	73.13 MB	73.16 MB	+.03% (+24.72 KB) 🔍
/i686-alpine-linux-musl/lib/libdatadog_profiling.so	9.15 MB	9.15 MB	+0% (+136 B) 👌
/i686-alpine-linux-musl/lib/libdatadog_profiling.so.debug	25.88 MB	25.89 MB	+.04% (+12.92 KB) 🔍

i686-unknown-linux-gnu

Artifact	Baseline	Commit	Change
/i686-unknown-linux-gnu/lib/libdatadog_profiling.a	74.82 MB	74.85 MB	+.03% (+26.61 KB) 🔍
/i686-unknown-linux-gnu/lib/libdatadog_profiling.so	9.04 MB	9.04 MB	+0% (+128 B) 👌
/i686-unknown-linux-gnu/lib/libdatadog_profiling.so.debug	23.53 MB	23.54 MB	+.05% (+13.54 KB) 🔍

libdatadog-x64-windows

Artifact	Baseline	Commit	Change
/libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.dll	19.25 MB	19.26 MB	+.04% (+8.00 KB) 🔍
/libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.lib	54.81 KB	55.15 KB	+.60% (+340 B) 🔍
/libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.pdb	133.75 MB	133.82 MB	+.05% (+72.00 KB) 🔍
/libdatadog-x64-windows/debug/static/datadog_profiling_ffi.lib	861.05 MB	861.74 MB	+.07% (+701.43 KB) 🔍
/libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.dll	5.88 MB	5.88 MB	+.01% (+1.00 KB) 🔍
/libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.lib	54.81 KB	55.15 KB	+.60% (+340 B) 🔍
/libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.pdb	17.87 MB	17.87 MB	+.04% (+8.00 KB) 🔍
/libdatadog-x64-windows/release/static/datadog_profiling_ffi.lib	30.17 MB	30.17 MB	+.02% (+7.27 KB) 🔍

libdatadog-x86-windows

Artifact	Baseline	Commit	Change
/libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.dll	16.40 MB	16.40 MB	+.04% (+7.00 KB) 🔍
/libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.lib	55.66 KB	55.99 KB	+.60% (+344 B) 🔍
/libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.pdb	135.98 MB	136.23 MB	+.18% (+264.00 KB) 🔍
/libdatadog-x86-windows/debug/static/datadog_profiling_ffi.lib	852.94 MB	852.58 MB	-.04% (-360.96 KB) 💪
/libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.dll	4.47 MB	4.47 MB	+.03% (+1.50 KB) 🔍
/libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.lib	55.66 KB	55.99 KB	+.60% (+344 B) 🔍
/libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.pdb	18.44 MB	18.44 MB	0% (0 B) 👌
/libdatadog-x86-windows/release/static/datadog_profiling_ffi.lib	27.70 MB	27.70 MB	+.02% (+6.04 KB) 🔍

x86_64-alpine-linux-musl

Artifact	Baseline	Commit	Change
/x86_64-alpine-linux-musl/lib/libdatadog_profiling.a	73.13 MB	73.16 MB	+.03% (+24.72 KB) 🔍
/x86_64-alpine-linux-musl/lib/libdatadog_profiling.so	9.15 MB	9.15 MB	+0% (+136 B) 👌
/x86_64-alpine-linux-musl/lib/libdatadog_profiling.so.debug	25.88 MB	25.89 MB	+.04% (+12.92 KB) 🔍

x86_64-apple-darwin

Artifact	Baseline	Commit	Change
/x86_64-apple-darwin/lib/libdatadog_profiling.a	47.51 MB	47.54 MB	+.04% (+23.34 KB) 🔍
/x86_64-apple-darwin/lib/libdatadog_profiling.dylib	8.89 MB	8.89 MB	+0% (+224 B) 👌

x86_64-unknown-linux-gnu

Artifact	Baseline	Commit	Change
/x86_64-unknown-linux-gnu/lib/libdatadog_profiling.a	74.82 MB	74.85 MB	+.03% (+26.61 KB) 🔍
/x86_64-unknown-linux-gnu/lib/libdatadog_profiling.so	9.04 MB	9.04 MB	+0% (+128 B) 👌
/x86_64-unknown-linux-gnu/lib/libdatadog_profiling.so.debug	23.53 MB	23.54 MB	+.05% (+13.54 KB) 🔍

github-actions bot added the profiling Relates to the profiling* modules. label Mar 13, 2025

danielsn force-pushed the dsn/exporter_avoid_copy branch from 8945950 to 0a70e3e Compare March 13, 2025 21:52

danielsn changed the title ~~DRAFT [profiling] Reduce copying and allocation in exporter~~ [profiling] Reduce copying and allocation in exporter Mar 13, 2025

danielsn force-pushed the dsn/exporter_avoid_copy branch from 0a70e3e to dfa6eb3 Compare March 13, 2025 21:53

[profiling] Reduce copying and allocation in exporter

606d179

danielsn force-pushed the dsn/exporter_avoid_copy branch from dfa6eb3 to 606d179 Compare March 13, 2025 21:54

danielsn marked this pull request as ready for review March 13, 2025 21:54

danielsn requested review from a team as code owners March 13, 2025 21:54

ivoanjo reviewed Mar 14, 2025

danielsn added 3 commits March 14, 2025 14:29

provide a way for us mortals outside of libdatadog to get back the profile bytes for tests

2532499

What are the ownership semantics after a profile gets passed in here

95d5ae6

Consider perhaps making profile optional

3786ad1

danielsn requested a review from a team as a code owner March 14, 2025 19:08

Merge branch 'main' into dsn/exporter_avoid_copy

5f90f56

danielsn added 2 commits March 14, 2025 17:10

don't unwrap

b667cde

map error

f85951f
