[Profiler] Implement interning API #917

danielsn · 2025-03-11T13:47:04Z

What does this PR do?

Adds a new API for adding values to the profiler which interns the value, and then returns a handle to the interned value.

Motivation

The current API has two notable downsides:

It requires the creation of a complex nested data structure.
It does not offer any way to take advantage of the interning capabilities of the profiler. Even if we know that two labels / strings / stack frames are the same, we re-add them every time.

This new API follows more of a builder pattern: you give a value, intern it, and then get a handle to the interned value which you can then reuse as needed.
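To make the "intern once, reuse the handle" idea concrete, here is a minimal sketch in plain Rust. It is not the API added by this PR: `Interner`, `StringHandle`, `intern`, `resolve`, and `len` are hypothetical names chosen for illustration, and a real profiler would intern more than strings.

```rust
use std::collections::HashMap;

/// Opaque handle to an interned string (hypothetical name, not the PR's type).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct StringHandle(u32);

/// Toy interner: stores each distinct string once and hands out handles.
#[derive(Default)]
pub struct Interner {
    lookup: HashMap<String, StringHandle>,
    values: Vec<String>,
}

impl Interner {
    /// Interns `s`, returning the existing handle if `s` was seen before.
    pub fn intern(&mut self, s: &str) -> StringHandle {
        if let Some(&handle) = self.lookup.get(s) {
            return handle;
        }
        let handle = StringHandle(self.values.len() as u32);
        self.values.push(s.to_owned());
        self.lookup.insert(s.to_owned(), handle);
        handle
    }

    /// Resolves a handle back to the string it refers to.
    pub fn resolve(&self, handle: StringHandle) -> &str {
        &self.values[handle.0 as usize]
    }

    /// Number of distinct strings stored so far.
    pub fn len(&self) -> usize {
        self.values.len()
    }
}
```

A caller that sees the same function name on every sample would call `intern` once and pass the returned handle around afterwards, so the string is hashed and stored a single time.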

Additional Notes

The new API is roughly twice as fast as the old one on the C/C++ examples, which do similar work:

./profiles  10.85s user 0.35s system 98% cpu 11.366 total
./profile_intern  4.82s user 0.26s system 96% cpu 5.244 total

How to test the change?

Run the new C++ example file (examples/ffi/profile_intern.cpp), which drives the API.

pr-commenter · 2025-03-11T13:52:22Z

Benchmarks

Comparison

Benchmark execution time: 2025-03-14 20:03:14

Comparing candidate commit d4bdb5c in PR branch dsn/r_and_d_week_mar_2024 with baseline commit b39c6ee in branch main.

Found 0 performance improvements and 6 performance regressions! Performance is the same for 46 metrics, 2 unstable metrics.

scenario:credit_card/is_card_number/378282246310005

🟥 execution_time [+8.692µs; +8.930µs] or [+11.388%; +11.701%]
🟥 throughput [-1373105.920op/s; -1338040.975op/s] or [-10.480%; -10.212%]

scenario:credit_card/is_card_number/x371413321323331

🟥 execution_time [+508.297ns; +529.308ns] or [+8.418%; +8.766%]
🟥 throughput [-13325672.589op/s; -12839276.096op/s] or [-8.047%; -7.753%]

scenario:credit_card/is_card_number_no_luhn/x371413321323331

🟥 execution_time [+519.338ns; +539.468ns] or [+8.598%; +8.931%]
🟥 throughput [-13555466.566op/s; -13090577.214op/s] or [-8.188%; -7.907%]

Candidate

Candidate benchmark details

Group 1

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`d4bdb5c`	1741981904	dsn/r_and_d_week_mar_2024

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
credit_card/is_card_number/	execution_time	3.895µs	3.913µs ± 0.004µs	3.913µs ± 0.002µs	3.915µs	3.918µs	3.920µs	3.940µs	0.68%	1.777	18.776	0.09%	0.000µs	1	200
credit_card/is_card_number/	throughput	253826337.700op/s	255545379.318op/s ± 232397.004op/s	255550078.307op/s ± 110941.104op/s	255659498.359op/s	255871500.132op/s	255958115.208op/s	256739873.691op/s	0.47%	-1.729	18.539	0.09%	16432.950op/s	1	200
credit_card/is_card_number/ 3782-8224-6310-005	execution_time	81.948µs	82.430µs ± 0.391µs	82.360µs ± 0.099µs	82.546µs	82.803µs	82.931µs	87.171µs	5.84%	9.040	106.705	0.47%	0.028µs	1	200
credit_card/is_card_number/ 3782-8224-6310-005	throughput	11471641.397op/s	12131723.138op/s ± 55198.435op/s	12141835.750op/s ± 14551.389op/s	12151328.461op/s	12175813.989op/s	12192876.231op/s	12202934.806op/s	0.50%	-8.661	100.553	0.45%	3903.119op/s	1	200
credit_card/is_card_number/ 378282246310005	execution_time	76.467µs	77.099µs ± 0.469µs	77.057µs ± 0.198µs	77.262µs	77.554µs	78.022µs	82.367µs	6.89%	7.186	77.864	0.61%	0.033µs	1	200
credit_card/is_card_number/ 378282246310005	throughput	12140784.537op/s	12970730.856op/s ± 75703.669op/s	12977480.841op/s ± 33348.723op/s	13005991.212op/s	13047037.769op/s	13070099.480op/s	13077468.307op/s	0.77%	-6.687	70.468	0.58%	5353.058op/s	1	200
credit_card/is_card_number/37828224631	execution_time	3.899µs	3.914µs ± 0.005µs	3.914µs ± 0.002µs	3.916µs	3.919µs	3.924µs	3.961µs	1.20%	4.519	42.978	0.12%	0.000µs	1	200
credit_card/is_card_number/37828224631	throughput	252474791.291op/s	255481203.208op/s ± 311892.436op/s	255510047.688op/s ± 143239.781op/s	255628419.490op/s	255840096.981op/s	255935670.135op/s	256487755.419op/s	0.38%	-4.429	41.908	0.12%	22054.126op/s	1	200
credit_card/is_card_number/378282246310005	execution_time	83.817µs	85.134µs ± 0.741µs	85.093µs ± 0.518µs	85.621µs	86.304µs	86.954µs	88.288µs	3.75%	0.573	0.825	0.87%	0.052µs	1	200
credit_card/is_card_number/378282246310005	throughput	11326580.644op/s	11747117.847op/s ± 101772.140op/s	11751874.898op/s ± 71566.551op/s	11819614.708op/s	11906019.474op/s	11922947.354op/s	11930732.191op/s	1.52%	-0.510	0.629	0.86%	7196.377op/s	1	200
credit_card/is_card_number/37828224631000521389798	execution_time	51.856µs	52.156µs ± 0.094µs	52.150µs ± 0.060µs	52.215µs	52.308µs	52.368µs	52.459µs	0.59%	0.098	0.363	0.18%	0.007µs	1	200
credit_card/is_card_number/37828224631000521389798	throughput	19062483.246op/s	19173333.977op/s ± 34478.437op/s	19175447.757op/s ± 22191.200op/s	19194083.904op/s	19232004.948op/s	19245356.367op/s	19284249.027op/s	0.57%	-0.085	0.361	0.18%	2437.994op/s	1	200
credit_card/is_card_number/x371413321323331	execution_time	6.426µs	6.557µs ± 0.076µs	6.552µs ± 0.048µs	6.604µs	6.682µs	6.734µs	6.789µs	3.62%	0.479	-0.200	1.15%	0.005µs	1	200
credit_card/is_card_number/x371413321323331	throughput	147286422.543op/s	152523202.832op/s ± 1752024.301op/s	152625275.670op/s ± 1114432.131op/s	153615741.939op/s	155287178.362op/s	155558947.598op/s	155620342.617op/s	1.96%	-0.426	-0.280	1.15%	123886.826op/s	1	200
credit_card/is_card_number_no_luhn/	execution_time	3.897µs	3.915µs ± 0.003µs	3.915µs ± 0.001µs	3.917µs	3.919µs	3.920µs	3.921µs	0.14%	-1.667	6.753	0.07%	0.000µs	1	200
credit_card/is_card_number_no_luhn/	throughput	255063840.832op/s	255441567.012op/s ± 188387.919op/s	255414397.873op/s ± 96554.385op/s	255517253.803op/s	255789136.366op/s	255916194.770op/s	256613677.700op/s	0.47%	1.681	6.847	0.07%	13321.038op/s	1	200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	execution_time	63.443µs	63.878µs ± 0.250µs	63.850µs ± 0.058µs	63.906µs	64.219µs	64.336µs	66.185µs	3.66%	5.816	46.925	0.39%	0.018µs	1	200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	throughput	15109243.681op/s	15655194.583op/s ± 59981.579op/s	15661744.623op/s ± 14164.303op/s	15676852.735op/s	15707842.799op/s	15753439.495op/s	15762273.591op/s	0.64%	-5.641	44.884	0.38%	4241.338op/s	1	200
credit_card/is_card_number_no_luhn/ 378282246310005	execution_time	57.419µs	57.575µs ± 0.128µs	57.537µs ± 0.045µs	57.607µs	57.883µs	58.101µs	58.132µs	1.03%	2.394	6.361	0.22%	0.009µs	1	200
credit_card/is_card_number_no_luhn/ 378282246310005	throughput	17202084.853op/s	17368799.795op/s ± 38490.987op/s	17380051.178op/s ± 13674.038op/s	17390073.311op/s	17405704.472op/s	17414333.634op/s	17415958.243op/s	0.21%	-2.376	6.273	0.22%	2721.724op/s	1	200
credit_card/is_card_number_no_luhn/37828224631	execution_time	3.893µs	3.914µs ± 0.003µs	3.915µs ± 0.001µs	3.916µs	3.918µs	3.920µs	3.935µs	0.51%	-0.578	18.227	0.08%	0.000µs	1	200
credit_card/is_card_number_no_luhn/37828224631	throughput	254157438.746op/s	255470386.804op/s ± 204950.166op/s	255457070.440op/s ± 80979.177op/s	255538944.137op/s	255812227.778op/s	255906871.739op/s	256881405.446op/s	0.56%	0.626	18.260	0.08%	14492.165op/s	1	200
credit_card/is_card_number_no_luhn/378282246310005	execution_time	54.557µs	54.817µs ± 0.295µs	54.683µs ± 0.046µs	54.971µs	55.266µs	56.032µs	56.464µs	3.26%	2.603	8.981	0.54%	0.021µs	1	200
credit_card/is_card_number_no_luhn/378282246310005	throughput	17710488.674op/s	18243146.955op/s ± 96797.620op/s	18287280.282op/s ± 15428.165op/s	18299159.154op/s	18311234.951op/s	18321636.334op/s	18329385.671op/s	0.23%	-2.537	8.477	0.53%	6844.625op/s	1	200
credit_card/is_card_number_no_luhn/37828224631000521389798	execution_time	51.926µs	52.174µs ± 0.087µs	52.173µs ± 0.052µs	52.222µs	52.323µs	52.399µs	52.426µs	0.49%	0.093	0.544	0.17%	0.006µs	1	200
credit_card/is_card_number_no_luhn/37828224631000521389798	throughput	19074563.761op/s	19166513.877op/s ± 31949.268op/s	19167120.721op/s ± 19247.365op/s	19187007.041op/s	19219815.246op/s	19245017.065op/s	19258161.574op/s	0.47%	-0.080	0.544	0.17%	2259.154op/s	1	200
credit_card/is_card_number_no_luhn/x371413321323331	execution_time	6.437µs	6.570µs ± 0.072µs	6.558µs ± 0.047µs	6.625µs	6.706µs	6.735µs	6.760µs	3.08%	0.395	-0.501	1.10%	0.005µs	1	200
credit_card/is_card_number_no_luhn/x371413321323331	throughput	147925720.323op/s	152234933.931op/s ± 1673084.561op/s	152479220.706op/s ± 1107962.284op/s	153480410.027op/s	154447685.501op/s	155301059.590op/s	155359489.524op/s	1.89%	-0.351	-0.552	1.10%	118304.944op/s	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
credit_card/is_card_number/	execution_time	[3.913µs; 3.914µs] or [-0.013%; +0.013%]	None	None	None
credit_card/is_card_number/	throughput	[255513171.328op/s; 255577587.307op/s] or [-0.013%; +0.013%]	None	None	None
credit_card/is_card_number/ 3782-8224-6310-005	execution_time	[82.376µs; 82.484µs] or [-0.066%; +0.066%]	None	None	None
credit_card/is_card_number/ 3782-8224-6310-005	throughput	[12124073.165op/s; 12139373.110op/s] or [-0.063%; +0.063%]	None	None	None
credit_card/is_card_number/ 378282246310005	execution_time	[77.034µs; 77.164µs] or [-0.084%; +0.084%]	None	None	None
credit_card/is_card_number/ 378282246310005	throughput	[12960239.055op/s; 12981222.656op/s] or [-0.081%; +0.081%]	None	None	None
credit_card/is_card_number/37828224631	execution_time	[3.914µs; 3.915µs] or [-0.017%; +0.017%]	None	None	None
credit_card/is_card_number/37828224631	throughput	[255437977.916op/s; 255524428.500op/s] or [-0.017%; +0.017%]	None	None	None
credit_card/is_card_number/378282246310005	execution_time	[85.031µs; 85.236µs] or [-0.121%; +0.121%]	None	None	None
credit_card/is_card_number/378282246310005	throughput	[11733013.207op/s; 11761222.486op/s] or [-0.120%; +0.120%]	None	None	None
credit_card/is_card_number/37828224631000521389798	execution_time	[52.143µs; 52.169µs] or [-0.025%; +0.025%]	None	None	None
credit_card/is_card_number/37828224631000521389798	throughput	[19168555.597op/s; 19178112.357op/s] or [-0.025%; +0.025%]	None	None	None
credit_card/is_card_number/x371413321323331	execution_time	[6.547µs; 6.568µs] or [-0.160%; +0.160%]	None	None	None
credit_card/is_card_number/x371413321323331	throughput	[152280389.114op/s; 152766016.550op/s] or [-0.159%; +0.159%]	None	None	None
credit_card/is_card_number_no_luhn/	execution_time	[3.914µs; 3.915µs] or [-0.010%; +0.010%]	None	None	None
credit_card/is_card_number_no_luhn/	throughput	[255415458.258op/s; 255467675.765op/s] or [-0.010%; +0.010%]	None	None	None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	execution_time	[63.843µs; 63.912µs] or [-0.054%; +0.054%]	None	None	None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	throughput	[15646881.713op/s; 15663507.453op/s] or [-0.053%; +0.053%]	None	None	None
credit_card/is_card_number_no_luhn/ 378282246310005	execution_time	[57.557µs; 57.593µs] or [-0.031%; +0.031%]	None	None	None
credit_card/is_card_number_no_luhn/ 378282246310005	throughput	[17363465.315op/s; 17374134.276op/s] or [-0.031%; +0.031%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631	execution_time	[3.914µs; 3.915µs] or [-0.011%; +0.011%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631	throughput	[255441982.682op/s; 255498790.925op/s] or [-0.011%; +0.011%]	None	None	None
credit_card/is_card_number_no_luhn/378282246310005	execution_time	[54.776µs; 54.858µs] or [-0.075%; +0.075%]	None	None	None
credit_card/is_card_number_no_luhn/378282246310005	throughput	[18229731.736op/s; 18256562.174op/s] or [-0.074%; +0.074%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631000521389798	execution_time	[52.162µs; 52.187µs] or [-0.023%; +0.023%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631000521389798	throughput	[19162086.016op/s; 19170941.738op/s] or [-0.023%; +0.023%]	None	None	None
credit_card/is_card_number_no_luhn/x371413321323331	execution_time	[6.560µs; 6.580µs] or [-0.153%; +0.153%]	None	None	None
credit_card/is_card_number_no_luhn/x371413321323331	throughput	[152003060.502op/s; 152466807.360op/s] or [-0.152%; +0.152%]	None	None	None

Group 2

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`d4bdb5c`	1741981904	dsn/r_and_d_week_mar_2024

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
concentrator/add_spans_to_concentrator	execution_time	5.943ms	5.956ms ± 0.008ms	5.955ms ± 0.003ms	5.958ms	5.964ms	5.995ms	6.014ms	1.00%	4.404	27.316	0.13%	0.001ms	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
concentrator/add_spans_to_concentrator	execution_time	[5.955ms; 5.957ms] or [-0.018%; +0.018%]	None	None	None

Group 3

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`d4bdb5c`	1741981904	dsn/r_and_d_week_mar_2024

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
redis/obfuscate_redis_string	execution_time	33.254µs	33.938µs ± 1.061µs	33.485µs ± 0.089µs	33.580µs	36.204µs	36.241µs	37.107µs	10.81%	1.699	0.981	3.12%	0.075µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
redis/obfuscate_redis_string	execution_time	[33.791µs; 34.085µs] or [-0.433%; +0.433%]	None	None	None

Group 4

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`d4bdb5c`	1741981904	dsn/r_and_d_week_mar_2024

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	execution_time	504.844µs	505.879µs ± 0.878µs	505.807µs ± 0.245µs	506.061µs	506.512µs	506.840µs	516.979µs	2.21%	10.169	126.224	0.17%	0.062µs	1	200
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	throughput	1934314.339op/s	1976761.865op/s ± 3371.195op/s	1977040.302op/s ± 955.430op/s	1977966.823op/s	1979419.829op/s	1980054.524op/s	1980811.003op/s	0.19%	-10.040	124.065	0.17%	238.380op/s	1	200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	execution_time	452.498µs	453.315µs ± 0.307µs	453.283µs ± 0.198µs	453.529µs	453.826µs	453.921µs	454.093µs	0.18%	0.115	-0.372	0.07%	0.022µs	1	200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	throughput	2202191.335op/s	2205970.495op/s ± 1492.046op/s	2206125.473op/s ± 963.914op/s	2206934.968op/s	2208450.911op/s	2209051.435op/s	2209956.961op/s	0.17%	-0.112	-0.372	0.07%	105.504op/s	1	200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	execution_time	174.754µs	176.528µs ± 0.485µs	176.596µs ± 0.329µs	176.872µs	177.200µs	177.356µs	177.434µs	0.47%	-0.849	1.406	0.27%	0.034µs	1	200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	throughput	5635910.814op/s	5664882.238op/s ± 15602.079op/s	5662647.893op/s ± 10546.149op/s	5674732.436op/s	5689109.175op/s	5721954.849op/s	5722329.675op/s	1.05%	0.871	1.479	0.27%	1103.234op/s	1	200
normalization/normalize_service/normalize_service/[empty string]	execution_time	37.588µs	37.707µs ± 0.046µs	37.712µs ± 0.029µs	37.734µs	37.785µs	37.805µs	37.813µs	0.27%	-0.053	-0.430	0.12%	0.003µs	1	200
normalization/normalize_service/normalize_service/[empty string]	throughput	26446195.599op/s	26520293.647op/s ± 32341.412op/s	26516611.439op/s ± 20211.230op/s	26543573.492op/s	26573082.659op/s	26589591.645op/s	26604457.608op/s	0.33%	0.059	-0.430	0.12%	2286.883op/s	1	200
normalization/normalize_service/normalize_service/test_ASCII	execution_time	48.199µs	48.327µs ± 0.105µs	48.315µs ± 0.021µs	48.341µs	48.388µs	48.417µs	49.708µs	2.88%	11.381	147.164	0.22%	0.007µs	1	200
normalization/normalize_service/normalize_service/test_ASCII	throughput	20117624.090op/s	20692353.986op/s ± 43936.405op/s	20697547.329op/s ± 9012.250op/s	20704545.084op/s	20721478.085op/s	20738694.336op/s	20747117.860op/s	0.24%	-11.253	144.941	0.21%	3106.773op/s	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	execution_time	[505.758µs; 506.001µs] or [-0.024%; +0.024%]	None	None	None
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	throughput	[1976294.650op/s; 1977229.081op/s] or [-0.024%; +0.024%]	None	None	None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	execution_time	[453.273µs; 453.358µs] or [-0.009%; +0.009%]	None	None	None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	throughput	[2205763.712op/s; 2206177.278op/s] or [-0.009%; +0.009%]	None	None	None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	execution_time	[176.460µs; 176.595µs] or [-0.038%; +0.038%]	None	None	None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	throughput	[5662719.940op/s; 5667044.536op/s] or [-0.038%; +0.038%]	None	None	None
normalization/normalize_service/normalize_service/[empty string]	execution_time	[37.701µs; 37.713µs] or [-0.017%; +0.017%]	None	None	None
normalization/normalize_service/normalize_service/[empty string]	throughput	[26515811.438op/s; 26524775.856op/s] or [-0.017%; +0.017%]	None	None	None
normalization/normalize_service/normalize_service/test_ASCII	execution_time	[48.313µs; 48.342µs] or [-0.030%; +0.030%]	None	None	None
normalization/normalize_service/normalize_service/test_ASCII	throughput	[20686264.823op/s; 20698443.149op/s] or [-0.029%; +0.029%]	None	None	None

Group 5

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`d4bdb5c`	1741981904	dsn/r_and_d_week_mar_2024

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
benching deserializing traces from msgpack to their internal representation	execution_time	54.381ms	54.825ms ± 0.166ms	54.777ms ± 0.073ms	54.896ms	55.106ms	55.421ms	55.686ms	1.66%	1.971	6.283	0.30%	0.012ms	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
benching deserializing traces from msgpack to their internal representation	execution_time	[54.802ms; 54.848ms] or [-0.042%; +0.042%]	None	None	None

Group 6

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`d4bdb5c`	1741981904	dsn/r_and_d_week_mar_2024

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
benching string interning on wordpress profile	execution_time	147.096µs	148.091µs ± 0.432µs	148.051µs ± 0.146µs	148.214µs	148.546µs	149.796µs	151.520µs	2.34%	3.987	27.768	0.29%	0.031µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
benching string interning on wordpress profile	execution_time	[148.031µs; 148.151µs] or [-0.040%; +0.040%]	None	None	None

Group 7

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`d4bdb5c`	1741981904	dsn/r_and_d_week_mar_2024

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
normalization/normalize_trace/test_trace	execution_time	246.004ns	254.131ns ± 11.357ns	248.876ns ± 2.039ns	255.598ns	284.493ns	288.358ns	288.613ns	15.97%	1.912	2.438	4.46%	0.803ns	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
normalization/normalize_trace/test_trace	execution_time	[252.557ns; 255.705ns] or [-0.619%; +0.619%]	None	None	None

Group 8

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`d4bdb5c`	1741981904	dsn/r_and_d_week_mar_2024

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
sql/obfuscate_sql_string	execution_time	68.179µs	68.536µs ± 0.205µs	68.523µs ± 0.055µs	68.583µs	68.672µs	69.077µs	70.916µs	3.49%	8.166	90.828	0.30%	0.014µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
sql/obfuscate_sql_string	execution_time	[68.508µs; 68.565µs] or [-0.041%; +0.041%]	None	None	None

Group 9

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`d4bdb5c`	1741981904	dsn/r_and_d_week_mar_2024

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
ip_address/quantize_peer_ip_address_benchmark	execution_time	4.982µs	5.060µs ± 0.057µs	5.045µs ± 0.038µs	5.090µs	5.155µs	5.159µs	5.161µs	2.30%	0.589	-1.072	1.12%	0.004µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
ip_address/quantize_peer_ip_address_benchmark	execution_time	[5.053µs; 5.068µs] or [-0.155%; +0.155%]	None	None	None

Group 10

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`d4bdb5c`	1741981904	dsn/r_and_d_week_mar_2024

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
write only interface	execution_time	1.160µs	3.170µs ± 1.431µs	2.980µs ± 0.029µs	3.003µs	3.633µs	13.879µs	14.937µs	401.23%	7.395	55.652	45.02%	0.101µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
write only interface	execution_time	[2.972µs; 3.369µs] or [-6.255%; +6.255%]	None	None	None

Group 11

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`d4bdb5c`	1741981904	dsn/r_and_d_week_mar_2024

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
tags/replace_trace_tags	execution_time	2.424µs	2.454µs ± 0.026µs	2.448µs ± 0.012µs	2.461µs	2.525µs	2.530µs	2.532µs	3.42%	1.602	2.132	1.05%	0.002µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
tags/replace_trace_tags	execution_time	[2.450µs; 2.458µs] or [-0.146%; +0.146%]	None	None	None

Group 12

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`d4bdb5c`	1741981904	dsn/r_and_d_week_mar_2024

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	execution_time	208.583µs	209.073µs ± 0.165µs	209.088µs ± 0.095µs	209.161µs	209.325µs	209.427µs	209.673µs	0.28%	-0.238	0.920	0.08%	0.012µs	1	200
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	throughput	4769323.078op/s	4783027.271op/s ± 3774.971op/s	4782676.190op/s ± 2164.774op/s	4785083.141op/s	4790376.389op/s	4792503.604op/s	4794260.950op/s	0.24%	0.244	0.917	0.08%	266.931op/s	1	200
normalization/normalize_name/normalize_name/bad-name	execution_time	18.248µs	18.343µs ± 0.120µs	18.340µs ± 0.029µs	18.363µs	18.399µs	18.426µs	19.919µs	8.61%	11.449	148.489	0.65%	0.008µs	1	200
normalization/normalize_name/normalize_name/bad-name	throughput	50202116.251op/s	54519943.084op/s ± 331823.469op/s	54525951.833op/s ± 85892.258op/s	54631788.645op/s	54786128.882op/s	54795765.201op/s	54800324.197op/s	0.50%	-11.063	141.857	0.61%	23463.463op/s	1	200
normalization/normalize_name/normalize_name/good	execution_time	10.661µs	10.720µs ± 0.027µs	10.719µs ± 0.018µs	10.739µs	10.765µs	10.781µs	10.792µs	0.69%	0.123	-0.257	0.25%	0.002µs	1	200
normalization/normalize_name/normalize_name/good	throughput	92659051.043op/s	93287669.252op/s ± 234740.148op/s	93294226.307op/s ± 160917.697op/s	93445952.513op/s	93707879.893op/s	93752482.412op/s	93799086.686op/s	0.54%	-0.110	-0.264	0.25%	16598.635op/s	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	execution_time	[209.050µs; 209.096µs] or [-0.011%; +0.011%]	None	None	None
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	throughput	[4782504.096op/s; 4783550.446op/s] or [-0.011%; +0.011%]	None	None	None
normalization/normalize_name/normalize_name/bad-name	execution_time	[18.326µs; 18.359µs] or [-0.091%; +0.091%]	None	None	None
normalization/normalize_name/normalize_name/bad-name	throughput	[54473955.543op/s; 54565930.626op/s] or [-0.084%; +0.084%]	None	None	None
normalization/normalize_name/normalize_name/good	execution_time	[10.716µs; 10.723µs] or [-0.035%; +0.035%]	None	None	None
normalization/normalize_name/normalize_name/good	throughput	[93255136.525op/s; 93320201.979op/s] or [-0.035%; +0.035%]	None	None	None

Group 13

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`d4bdb5c`	1741981904	dsn/r_and_d_week_mar_2024

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
two way interface	execution_time	17.376µs	25.085µs ± 10.530µs	17.630µs ± 0.156µs	34.192µs	44.300µs	45.356µs	84.773µs	380.83%	1.613	4.500	41.87%	0.745µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
two way interface	execution_time	[23.626µs; 26.544µs] or [-5.818%; +5.818%]	None	None	None

Baseline

Omitted due to size.

codecov-commenter · 2025-03-11T14:02:17Z

Codecov Report

Attention: Patch coverage is 20.43222% with 405 lines in your changes missing coverage. Please review.

Project coverage is 72.29%. Comparing base (9a4a791) to head (d4bdb5c).
Report is 3 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #917      +/-   ##
==========================================
- Coverage   72.44%   72.29%   -0.16%     
==========================================
  Files         333      338       +5     
  Lines       50097    51418    +1321     
==========================================
+ Hits        36294    37172     +878     
- Misses      13803    14246     +443

Components	Coverage Δ
crashtracker	`42.85% <ø> (-0.03%)`	⬇️
crashtracker-ffi	`6.25% <ø> (ø)`
datadog-alloc	`98.73% <ø> (ø)`
data-pipeline	`91.81% <ø> (-0.28%)`	⬇️
data-pipeline-ffi	`90.28% <ø> (ø)`
ddcommon	`80.16% <58.27%> (+0.97%)`	⬆️
ddcommon-ffi	`65.10% <58.27%> (+4.05%)`	⬆️
ddtelemetry	`61.87% <ø> (+0.12%)`	⬆️
ddtelemetry-ffi	`22.46% <ø> (ø)`
dogstatsd	`89.70% <ø> (+0.10%)`	⬆️
dogstatsd-client	`82.57% <ø> (ø)`
ipc	`82.41% <ø> (+0.01%)`	⬆️
profiling	`77.82% <6.21%> (-4.12%)`	⬇️
profiling-ffi	`63.86% <2.48%> (-6.82%)`	⬇️
serverless	`0.00% <ø> (ø)`
sidecar	`40.97% <ø> (+0.33%)`	⬆️
sidecar-ffi	`1.17% <ø> (-2.06%)`	⬇️
spawn-worker	`54.37% <ø> (ø)`
tinybytes	`91.24% <ø> (+0.02%)`	⬆️
trace-mini-agent	`74.66% <ø> (ø)`
trace-normalization	`98.24% <ø> (+<0.01%)`	⬆️
trace-obfuscation	`96.00% <ø> (-0.07%)`	⬇️
trace-protobuf	`78.13% <ø> (ø)`
trace-utils	`93.11% <ø> (+0.13%)`	⬆️

taegyunkim · 2025-03-11T14:38:16Z

profiling-ffi/src/profiles/interning_api.rs

+};
+use function_name::named;
+
+/// This functions interns its argument into the profiler.


Suggested change

/// This functions interns its argument into the profiler.

/// This function interns its argument into the profiler.

There are a few more of these in this file.

taegyunkim

This reminded me of @ivoanjo's ManagedStringStorage.
Correct me if I've misunderstood.

This change uses strings: StringTable, where Ivo's PR uses string_storage: Option<...<...<ManagedStringStorage>>> from internal::Profile.
We create a new StringTable for each Profile, while string_storage persists across Profiles.

ivoanjo · 2025-03-13T14:27:09Z

profiling-ffi/src/profiles/interning_api.rs

+/// This functions interns its argument into the profiler.
+/// If successful, it an opaque interning ID.
+/// This ID is valid for use on this profiler, until the profiler is reset.
+/// It is an error to use this id after the profiler has been reset, or on a different profiler.
+/// On error, it holds an error message in the error variant.
+///
+/// # Safety
+/// The `profile` ptr must point to a valid Profile object created by this
+/// module.
+/// All other arguments must remain valid for the length of this call.
+/// This call is _NOT_ thread-safe.
+#[must_use]
+#[no_mangle]
+#[named]
+pub unsafe extern "C" fn ddog_prof_Profile_intern_sample(
+    profile: *mut Profile,
+    stacktrace: GenerationalId<StackTraceId>,
+    values: Slice<i64>,
+    labels: GenerationalId<LabelSetId>,
+    timestamp: Option<NonZeroI64>,
+) -> VoidResult {


"If successful, it an opaque interning ID." -> doesn't apply to this one, which returns VoidResult (also needs a bit of a text pass -- shows up in other places ;D )
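The validity rule in the doc comment above (an ID may be used only until the profiler is reset) can be sketched with a generation counter. This is an illustration under assumptions, not the PR's implementation: `GenId` and `ToyProfile` are hypothetical stand-ins, and the sketch covers only the reset case, not use on a different profiler.

```rust
/// Handle tagged with the generation it was created in
/// (hypothetical stand-in for the PR's `GenerationalId`).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct GenId {
    id: u32,
    generation: u64,
}

impl GenId {
    /// Returns the raw id only if the handle belongs to the `current` generation.
    pub fn get(self, current: u64) -> Result<u32, String> {
        if self.generation == current {
            Ok(self.id)
        } else {
            Err(format!(
                "stale handle: generation {} != {}",
                self.generation, current
            ))
        }
    }
}

/// Toy profile that bumps its generation on every reset.
#[derive(Default)]
pub struct ToyProfile {
    generation: u64,
    strings: Vec<String>,
}

impl ToyProfile {
    /// Interns `s` and tags the handle with the current generation.
    pub fn intern_string(&mut self, s: &str) -> GenId {
        let pos = match self.strings.iter().position(|x| x.as_str() == s) {
            Some(pos) => pos,
            None => {
                self.strings.push(s.to_owned());
                self.strings.len() - 1
            }
        };
        GenId { id: pos as u32, generation: self.generation }
    }

    /// Resolves a handle, rejecting handles created before the last reset.
    pub fn resolve(&self, handle: GenId) -> Result<&str, String> {
        let id = handle.get(self.generation)?;
        Ok(&self.strings[id as usize])
    }

    /// Drops all interned data and invalidates every outstanding handle.
    pub fn reset(&mut self) {
        self.strings.clear();
        self.generation += 1;
    }
}
```

With this shape, a handle kept across a `reset` fails with an error instead of silently pointing at unrelated data.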

morrisonlevi

I have to jump to other things, but figured I'd leave my partial review.

ddcommon-ffi/src/slice_mut.rs

examples/ffi/profile_intern.cpp

ivoanjo · 2025-03-13T16:42:54Z

profiling/src/internal/profile/interning_api/mod.rs

+    pub fn intern_label_num(
+        &mut self,
+        key: GenerationalId<StringId>,
+        val: i64,
+        unit: Option<GenerationalId<StringId>>,
+    ) -> anyhow::Result<GenerationalId<LabelId>> {
+        let key = key.get(self.generation)?;
+        let unit = unit.map(|u| u.get(self.generation)).transpose()?;
+        let id = self.labels.dedup(Label::num(key, val, unit));
+        Ok(GenerationalId::new(id, self.generation))
+    }
+
+    pub fn intern_label_str(
+        &mut self,
+        key: GenerationalId<StringId>,
+        val: GenerationalId<StringId>,
+    ) -> anyhow::Result<GenerationalId<LabelId>> {
+        let key = key.get(self.generation)?;
+        let val = val.get(self.generation)?;
+        let id = self.labels.dedup(Label::str(key, val));
+        Ok(GenerationalId::new(id, self.generation))
+    }
+
+    pub fn intern_labelset(
+        &mut self,
+        labels: &[GenerationalId<LabelId>],
+    ) -> anyhow::Result<GenerationalId<LabelSetId>> {
+        let labels = labels
+            .iter()
+            .map(|l| l.get(self.generation))
+            .collect::<anyhow::Result<Vec<_>>>()?;
+        let labels = LabelSet::new(labels);
+        let id = self.label_sets.dedup(labels);
+        Ok(GenerationalId::new(id, self.generation))
+    }


Using these functions skips the Profile::validate_sample_labels check that add would run before accepting the labels. I noticed this because I had a test on the Ruby side that checks this failure is handled correctly, and the test did not see the error it expected.

My feeling is that these checks should perhaps be a debug_assert; running them on every insert in prod is wasted CPU cycles.
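The trade-off discussed here can be sketched as follows. The check itself is an assumption made for illustration (rejecting repeated label keys); what `Profile::validate_sample_labels` actually verifies is not shown in this thread, and `Label`, `validate_labels`, and this `intern_labelset` are hypothetical simplifications.

```rust
use std::collections::HashSet;

/// Simplified label: a key id and a value id (hypothetical layout).
#[derive(Clone, Copy, Debug)]
pub struct Label {
    pub key: u32,
    pub value: u32,
}

/// Hypothetical validation: rejects label sets that repeat a key.
pub fn validate_labels(labels: &[Label]) -> Result<(), String> {
    let mut seen = HashSet::new();
    for label in labels {
        if !seen.insert(label.key) {
            return Err(format!("duplicate label key {}", label.key));
        }
    }
    Ok(())
}

/// Stores a label set and returns its index, validating only in debug builds.
pub fn intern_labelset(store: &mut Vec<Vec<Label>>, labels: &[Label]) -> usize {
    // `debug_assert!` is only evaluated when debug assertions are enabled,
    // so a typical release build skips the validation cost entirely.
    debug_assert!(validate_labels(labels).is_ok(), "invalid label set");
    store.push(labels.to_vec());
    store.len() - 1
}
```

The downside of this design is the one reported above: a caller that relies on receiving an error for invalid labels in production would no longer get one.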

danielsn added 3 commits March 10, 2025 12:56

[Profiler] Implement interning API

670bda7

FFI API

33d4266

c++ example

17dba1f

danielsn requested review from a team as code owners March 11, 2025 13:47

github-actions bot added the profiling Relates to the profiling* modules. label Mar 11, 2025

taegyunkim reviewed Mar 11, 2025

danielsn added 8 commits March 12, 2025 14:42

bulk string interning

666f5d9

add generations are equal api

d765dbf

forgot to add file

47c5679

Intern managed strings

c105ef0

Add sample_start machinery, haven't wired it to the exporter yet

fa37449

fix annoying 1.78 issue, and do more with the state machine

68fd98b

Merge branch 'main' into dsn/r_and_d_week_mar_2024

3af720d

fix to use the new slice based input

6d3c4cd

github-actions bot added the common label Mar 12, 2025

ivoanjo reviewed Mar 13, 2025

morrisonlevi reviewed Mar 13, 2025

ivoanjo reviewed Mar 13, 2025

ivoanjo reviewed Mar 13, 2025

ivoanjo mentioned this pull request Mar 13, 2025

[NO-TICKET] Experiment to replace Profile_add libdatadog interning API DataDog/dd-trace-rb#4492

Draft

Interned empty string, as Ivo requested

33380b9

danielsn force-pushed the dsn/r_and_d_week_mar_2024 branch from 8145a59 to 33380b9 Compare March 13, 2025 17:36

danielsn added 3 commits March 13, 2025 13:38

Merge branch 'main' into dsn/r_and_d_week_mar_2024

2a2be98

PR comment: rename long identifier

176b144

PR comments, use nonnull instead of pointer

cb79525

danielsn added 2 commits March 13, 2025 14:36

YAGNI

717cee1

Interned string constant

d4bdb5c
