Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(otlp): Infer span description for
db
spans #4541base: master
Are you sure you want to change the base?
feat(otlp): Infer span description for
db
spans #4541Changes from 6 commits
2716ef3
cf723e1
6d163f4
c47a7db
27ee011
68bca0d
6a3ad6d
9257e6a
e733731
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the SQL scrubber need to run over this field, also if the field gets copied into the description, is it being scrubbed?
We should have an integration test for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The span
description
that we receive isn't scrubbed or otherwise modified. So writing the unmodifieddb.query.text
value intodescription
is expected.(The exception is PII scrubbing, although
description
isn't a default field so that relies on user configuration. That happens later in span processing so we should be covered there.)A scrubbed copy of
description
is made inextract_tags
and stored insentry_tags
'description
field. This scrubbed version is used by Sentry's Insights modules. That scrubbed description extraction does not run for the inferred descriptions in this PR, as the scrubbing happens inextract_tags
.That is fine, though: supporting Insights is a later milestone of the span first/OTLP project. It isn't expected that Insights works with OTLP. If it was simple enough to generate the scrubbed descriptions I'd consider it, but the scrubbing also currently depends on
op
- that has all got to be replaced too before it's usable with OTLP. We'll be changing how that works when we work on that milestone.I agree about an integration test though, we'll add one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this not use the original place where this tag is extracted from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This tag is typically inferred from either the span op or a combination of its attributes, rather than being passed in directly. This inference happens in
tag_extraction::extract_tags
, with the result being written into sentry_tags and retrieved here.This code to set the span
description
could be moved intoextract_tags
, but it would mean making itsspan
parameter a mutable reference and mutating span in there (setting itsdescription
), which feels a bit out of scope of what I'd expectextract_tags
to do.An alternative might be to make
category
a field onSpan
, set it earlier in the process, and use that everywhere we want to query category instead of digging into sentry_tags. (It would then be the responsibility of Snuba to write it into the attributes dictionary I suppose).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems weird to refer back to a tag to make the decision, but if that is currently the best way to do it, I am fine with it, especially since we have strongly typed tags now.
Jan and I discussed some changes how we can improve the tag extraction and span enrichment we do. In regards to needing something similar/mirrored in Sentry for a while and starting to align for span streaming. So we hopefully can improve this case with that change as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have a
promote_span_data_fields
function which seems to be a perfect fit for this logic.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
promote_span_data_fields
seems to be moving properties out of data and into top level span attributes:https://github.com/getsentry/relay/blob/master/relay-server/src/services/processor/span/processing.rs#L689-L712
Does it make sense to move this kind of logic, which is conditional on other span fields and also doesn't move data, into the same spot?
The other challenge is that we need to know the category, which isn't calculated yet at the time
promote_span_data_fields
is run. (See other comments).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so.
Would be great if you can add this as a comment (if you end up staying with what we currently have):
It should be obvious (because it references the tags), but a lot of this extraction logic depends on implicit orders, so maybe doesn't hurt to call it out and explain why it runs late.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I commented the block where this happens in normalization, as well as in the function doc. I couldn't find a way to make the dependency more explicit in code under the current structure (passing category into
infer_span_description
, extracting the category inference, etc).You mentioned discussing updates to tag extraction + enrichment with Jan - my understanding is that sentry_tags as a concept is likely to disappear as well, as EAP already collapses them into standard attributes. Once Relay does the same, I think this whole method should be able to be ordered more naturally so the dependencies are clear. 🤞