GH-45788: [C++][Acero] Fix data race in aggregate node #45789
base: main
Hi @pitrou, would you like to take a look? Thanks.
@github-actions crossbow submit emscripten
Revision: 9d8ddbb Submitted crossbow builds: ursacomputing/crossbow @ actions-73916ce3ac
Can we also make `Segmenter::GetSegments` a const function? (Or I can do it in a separate patch.)
I guess not. Both arrow/cpp/src/arrow/compute/row/grouper.cc Line 163 in 7e18764
arrow/cpp/src/arrow/compute/row/grouper.cc Line 273 in 7e18764
@@ -312,7 +312,7 @@ Result<ExecBatch> GroupByNode::Finalize() {
                                  segment_key_field_ids_.size());
 
   // Segment keys come first
-  PlaceFields(out_data, 0, segmenter_values_);
+  PlaceFields(out_data, 0, state->segmenter_values);
So, the `Finalize` step only considers the segmenter values for `state[0]`? I'm not sure I understand why.
-    RETURN_NOT_OK(
-        ExtractSegmenterValues(&segmenter_values_, exec_batch, segment_field_ids_));
+    RETURN_NOT_OK(ExtractSegmenterValues(&GetLocalState()->segmenter_values, exec_batch,
+                                         segment_field_ids_));
`OutputResult` seems non-thread-safe; how can `InputReceived` be called from several threads at once? Am I missing something?
Rationale for this change
Data race described in #45788 .
What changes are included in this PR?
Put the racing member `segmenter_values` in thread-local state.
Are these changes tested?
Yes. A unit test is added.
Are there any user-facing changes?
None.
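For readers skimming the thread, the idea behind the change can be sketched as follows. This is a minimal, self-contained illustration and not the actual Acero code: `ToyGroupByNode`, `ThreadLocalState`, `RunToyExample`, and the explicit `thread_index` parameter are hypothetical stand-ins, and the segment key is simplified to a string. It assumes every thread observes the same segment key, which is why reading slot 0 at finalization is representative here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the node's per-thread state.
struct ThreadLocalState {
  std::vector<std::string> segmenter_values;
};

// Toy node: one state slot per thread instead of one shared member.
class ToyGroupByNode {
 public:
  explicit ToyGroupByNode(std::size_t num_threads) : local_states_(num_threads) {}

  // Each thread writes only to its own slot, so concurrent calls with
  // distinct thread indices do not touch the same object.
  void InputReceived(std::size_t thread_index, const std::string& segment_key) {
    ThreadLocalState* state = &local_states_[thread_index];
    state->segmenter_values.assign(1, segment_key);
  }

  // Called after all workers have finished; reads slot 0 (assumption: all
  // slots hold the same segment key in this toy setup).
  const std::vector<std::string>& Finalize() const {
    return local_states_[0].segmenter_values;
  }

 private:
  std::vector<ThreadLocalState> local_states_;
};

// Drives the toy node from several threads and returns the finalized values.
std::vector<std::string> RunToyExample(std::size_t num_threads) {
  ToyGroupByNode node(num_threads);
  std::vector<std::thread> workers;
  for (std::size_t i = 0; i < num_threads; ++i) {
    workers.emplace_back([&node, i] { node.InputReceived(i, "segment-A"); });
  }
  for (auto& worker : workers) worker.join();
  return node.Finalize();
}
```

With a single shared member, the concurrent `InputReceived` calls above would all write to the same vector, which is the kind of race the PR description refers to.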