-
Notifications
You must be signed in to change notification settings - Fork 300
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disable custom serialization #3789
Disable custom serialization #3789
Conversation
c4cc534
to
2577292
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add unit tests for the same?
2577292
to
e8918ab
Compare
Updated UTs for SecurityInterceptor and SecuritySSLRequestHandler |
src/test/java/org/opensearch/security/transport/SecuritySSLRequestHandlerTests.java
Outdated
Show resolved
Hide resolved
src/test/java/org/opensearch/security/transport/SecurityInterceptorTests.java
Outdated
Show resolved
Hide resolved
src/test/java/org/opensearch/security/transport/SecurityInterceptorTests.java
Outdated
Show resolved
Hide resolved
src/test/java/org/opensearch/security/transport/SecurityInterceptorTests.java
Outdated
Show resolved
Hide resolved
src/test/java/org/opensearch/security/transport/SecurityInterceptorTests.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/security/OpenSearchSecurityPlugin.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/security/support/SecuritySettings.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/security/transport/SecurityInterceptor.java
Outdated
Show resolved
Hide resolved
e8918ab
to
b56f621
Compare
Updated the PR. We will be keeping custom serialization disabled for the |
2.11.1
patch release
88ef6d0
to
f8d3709
Compare
src/main/java/org/opensearch/security/support/Base64Helper.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/security/transport/SecurityInterceptor.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/security/transport/SecurityInterceptor.java
Outdated
Show resolved
Hide resolved
f8d3709
to
9a25624
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comments, thanks for this contribution
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like test cases are failing due to serialization issues, please look into the failures on the integrationTest test-citests, and test dlicDlsflsTest
2> java.lang.AssertionError:
Expected: Successful multi get response
but: "Get an item from index: ""first-test-index" failed: "OpenSearch exception [type=exception, reason=java.lang.IllegalArgumentException: -84 is not a valid id]"
at __randomizedtesting.SeedInfo.seed([52523052604917F5:E52F053CD1B7F4D]:0)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
at org.opensearch.security.FlsAndFieldMaskingTests.multiGetDocuments(FlsAndFieldMaskingTests.java:722)
2> java.lang.AssertionError:
Expected: Successful multi search response
but: "Get an item failed: ""OpenSearch exception [type=exception, reason=java.lang.IllegalArgumentException: -84 is not a valid id]"
at __randomizedtesting.SeedInfo.seed([52523052604917F5:ECF7F2CDE87C8BDC]:0)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
at org.opensearch.security.FlsAndFieldMaskingTests.multiSearchDocuments(FlsAndFieldMaskingTests.java:769)
9a25624
to
1d9bcee
Compare
A change was needed at SecurityFilter:187, from -
to -
as we now are defaulting to JDK serialization instead of custom. Have amended the commit to include this change. Will monitor the CI checks. |
Integration tests and unit tests have passed now. FYI backward compatibility tests are expected to fail as they're running against v2.12.0 vs v3.0.0 where-in both OS versions use custom serialization, but our change assumes that any version > v2.11.0 will be using JDK serialization for now. Not sure how can we run the suite with previous version 2.11.1. The |
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #3789 +/- ##
==========================================
- Coverage 65.26% 65.25% -0.01%
==========================================
Files 297 298 +1
Lines 21132 21165 +33
Branches 3452 3455 +3
==========================================
+ Hits 13791 13811 +20
- Misses 5644 5656 +12
- Partials 1697 1698 +1
|
@parasjain1 One way we can fix the checks is by first merging this change into 2.x, running the CI again and then merging this into main. |
Got it. Should I modify this PR to set the destination branch to |
Can you keep this PR open and open another manual backport against 2.x? |
1d9bcee
to
d917957
Compare
@parasjain1 Please look into the other failures all the BWC are failing - and it looks like there are spotless code style changes needed |
d917957
to
482380f
Compare
482380f
to
af2b517
Compare
Signed-off-by: Paras Jain <[email protected]>
af2b517
to
b7b8aa4
Compare
As mentioned above, BWCs are expected to fail. @cwperks suggested to open a Backport to 2.x PR which can be merged first. Working on that now. |
Have fixed the spotless scan failures. One of the plugin install tests are failing due to the following error which looks unrelated to the code change.
|
All other CI tests except plugin install (mentioned in the above comment) and BWCs are passing. |
Tried running BWCs on the backport PR #3826 by setting the previous branch to I'm highlighting below the steps that we need to carry out in the mentioned order to successfully test for backward compatibility -
|
@parasjain1 This change has be lingering and there is still complexity around targeted validation. I'm starting to thing that we should fix the root cause rather than disable the feature. What is preventing us from fixing the root issue? |
To reflect on the root cause, JDK de-dupes object references using back references for the objects that it has already written on the stream, while OpenSearch's StreamInput/StreamOutput implementations do not offer such a support. The ideal fix should come as an optimization in how OpenSearch's Requesting @Bukhtawar / @backslasht to comment with their opinions regarding this. |
I agree with @parasjain1, given that it is negatively impacting the performance to an extent that the cluster is unusable we should disable it for the time being till the fix is available. I would have preferred to go with the fix if it is straight forward, but considering the complexities involved, it would take few weeks to a month to vet out a clean workable solution and we shouldn't be having this regression for so long. |
Update 1/12/2024: For the time being, a majority of maintainers have voted in favor of keeping the feature as is (not disabling it). The fact that the feature does not work does not represent a good reason to remove it from the code. We have limited anecdotes about this causing issues and so we do not want to set a precedent of removing features which have bugs. We of course hope that this can be fixed but in the meantime will continue to ship with it as is. I am going to close this pull request given this decision but feel free to re-open it/keep discussing on it. This decision was reached by the maintainers as a group but does not mean it is set in stone. |
Just wanted to throw in couple thoughts. It sounds like the rabbit is now out of the hat - but is it still possible to introduce a feature flag and control it? That typically gives us best mechanism to bake & experiment and roll out post confidence. Could we do something like introduce feature flag in addition to above logic? That should atleast provide a workaround for users who run into this until we have the appropriate long term fix? |
Sorry for chiming in late on this.
I agree with this approach adding more checks than version. I think it will also help to control the rollout of this and drive faster development to actually gain more confidence, while helping the customers/operators who may get impacted due to this. The current implementation of feature flags is generally confined to node level, we can modify the logic to rely on the node attributes which are available amongst all the node and can help any node sending a request to take decision whether it should use the custom serialization or jdk one. The value of the node attribute can be populated based on the feature flag. This should help us control the roll-out across versions if needed as well. @krishna-ggk @peternied @scrawfor99 Let me know your thoughts if this is something we can pursue for solving the problem at hand. |
@mgodwan @krishna-ggk Thank you for the thoughts and participation in this discussion. The intention behind the serialization changes were to improve performance for every node-to-node request - a huge success. We know every customer’s needs are important, but we have to think about what’s best for everyone. Right now, turning off this feature would slow things down for a many customers, and we don’t have a good reason to do that yet. We’re committed to providing the best balance for all our customers and have carefully considered this decision. At this time, we will not be pursuing further changes or discussions on feature flag for serialization. Please keep me informed if there are any pull requests to address the root causes and I'll be glad to give them my attention. |
Description
Disables custom serialization for the
2.11.1
patch release.Needs to be backported to
2.x
.Below is the approach.
2.11.1
nodes to -2.11
2.11.0
2.11.1
2.11.1
nodes from -2.11
2.11.0
2.11.1
Related Issues
Testing
BWCs should cover most of the scenarios, while still carry out a manual cluster upgrade test.