Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Snow 1853185 jdbc driver v3.16 native okta http retry storm #2064

Conversation

sfc-gh-fpawlowski
Copy link
Contributor

@sfc-gh-fpawlowski sfc-gh-fpawlowski commented Feb 5, 2025

Overview

SNOW-1853185

Pre-review self checklist

  • PR branch is updated with all the changes from master branch
  • The code is correctly formatted (run mvn -P check-style validate)
  • New public API is not unnecessary exposed (run mvn verify and inspect target/japicmp/japicmp.html)
  • The pull request name is prefixed with SNOW-XXXX:
  • Code is in compliance with internal logging requirements

External contributors - please answer these questions before submitting a pull request. Thanks!

  1. What GitHub issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Issue: #NNNN

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
    • I am adding new logging messages
    • I am modifying authorization mechanisms
    • I am adding new credentials
    • I am modifying OCSP code
    • I am adding a new dependency or upgrading an existing one
    • I am adding new public/protected component not marked with @SnowflakeJdbcInternalApi (note that public/protected methods/fields in classes marked with this annotation are already internal)
  3. Please describe how your code solves the related issue.

The issue with the previous code was that each time we did not succeed in authenticating using OKTA we threw an exception (Authentication timeout). It was caught in the outer loop and all steps were redone.
There were two major issues - described in the Jira task. First of all we did not wait before retrying those steps - because the exception was thrown before sleeping for the backoff time. Second problem was retrying all steps, while usually some of them were unnecessary (especially the first step of the federated auth flow).

To make code more modular the chosen solution included creating 2 classes.

  1. RetryContextManager - controlling the custom logic injected into the retry loop of RestRequest class. Instead of throwing the exception we can now pass custom logic that will be able to update the request to be sent with refreshed data.
  2. RetryContext - containing information about currently ongoing request-retrying process. It is important for example to make the custom logic aware of the time passed since the first request was sent (regarding the overall timeout for the operations).

@sfc-gh-fpawlowski sfc-gh-fpawlowski self-assigned this Feb 5, 2025
@sfc-gh-fpawlowski sfc-gh-fpawlowski marked this pull request as ready for review February 10, 2025 01:07
Copy link
Contributor

@sfc-gh-mkubik sfc-gh-mkubik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the failing StatementLatestIT.testSetQueryTimeoutForAsyncQueryUsingConnectionProperty test somehow related to the changes made in this PR or it's just some flaky test that failed randomly?

@sfc-gh-fpawlowski
Copy link
Contributor Author

Is the failing StatementLatestIT.testSetQueryTimeoutForAsyncQueryUsingConnectionProperty test somehow related to the changes made in this PR or it's just some flaky test that failed randomly?

Seems like there were some temporary issues - I reran the failed jobs and they passed.
Failing Jenkins afaik is caused by the GS change - being fixed here: #2109

@sfc-gh-fpawlowski sfc-gh-fpawlowski merged commit c71218f into master Mar 11, 2025
137 of 140 checks passed
@sfc-gh-fpawlowski sfc-gh-fpawlowski deleted the SNOW-1853185-JDBC-Driver-v3.16-Native-Okta-HTTP-Retry-Storm branch March 11, 2025 22:38
@github-actions github-actions bot locked and limited conversation to collaborators Mar 11, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants