feat(source-stripe): migrate to low-code cdk #53687

Open
lazebnyi wants to merge 58 commits into master

lazebnyi (Collaborator) commented Feb 14, 2025

What

Migrate the source-stripe connector from a custom Python implementation to a manifest-based implementation built on the low-code CDK and the latest CDK features.

Closes airbytehq/airbyte-internal-issues#11544
Closes airbytehq/airbyte-internal-issues#11683
Closes airbytehq/airbyte-internal-issues#11752

How

The connector supports 7 groups of streams, all migrated to a low-code architecture:

  1. Base Incremental Streams

    • Standard incremental streams that track state and support pagination.
  2. State Condition Streams

    • These streams adapt their behavior based on whether a state is set:
    • No State: Runs the base stream as a full sync.
    • With State: Uses an event stream to fetch only relevant events (event types provided via parameters).
  3. Strict State Condition Streams

    • Similar to state condition streams but with stricter handling:
    • No State: Runs the base stream. The endpoints strictly enforce allowed query parameters and return an error for unexpected ones.
    • With State: Uses an event stream to fetch only relevant events (event types provided via parameters).
  4. Base Incremental Partition Streams

    • Incremental streams that support partitioning, designed to handle large datasets more efficiently.
    • These streams rely on state condition streams as parent streams.
  5. Partitioned State Condition Streams

    • Streams with state-aware partitioning behavior:
    • No State: Runs a partition stream associated with the parent state condition stream.
    • With State: Uses the event stream to fetch partitioned event data (event types provided via parameters).
  6. Lazy Read State Condition Streams

    • Streams that support lazy loading to optimize performance:
    • No State: Fetches the first N records from the parent, then switches to normal pagination if additional data is needed.
    • With State: Uses the event stream to fetch only relevant events (event types provided via parameters).
  7. Base Incremental Partition Streams with Lazy Read Logic

    • Combines partitioning and lazy reading for improved performance in large datasets.
    • These streams use state condition streams as parent streams and apply lazy read logic.
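
The state-dependent groups above (2, 3, 5, and 6) share one dispatch rule, which can be sketched in plain Python. This is only an illustration under stated assumptions, not the CDK's StateDelegatingStream: the class `StateConditionStream`, its fields, and the sample records are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, List, Mapping, Optional

Record = Dict[str, Any]
State = Mapping[str, Any]


@dataclass
class StateConditionStream:
    """Hypothetical stand-in for a state condition stream (not a CDK class)."""

    name: str
    read_full: Callable[[], Iterable[Record]]  # base endpoint, used when no state is set
    read_events: Callable[[State, List[str]], Iterable[Record]]  # events endpoint
    event_types: List[str]  # event types provided via parameters

    def read(self, state: Optional[State]) -> Iterable[Record]:
        if not state:
            # No state: run the base stream as a full sync.
            return self.read_full()
        # With state: fetch only the relevant events.
        return self.read_events(state, self.event_types)


# Usage with fake endpoints (sample data is made up for the sketch).
def _full() -> List[Record]:
    return [{"id": "cus_1"}, {"id": "cus_2"}]


def _events(state: State, event_types: List[str]) -> List[Record]:
    return [{"id": "cus_2", "type": event_types[0], "since": state["updated"]}]


customers = StateConditionStream("customers", _full, _events, ["customer.updated"])
first_sync = list(customers.read(None))
next_sync = list(customers.read({"updated": "1700000000"}))
```

The sketch treats an empty state the same as a missing one; whether the real connector does so is an assumption.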

Implementation details:

  • Streams with state-driven behavior are implemented with StateDelegatingStream, which runs the base stream when no state is set and the event stream once a state exists.
  • Streams that use lazy read logic are built on LazySimpleRetriever, which fetches data efficiently when payloads are large.
  • Parent cursor value injection is applied to streams whose responses lack a cursor of their own: the parent’s cursor value is injected to maintain incremental sync integrity.
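
A rough sketch of how lazy reading and parent cursor injection might fit together, written in plain Python rather than with LazySimpleRetriever. Everything here is an assumption made for illustration: the function `lazy_read_children`, the `fetch_more` callback, the field names `items`, `data`, `has_more`, and `updated`, and the sample records.

```python
from typing import Any, Callable, Dict, Iterator, List

Record = Dict[str, Any]


def lazy_read_children(
    parent: Record,
    fetch_more: Callable[[str], List[Record]],
    children_field: str = "items",
    cursor_field: str = "updated",
) -> Iterator[Record]:
    """Yield children embedded in the parent, paginating only if needed."""
    embedded = parent[children_field]
    children = list(embedded["data"])  # the first N records come with the parent
    if embedded.get("has_more") and children:
        # The embedded page was truncated: switch to normal pagination,
        # starting after the last embedded record.
        children += fetch_more(str(children[-1]["id"]))
    for child in children:
        enriched = dict(child)
        # Inject the parent's cursor value only when the child has none.
        enriched.setdefault(cursor_field, parent[cursor_field])
        yield enriched


# Usage with a fake pagination callback (sample data is made up).
requested_after: List[str] = []


def fake_fetch_more(last_id: str) -> List[Record]:
    requested_after.append(last_id)
    return [{"id": "li_2", "updated": 5}]


invoice = {
    "id": "in_1",
    "updated": 1700000000,
    "items": {"data": [{"id": "li_1"}], "has_more": True},
}
line_items = list(lazy_read_children(invoice, fake_fetch_more))
```

The first line item has no cursor of its own, so it receives the invoice's `updated` value; the second keeps its own.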

User Impact

No

Can this PR be safely reverted and rolled back?

  • YES 💚
  • NO ❌

vercel bot commented Feb 14, 2025

The latest updates on your projects.

Name | Status | Updated (UTC)
airbyte-docs | ✅ Ready | Mar 14, 2025 6:24pm

lazebnyi (Collaborator, Author) commented Feb 14, 2025

/format-fix

Format-fix job started... Check job output.

✅ Changes applied successfully. (1e0c4a8)

lazebnyi (Collaborator, Author) commented Mar 13, 2025

/format-fix

Format-fix job started... Check job output.

✅ Changes applied successfully. (d1d69b8)

…om:airbytehq/airbyte into lazebnyi/source-stripe-migrate-to-low-code
lazebnyi (Collaborator, Author) commented Mar 14, 2025

/format-fix

Format-fix job started... Check job output.

✅ Changes applied successfully. (3f56559)

…om:airbytehq/airbyte into lazebnyi/source-stripe-migrate-to-low-code
lazebnyi (Collaborator, Author) commented Mar 14, 2025

/format-fix

Format-fix job started... Check job output.

✅ Changes applied successfully. (2e0e5da)

lazebnyi (Collaborator, Author)

Regression tests look good: only auto-generated fields changed. See https://github.com/airbytehq/airbyte/actions/runs/13858784655/job/38781757496

maxi297 (Contributor) left a comment

LGTM! Should we run another round of regression tests for incremental syncs? What is the rollout plan?

@@ -211,7 +212,7 @@ def test_given_no_state_when_read_then_use_external_accounts_endpoint(self, http
         output = self._read(_config().with_start_date(_A_START_DATE), _NO_STATE)
         most_recent_state = output.most_recent_state
         assert most_recent_state.stream_descriptor == StreamDescriptor(name=_STREAM_NAME)
-        assert most_recent_state.stream_state == AirbyteStateBlob(updated=int(_NOW.timestamp()))
+        assert int(most_recent_state.stream_state.updated) == int(_NOW.timestamp())
Contributor

Does that mean that the format of the state has changed from int to str? Does that mean that we can't revert this change? We should at least be explicit about "Can this PR be safely reverted and rolled back?" in the PR description.

lazebnyi (Collaborator, Author) Mar 14, 2025

> Does that mean that the format of the state has changed from int to str?

Yes, the low-code CDK stores state values as strings.

> Does that mean that we can't revert this change?

I don't think so, but I need to double-check to be 100% sure.
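
For illustration, the comparison in the updated test can be sketched as a small helper that accepts either format. This is a sketch under assumptions, not code from the PR: the helper `cursor_as_int` is hypothetical, and it assumes the cursor is stored under `updated` as a Unix timestamp, either as an int or as a numeric string.

```python
from typing import Mapping, Optional, Union


def cursor_as_int(
    state: Mapping[str, Union[int, str]], cursor_field: str = "updated"
) -> Optional[int]:
    """Return the cursor as an int whether it was stored as int or str."""
    value = state.get(cursor_field)
    return None if value is None else int(value)


# State as written by the previous Python implementation (int)
# and as written by the low-code CDK (str).
old_format = {"updated": 1700000000}
new_format = {"updated": "1700000000"}
same_cursor = cursor_as_int(old_format) == cursor_as_int(new_format)
```

Reading both formats in this way would only matter for a rollback if the older implementation does not already tolerate string values, which the thread leaves open.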

natikgadzhi (Contributor) left a comment


Since you're 99% there, why not switch to manifest-only completely, leaving manifest.yaml at top level, and keeping unit tests?

Labels: area/connectors, area/documentation, connectors/source/stripe

4 participants