🎉 Destination Redshift: make purgeStagingData optional #8855

Merged: edgao merged 7 commits into master from edgao/redshift_no_purge_staging_data on Dec 17, 2021

Conversation

edgao (Contributor) commented Dec 16, 2021

What

Stacked onto #8607 while I wait for the acceptance test to finish.

Addresses #8552

How

S3StreamCopier now accepts an option to not delete staging data. This means that adding this capability to e.g. Snowflake should be pretty straightforward (or at least, will be once #8820 is done).
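
As a rough illustration of the idea (not the actual S3StreamCopier code), the sketch below shows a copier whose cleanup step deletes staged files only when a purgeStagingData flag is set. StagingStore, CopierConfig, SketchStreamCopier, and cleanUpStaging are hypothetical names invented for this example; only the purgeStagingData option name comes from this PR. It assumes Java 16+ for records.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the staging bucket; not Airbyte's S3 client.
interface StagingStore {
  void delete(String objectKey);
}

// Hypothetical config holder; only the purgeStagingData name comes from the PR.
record CopierConfig(boolean purgeStagingData) {}

// Simplified copier that remembers what it staged and cleans up only when asked to.
final class SketchStreamCopier {
  private final StagingStore store;
  private final CopierConfig config;
  private final List<String> stagedKeys = new ArrayList<>();

  SketchStreamCopier(StagingStore store, CopierConfig config) {
    this.store = store;
    this.config = config;
  }

  void stage(String objectKey) {
    stagedKeys.add(objectKey);
  }

  // Returns the keys that were deleted; the list is empty when purging is disabled.
  List<String> cleanUpStaging() {
    List<String> deleted = new ArrayList<>();
    if (config.purgeStagingData()) {
      for (String key : stagedKeys) {
        store.delete(key);
        deleted.add(key);
      }
    }
    stagedKeys.clear();
    return deleted;
  }
}
```

With purgeStagingData set to false, cleanUpStaging() leaves every staged object in place, which is the behaviour this PR makes available.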

Also, I think this gets us a tiny step closer to a happy future where we have some kind of consistent CopyDestinationConfig that gets used across redshift/snowflake/databricks/etc. Something along these lines:

{
  ... insert redshift/snowflake/etc properties...
  "load_strategy": {
    "type": "COPY", // or INSERT
    "purgeStagingData": true,
    "backing_storage": {
      "type": "S3", // GCS, azure blob store, etc
      ... insert S3DestinationConfig properties ...
    }
  }
}
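
To make the shape of that proposal more concrete, here is one way it might be modelled in Java. This is only a sketch: the PR does not define these types, so LoadStrategy, BackingStorage, LoadType, and StorageType are hypothetical, the storage-specific properties are left out, and the rule that COPY requires backing storage is an assumption. It assumes Java 16+ for records.

```java
// Hypothetical types mirroring the JSON sketch above; none of them exist in the PR.
enum LoadType { COPY, INSERT }

enum StorageType { S3, GCS, AZURE_BLOB }

// Storage-specific properties (e.g. the S3DestinationConfig fields) are omitted here.
record BackingStorage(StorageType type) {}

record LoadStrategy(LoadType type, boolean purgeStagingData, BackingStorage backingStorage) {
  LoadStrategy {
    // Assumption: a COPY load always stages data somewhere, so storage is mandatory.
    if (type == LoadType.COPY && backingStorage == null) {
      throw new IllegalArgumentException("COPY requires backing_storage");
    }
  }

  // Assumption: INSERT loads stage nothing, so there is nothing to purge.
  static LoadStrategy insert() {
    return new LoadStrategy(LoadType.INSERT, false, null);
  }
}
```

A single type like this would let each destination read the load strategy the same way instead of defining its own staging options.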

Recommended reading order

  1. S3StreamCopier + factory + test
  2. spec.json + RedshiftCopyS3Destination
  3. Everything else

🚨 User Impact 🚨

nope

Pre-merge Checklist

Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • Credentials added to GitHub CI. Instructions.
  • /test connector=connectors/<name> command is passing.
  • New Connector version released on Dockerhub by running the /publish command described here
  • After the new connector version is published, connector version bumped in the seed directory as described here
  • Seed specs have been re-generated by building the platform and committing the changes to the seed spec files, as described here

github-actions bot added the area/connectors and area/documentation labels Dec 16, 2021
edgao (Contributor Author) commented Dec 16, 2021

/test connector=connectors/destination-redshift

🕑 connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/1589561219
✅ connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/1589561219
Python tests coverage:

	 ---------- coverage: platform linux, python 3.8.10-final-0 -----------
	 Name                                                              Stmts   Miss  Cover
	 -------------------------------------------------------------------------------------
	 main_dev_transform_catalog.py                                         3      3     0%
	 main_dev_transform_config.py                                          3      3     0%
	 normalization/__init__.py                                             4      0   100%
	 normalization/destination_type.py                                    13      0   100%
	 normalization/transform_catalog/__init__.py                           2      0   100%
	 normalization/transform_catalog/catalog_processor.py                143     77    46%
	 normalization/transform_catalog/destination_name_transformer.py     124      6    95%
	 normalization/transform_catalog/reserved_keywords.py                 13      0   100%
	 normalization/transform_catalog/stream_processor.py                 494    313    37%
	 normalization/transform_catalog/table_name_registry.py              174     34    80%
	 normalization/transform_catalog/transform.py                         45     26    42%
	 normalization/transform_catalog/utils.py                             33      7    79%
	 normalization/transform_config/__init__.py                            2      0   100%
	 normalization/transform_config/transform.py                         146     32    78%
	 -------------------------------------------------------------------------------------
	 TOTAL                                                              1199    501    58%

edgao requested a review from sherifnada December 16, 2021 22:09
edgao marked this pull request as ready for review December 16, 2021 22:09
github-actions bot added the area/frontend and area/platform labels Dec 17, 2021
edgao force-pushed the edgao/redshift_no_purge_staging_data branch from 3aadba8 to 30bb078 December 17, 2021 00:18
github-actions bot removed the area/platform and area/frontend labels Dec 17, 2021

class S3StreamCopierFactoryTest {

@Nested
Contributor:

what's the use of @Nested in this context?

Contributor Author:

just to indicate that these tests are specific to the Config record, rather than the factory itself. might actually make more sense to just move the record into its own file though?

Base automatically changed from edgao/s3_based_stream_copier to master December 17, 2021 00:41
edgao force-pushed the edgao/redshift_no_purge_staging_data branch from 30bb078 to be8abd9 December 17, 2021 00:43
edgao (Contributor Author) commented Dec 17, 2021

/publish connector=connectors/destination-redshift

🕑 connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/1593330144
✅ connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/1593330144

edgao merged commit 86e08d0 into master Dec 17, 2021
edgao deleted the edgao/redshift_no_purge_staging_data branch December 17, 2021 20:42
schlattk pushed a commit to schlattk/airbyte that referenced this pull request Jan 4, 2022