✨ Source-LinkedIn-Ads: Performance improvements for Campaign Analytics Streams #55747

Closed

@onurmus onurmus commented Mar 13, 2025

What

When syncing Ad Campaign Analytics streams, the connector sends an excessive number of requests to the LinkedIn API. Some of them are unnecessary because we know beforehand that the API will return no data. The cause is that the connector creates the same slices for every campaign fetched incrementally, even though some campaigns are already COMPLETED, PAUSED, etc.

I have already created an issue for this: Slow Performance of Analytics Streams. You can find more details in the issue.

How

As this connector uses PerPartitionCursor, I extended that class to pass extra information about campaigns to the DatetimeBasedCursor. With this information, the extended DatetimeBasedCursor class filters the slices it creates.

To do so, I extended the partition routers of the Ad Campaign Analytics streams in metadata.yaml so that the parent stream also supplies the status, runSchedule and lastModified fields. I created an AnalyticsPerPartitionCursor that passes this extra information to the CampaignAnalyticsDatetimeBasedCursor, which then uses it while generating slices.
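
The slice filtering described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code from this PR: `CampaignInfo`, `date_slices`, `active_window` and `filter_slices` are hypothetical names, dates are simplified to `datetime.date`, and the filtering rule itself is an assumption (data can only exist inside the run schedule, and a non-ACTIVE campaign produces nothing after its last modification).

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterator, List, Optional, Tuple

Slice = Tuple[date, date]  # inclusive (start, end) window


@dataclass
class CampaignInfo:
    # Hypothetical container for the extra parent fields named in the PR.
    status: str
    run_start: date
    run_end: Optional[date]  # None when the run schedule is open-ended
    last_modified: date


def date_slices(start: date, end: date, step_days: int) -> Iterator[Slice]:
    """Yield consecutive inclusive windows covering [start, end]."""
    cursor = start
    while cursor <= end:
        window_end = min(cursor + timedelta(days=step_days - 1), end)
        yield cursor, window_end
        cursor = window_end + timedelta(days=1)


def active_window(c: CampaignInfo, today: date) -> Slice:
    """Assumed rule: data exists only inside the run schedule, and a
    non-ACTIVE campaign produces nothing after its last modification."""
    end = c.run_end or today
    if c.status != "ACTIVE":
        end = min(end, c.last_modified)
    return c.run_start, end


def filter_slices(slices: List[Slice], c: CampaignInfo, today: date) -> List[Slice]:
    lo, hi = active_window(c, today)
    # Keep only the windows that overlap the period in which data can exist.
    return [(s, e) for s, e in slices if s <= hi and e >= lo]


# Example: the same slices are generated for every campaign, then filtered.
today = date(2024, 3, 31)
slices = list(date_slices(date(2024, 1, 1), today, step_days=30))
completed = CampaignInfo("COMPLETED", date(2024, 1, 10), date(2024, 2, 10), date(2024, 2, 10))
kept = filter_slices(slices, completed, today)
```

In this example the completed campaign keeps only the windows that overlap its run schedule, so later windows trigger no request at all. The real rule lives in CampaignAnalyticsDatetimeBasedCursor.stream_slices and may differ from this assumption.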

I have also changed the state structure for the Ad Campaign Analytics streams. With the new structure, the state also keeps the latest values of status, lastModified and runSchedule for each campaign. This information is used in the next sync to decide which slices to request.
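
As a rough illustration of what such a state could look like, the sketch below stores campaign metadata next to each partition's cursor. The key names (`campaign_info`, `campaign_id`, `end_date`) and the `needs_resync` rule are assumptions made for this example; the PR does not show the actual state layout or decision logic.

```python
from typing import Any, Dict, Optional

# Hypothetical per-partition state entry; key names are illustrative only.
# Timestamps are epoch milliseconds.
example_state: Dict[str, Any] = {
    "states": [
        {
            "partition": {"campaign_id": 123},
            "cursor": {"end_date": "2025-03-01"},
            "campaign_info": {
                "status": "PAUSED",
                "lastModified": 1740787200000,
                "runSchedule": {"start": 1704067200000, "end": 1740787200000},
            },
        }
    ]
}


def stored_info(state: Dict[str, Any], campaign_id: int) -> Optional[Dict[str, Any]]:
    """Return the campaign metadata saved by the previous sync, if any."""
    for entry in state.get("states", []):
        if entry["partition"].get("campaign_id") == campaign_id:
            return entry.get("campaign_info")
    return None


def needs_resync(state: Dict[str, Any], campaign_id: int, current_last_modified: int) -> bool:
    """Assumed rule: request a campaign again only if it is new or has been
    modified since the metadata was stored."""
    info = stored_info(state, campaign_id)
    return info is None or current_last_modified > info["lastModified"]
```

Under this assumption, a paused campaign whose lastModified has not changed since the previous sync can be skipped entirely, while a campaign that is new or has been modified is sliced again.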

Review guide

Please check that the logic for filtering slices for campaigns is correctly defined in CampaignAnalyticsDatetimeBasedCursor.stream_slices.

User Impact

This change shortens the sync duration for Ad Campaign Analytics streams. For instance, our LinkedIn Ads account has over 1000 campaigns created after 2024-01-01. Previously, an incremental sync of any Ad Campaign Analytics stream took over 4 hours. With this change, it takes around 40-45 minutes.

Can this PR be safely reverted and rolled back?

  • YES 💚
  • NO ❌

vercel bot commented Mar 13, 2025

@onurmus is attempting to deploy a commit to the Airbyte Growth Team on Vercel.

A member of the Team first needs to authorize it.

CLAassistant commented Mar 13, 2025

CLA assistant check
All committers have signed the CLA.

Member

@marcosmarxm marcosmarxm left a comment

@onurmus can you allow maintainers to edit your branch? In addition to that please sign the CLA to continue the review process.

@natikgadzhi
Contributor

@agarctfi take this for a spin please

@onurmus
Author

onurmus commented Mar 19, 2025

@onurmus can you allow maintainers to edit your branch? In addition to that please sign the CLA to continue the review process.

@marcosmarxm I signed the CLA. The source branch is located in my company's fork, and company policy prevents me from granting edit access to anyone outside the organization. I'll open the same PR from my personal fork.

@marcosmarxm
Member

Closing while we wait for a new contribution from the user's fork branch.

5 participants