Yesterday we paused all of our MySQL to S3 connections for two hours. When we resumed them, we found that each of them immediately started to clear all of the data for each of the streams in our S3 buckets. Obviously, this has been painful to recover from and we are very interested in preventing this from happening again.
The following are logs from the server pod at the time that one of the connections was reenabled:
i.a.c.t.ConnectionManagerUtils(signalWorkflowAndRepairIfNecessary):95 - Retrieved existing connection manager workflow for connection 8f70b590-876f-422c-b39d-9c41767a8f52. Executing signal.
i.a.c.s.h.SchedulerHandler(createJob):600 - Found the following streams to reset for connection 8f70b590-876f-422c-b39d-9c41767a8f52: [io.airbyte.config.StreamDescriptor@3103115f[namespace=public,name=log,additionalProperties={}]]
i.a.c.h.ResourceRequirementsUtils(getResourceRequirementsForJobType):64 - Merged resource requirements. mergedResourceReqs=io.airbyte.config.ResourceRequirements@46983c1[cpuRequest=250m,cpuLimit=2,memoryRequest=2Gi,memoryLimit=2Gi,ephemeralStorageRequest=<null>,ephemeralStorageLimit=<null>,additionalProperties={}] connectionResourceReqs=null actorResourceReqs=null actorDefinitionResourceReqs=io.airbyte.config.ScopedResourceRequirements@78195c45[_default=<null>,jobSpecific=[io.airbyte.config.JobTypeResourceLimit@1bd523fd[jobType=sync,resourceRequirements=io.airbyte.config.ResourceRequirements@5040cf73[cpuRequest=<null>,cpuLimit=<null>,memoryRequest=2Gi,memoryLimit=2Gi,ephemeralStorageRequest=<null>,ephemeralStorageLimit=<null>,additionalProperties={}],additionalProperties={}]],additionalProperties={}] workerDefaultResourceReqs=io.airbyte.config.ResourceRequirements@18ff2931[cpuRequest=250m,cpuLimit=2,memoryRequest=512Mi,memoryLimit=8Gi,ephemeralStorageRequest=<null>,ephemeralStorageLimit=<null>,additionalProperties={}] jobType=sync
i.a.p.j.DefaultJobPersistence(enqueueJob):589 - enqueuing pending job for scope: 8f70b590-876f-422c-b39d-9c41767a8f52
i.a.c.s.h.ConnectionsHandler(applySchemaChange):1368 - Applying schema change for connection '8f70b590-876f-422c-b39d-9c41767a8f52' only
i.a.c.s.h.ConnectionsHandler(applySchemaChange):1414 - Sending notification of manually applying schema change for connectionId: '8f70b590-876f-422c-b39d-9c41767a8f52'
These logs appear to show that Airbyte detected a schema change, which triggered the data wipe. However, according to your documentation, this should not have occurred, because our setting is "approve all changes myself". No dialog box opened asking us to confirm this action; Airbyte simply took it upon itself to drop a few terabytes of data.
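For anyone checking whether their own connections were affected, here is a minimal sketch for pulling the connection ID and the streams marked for reset out of server log lines. It assumes the log format matches the `SchedulerHandler(createJob)` line quoted above. The function name `streams_marked_for_reset` and both regular expressions are our own illustration and are not part of Airbyte.

```python
import re

# Assumption: reset lines look like the SchedulerHandler(createJob) line quoted above.
RESET_LINE = re.compile(
    r"Found the following streams to reset for connection "
    r"(?P<connection>[0-9a-f-]{36}): \[(?P<streams>.*)\]\s*$"
)
# Each StreamDescriptor is printed with namespace=...,name=... fields.
STREAM = re.compile(r"namespace=(?P<namespace>[^,\]]*),name=(?P<name>[^,\]]*)")


def streams_marked_for_reset(log_lines):
    """Map connection ID -> list of (namespace, name) streams queued for reset."""
    found = {}
    for line in log_lines:
        match = RESET_LINE.search(line)
        if match is None:
            continue  # not a reset line
        streams = [
            (s.group("namespace"), s.group("name"))
            for s in STREAM.finditer(match.group("streams"))
        ]
        found.setdefault(match.group("connection"), []).extend(streams)
    return found


if __name__ == "__main__":
    sample = (
        "i.a.c.s.h.SchedulerHandler(createJob):600 - Found the following streams "
        "to reset for connection 8f70b590-876f-422c-b39d-9c41767a8f52: "
        "[io.airbyte.config.StreamDescriptor@3103115f"
        "[namespace=public,name=log,additionalProperties={}]]"
    )
    print(streams_marked_for_reset([sample]))
```

Running this over the full server pod log lists every connection for which a reset was queued and which streams it covered.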
The screenshot shows the timeline for one of our connections. As can be seen, it was disabled, reenabled, cleared the data, then synced.
We are using:
Airbyte 1.5.0 OSS
MySQL v3.7.1
S3 v0.6.1
Was this a bug? Did we have something incorrectly set? How are we supposed to pause this type of connection without risking massive data loss in the future?
Thanks,
Euan
Topic
Data loss