Skip to content

Fix CloudKit data loss on delete-then-reinsert#421

Open
jsutula wants to merge 2 commits into
pointfreeco:mainfrom
jsutula:fix-data-loss-on-delete-then-reinsert
Open

Fix CloudKit data loss on delete-then-reinsert#421
jsutula wants to merge 2 commits into
pointfreeco:mainfrom
jsutula:fix-data-loss-on-delete-then-reinsert

Conversation

@jsutula
Copy link
Copy Markdown
Contributor

@jsutula jsutula commented Mar 13, 2026

Fixes #418

Root cause

Reinsertion of a record after a delete does not produce a new pending .saveRecord change entry due to missing undelete/reinsert state detection in PrimaryKeyedTable.afterInsert trigger. No new .saveRecord means that only the .deleteRecord is propagated.

Solution

To fulfill the expected behavior of reinsertion resulting in a .saveRecord and assigning fresh timestamps to all fields, several changes are needed: one prerequisite change to test infrastructure MockSyncEngineState, and a four-part change to detect and handle reinsertion properly.

Prerequisite change necessary for writing accurate tests: Update MockSyncEngineState to deduplicate changes across types like the real CKSyncEngine.State

From CKSyncEngine.State.add(pendingRecordZoneChanges:) documentation:

Note
The order in which you apply record zone changes is important.
For example:

  • If you add .saveRecord(recordA) then .deleteRecord(recordA), the sync engine
    discards the save and sends only the delete change.
  • If you add .deleteRecord(recordA) then .saveRecord(recordA), the sync engine
    discards the delete and sends only the save change.

MockSyncEngineState.add(pendingRecordZoneChanges:) deduplicates fully identical changes (type+ID) by virtue of the underlying OrderedSet, but does not deduplicate when the type (save vs delete) is different, as is described above with CKSyncEngine.State. MockSyncEngineState.add(pendingDatabaseChanges:) has a similar issue. The fix: update both to deduplicate based on ID alone which will allow for the cross-type deduplication described above, and add tests.

Changes to detect and handle reinsertion:

1. Replace SyncMetadata._isDeleted with enum value SyncMetadata._pendingStatus

The _isDeleted field does not provide enough information about the current pending/transition status of the record. Introduce new enum value field _pendingStatus to replace _isDeleted with two cases: .deleted (soft-deleted) and .reinserted (row re-added while soft-deleted).

2. Detect reinsertion of soft-deleted record, update _pendingStatus and queue new .saveRecord change

Modify PrimaryKeyedTable.afterInsert trigger to update the SyncMetadata record when reinsertion of a soft-deleted row is detected (insert comes through while _pendingStatus is .deleted). Updates include:

  • Setting _pendingStatus to .reinserted
    • This status will be used to trigger full re-hydration of _lastKnownServerRecordAllFields to be based on local, reinserted state when preparing an outbound record (see #3 below) and before processing inbound server record (see #4 below).
  • Updating userModificationTime
    • Will be used as the modified time for all fields in the reinserted row when hydrating _lastKnownServerRecordAllFields

Hook up new SyncMetadata.afterReinsertTrigger to listen for reinsertions. This trigger is responsible for queuing a .saveRecord change via syncEngine.$didUpdate (the same as what happens after a normal insert or update).

3. In nextRecordZoneChangeBatch (outbound record preparation), ensure record is built with reinserted row values

When preparing an outbound record for a .reinserted row, use SyncMetadata.lastKnownServerRecord (system fields only) as the base for building the record. Using lastKnownServerRecord as the base (vs _lastKnownServerRecordAllFields which would contain stale user field data) ensures the resulting all fields record is built up entirely from the current contents of the reinserted row + reinsert userModificationTime.

Additionally, as a cleanup step, upon saving the newly built record back into the metadata row, reset metadata _pendingStatus to nil. This ensures this "rebuild record from reinserted row" behavior will occur only once per reinsertion.

4. In upsertFromServerRecord (inbound record processing), ensure local allFields state reflects reinserted row values prior to server record application

Since upsertFromServerRecord can occur before nextRecordZoneChangeBatch, we also need to ensure that the local metadata 'allFields' state is rebuilt and accurate when we encounter any .reinserted rows in this code path.

So before reconciling local metadata and user row with server record data, check if we're dealing with a .reinserted row, and if so, update the local allFields record to reflect all reinserted row values + reinsert userModificationTime. And similarly to the outbound path, clean up by setting _pendingStatus to nil at the conclusion of the upsert.

Testing

In addition to new unit tests, performed a manual test with 9ea5bdd applied to verify that the data loss issue no longer occurs. Result (compare with before):

Screen.Recording.2026-03-13.at.12.32.44.AM.mov

@jsutula jsutula force-pushed the fix-data-loss-on-delete-then-reinsert branch from 332871e to 1f5150a Compare March 13, 2026 02:17
@jsutula jsutula changed the title Fix CloudKit data loss on delete-then-reinsert (#418) Fix CloudKit data loss on delete-then-reinsert Mar 13, 2026
@jsutula jsutula force-pushed the fix-data-loss-on-delete-then-reinsert branch from 1f5150a to bdd291d Compare March 13, 2026 08:20
Comment thread Sources/SQLiteData/CloudKit/Internal/MockSyncEngine.swift
@jsutula
Copy link
Copy Markdown
Contributor Author

jsutula commented Mar 16, 2026

I've had a few rounds of refinement to make this a cleaner and more correct fix.
With the latest change to assign fresh timestamps to all fields on reinsertion, this should be fully ready for review now. Description is up to date.

.update {
$0._isDeleted = false
$0.userModificationTime = $currentTime()
$0._lastKnownServerRecordAllFields = #bind(nil)
Copy link
Copy Markdown
Contributor Author

@jsutula jsutula Mar 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One note here: clearing _lastKnownServerRecordAllFields may be too blunt an instrument to get the desired result of fresh timestamps for every field. It does guarantee that the next set of changes sent to the server will have fresh timestamps for all fields, but I believe also introduces a window of time (after reinsert but before successful submission of changes) where concurrent modifications from another device could be fetched and applied with no delta-filtering baseline, potentially overwriting locally-reinserted values.

A more direct solution may call for a separate field on SyncMetadata, something like _isReinserted: Bool. Or better yet: roll up _isDeleted and _isReinserted as cases in a new optional enum field PendingStatus. The status being reinserted could then be used to control outbound change inclusion behavior (include changes for all fields, each modified at the row's userModificationTime) and inbound filtering behavior (only take server field value if its modified time is greater than or equal to local row's userModificationTime). I'll wait for feedback before pursuing this in case there may be other options.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: I went with this approach laid out here in latest revision 08adaa0. PR description updated.

@aehlke
Copy link
Copy Markdown

aehlke commented May 9, 2026

This sounds like a pretty critical issue - easy to trigger and results in data loss

@nagra
Copy link
Copy Markdown

nagra commented May 31, 2026

Any updates on this? I'm pinned against this commit at the moment 😬

@jsutula
Copy link
Copy Markdown
Contributor Author

jsutula commented Jun 1, 2026

I haven't received feedback on this approach from Brandon or Stephen, but I know it's on their radar. This branch is overdue for a rebase which I'll do now.

@nagra Since this change modifies SyncMetadata schema (adds _pendingStatus, removes _isDeleted), be very cautious pinning against this for anything more than local development.

For example, if you’ve already pinned to this and distributed widely via TestFlight / App Store, but the final version of this fix ends up not needing the schema change, then in order to adopt the official fix you would have to spin up a fork + custom patch to ensure that _isDeleted is re-added if it was previously removed. And you’d potentially have to maintain this fork for a while - until you’re sure that all existing app installations have upgraded.

@jsutula jsutula force-pushed the fix-data-loss-on-delete-then-reinsert branch from 08adaa0 to 399663a Compare June 1, 2026 03:58
Comment on lines -144 to -150
try #sql(
"""
ALTER TABLE "\(raw: .sqliteDataCloudKitSchemaName)_metadata"
DROP COLUMN "_isDeleted"
"""
)
.execute(db)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nagra - after my rebase, I pushed this additional commit so that the old _isDeleted column is no longer dropped. This would make it marginally safer to pin to this commit and deploy, but I still wouldn't recommend it since the risk of schema divergence still remains, and it'd be a pain to clean that up in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CloudKit data loss after deleting then re-inserting record with the same UUID

3 participants