feat: cache iOS runner artifacts during prepare#688
Merged
Conversation
Size Report
Startup median (7 runs, lower is better):
Top changed chunks:
|
6e6a593 to
73713ab
Compare
thymikee
commented
Jun 4, 2026
Member
Author
thymikee
left a comment
There was a problem hiding this comment.
Review: prepare-owned Apple runner cache
Reviewed for regressions, perf, data-flow, platform parity, and API usage. This is a strong refactor — sharing the high-signal points.
What works well
- Cleaner data flow / decomposition.
runner-client.tsshrinks from ~611 lines to a thin facade; orchestration moves intorunner-lifecycle.ts, transport-error recovery intorunner-command-recovery.ts, and platform policy intoapple-runner-platform.ts. TheAppleRunnerProvidercontract (runCommand+ optionalprepare/prewarm) is a genuinely simpler seam — provider-backed scenarios can ownprepareinstead of faking anuptimecommand. - Platform parity. The table-driven
RUNNER_PLATFORM_PROFILES(iOS/tvOS/macOS) plus theisApplePlatformgate in the daemonpreparehandler makes future iPad/watchOS-style additions additive instead of conditional sprawl. The movedresolveRunner*helpers look behavior-equivalent to the originals. - Correct replay safety. The recovery path correctly distinguishes before-send failures (
Runner did not accept connection, readiness-preflight) — which it safely replays on a fresh session — from after-send transport loss, which goes throughstatus-probe recovery and never blindly replays a mutating command. That distinction is the crux of correctness here and it's preserved. - Cache correctness win. Folding Xcode/SDK fingerprints into the cache metadata means cached
.xctestrunartifacts now invalidate across toolchain upgrades — a real bug class avoided.
Worth addressing
failureReasonsemantics on a successful recovery (highest signal). InprepareLocalIosRunner, the bad-cache→rebuild→health-check-passes path returns a result that still carriesfailureReason: reason,recordPrepareResultthen logs it atwarn, and the daemon handler spreads...resultinto the--jsonresponse. So a prepare that ultimately succeeded surfaces afailureReasonand a warn-level log. CI/agents parsing the JSON could read that as a failure. Consider renaming torecoveredFromReason/recoveryReason(or omitting it on success) to keep "prepare succeeded" unambiguous.resolveExpectedRunnerCacheMetadatanow spawns subprocesses. It shells out toxcodebuild -versionand twoxcruncalls. It's memoized per-process viaappleToolFingerprintCache, which is the right call — but it converts a previously-pure metadata function into one doing blocking subprocess I/O on first use. Worth confirming this is never invoked on a latency-sensitive per-command path (it looks build/cache-key scoped, which is fine).- Schema version not bumped.
RUNNER_CACHE_SCHEMA_VERSIONstays1despite five new metadata fields. It still works (full-JSONcomparableRunnerCacheMetadataequality + the hash-derived cache key both naturally invalidate stale entries), but bumping it would make the invalidation intent explicit and avoid confusing "same schemaVersion, different shape" metadata in logs. - Order-sensitive metadata equality duplicated across two files. Reuse hinges on
JSON.stringify(comparableRunnerCacheMetadata(...))equality, and the derived cache key is a hash of the same. The CI.mjs(write-xcuitest-cache-metadata.mjs) must emit byte-identical key order, and this PR widens that implicit contract by 5 fields across both files. They currently match, but a sorted-key canonicalization would make this robust against future reordering. - Vestigial
clean-derived: "1". Withbuild-on-miss: "false", that input now only feeds the skipped "Build replay artifacts" step (env is scoped there, not job-level), so it's effectively dead inios.yml/replays-nightly.yml. Dropping it would avoid implying the prepare step cleans the restored cache (it correctly doesn't —AGENT_DEVICE_IOS_RUNNER_DERIVED_PATHis job-level andAGENT_DEVICE_IOS_CLEAN_DERIVEDis not, so the restore→reuse flow is intact).
Nothing here looks blocking; the failureReason-on-success naming is the one I'd most want resolved before merge.
Generated by Claude Code
Member
Author
|
Addressed the review feedback in
Validation:
|
952f43e to
5807622
Compare
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
prepare ios-runnernow owns Apple runner artifact restore/build/health behavior: cache hits launch from the restored.xctestrun, misses build through prepare, and bad restored artifacts are marked, cleaned, rebuilt once, and health-checked again.The runner path is now organized around an Apple runner runtime/provider contract with
runCommand, optionalprepare, and optional best-effortprewarm. Local XCUITest execution implements that runtime, while provider-backed integration scenarios can own prepare directly without reducing it to a rawuptimecommand.Apple platform policy moved out of the xctestrun artifact module into
apple-runner-platform.ts, so SDK names, derived-cache folders, xctestrun hints, and destination strings are table-driven for iOS, tvOS targets, and macOS. That keeps future iPad/watchOS-style expansion additive instead of spreading conditionals through artifact/cache code.The iOS CI and nightly replay workflows dogfood this path by restoring/saving runner cache and running
prepare ios-runner --timeout 300000 --jsonas the single build-or-health gate.Touched files: 23. Scope expanded from runner runtime to CI workflow/action wiring so this repo uses the new prepare-owned cache path.
Validation
pnpm check:quickpassed after the final refactor.Focused provider/runner coverage passed:
pnpm exec vitest run src/platforms/ios/__tests__/runner-provider.test.ts src/platforms/ios/__tests__/runner-client.test.ts src/platforms/ios/__tests__/runner-command-retry.test.ts test/integration/provider-scenarios/macos-desktop.test.ts test/integration/provider-scenarios/ios-lifecycle.test.ts.pnpm build:xcuitestpassed for iOS and macOS runner builds.pnpm check:unitpassed outside the sandbox. The sandboxed attempt failed on unrelated local listener/process restrictions such aslisten EPERM, so it was rerun with escalation.pnpm formatwas run; unrelated existing formatter churn was reverted so this diff stays scoped.Earlier PR validation also covered fallow, provider integration, smoke, and coverage for the cache/prepare implementation.