Persist since-versions on docgen for stable annotations #5401
Draft
shreyas-goenka wants to merge 1 commit into
`x-since-version` was recomputed from git history on every schema generation, keyed by the Go type path. That made versions drift: when a type is renamed or a shared struct is split into per-resource typed structs (e.g. permissions), the field re-keys and gets re-stamped with a newer version.

Persist the computed map as an append-only state file and treat stored entries as authoritative so a recorded version never changes, even across refactors.

The state lives on the `docgen` branch (next to jsonschema_for_docs.json), not in the main source tree:

- since_version.go reads/writes the state at the path in DATABRICKS_SINCE_VERSIONS_FILE. When unset (local `task generate`, regular CI) behavior is unchanged: versions are computed from history and nothing is persisted, so main stays clean and the workflow's "only the docs schema changed" assertion still holds.
- When set, computeSinceVersions loads the stored map, refreshes it from git history to discover new fields, merges with stored entries winning, and writes it back. sinceVersionAliases lets a renamed/retyped field inherit its previous key's version.
- The update-schema-docs workflow checks out docgen, points the env var at its since_versions.json, regenerates, and commits both the schema and the refreshed state back to docgen. It runs on every release tag, so the state is updated on each release; the first run seeds it from history.

The map is written deterministically (sorted, trailing newline) so it stays diff-stable. No existing annotation changes; this only prevents future drift.

Co-authored-by: Isaac
Problem
`x-since-version` is recomputed from git history on every schema generation (`computeSinceVersions`), keyed by the Go type path (`typePath.fieldName`). For common/shared types like permissions that means versions are unstable: when a type is renamed or a shared struct is split into per-resource typed structs (the `Permission` → `JobPermission`/`AppPermission`/… migration), the field re-keys and gets re-stamped with the version where the new type first appeared, even though the field itself is older (e.g. `apps.permissions` predates `AppPermission`).
Fix
Persist the computed map as an append-only state file and treat stored entries as authoritative — a recorded version never changes, even across refactors.
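To make "stored entries are authoritative" concrete, here is a minimal sketch of the merge rule, not the code from this PR. The function name `mergeSinceVersions`, the keys, and the version strings are all hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// mergeSinceVersions is a hypothetical helper: it returns a new map that keeps
// every stored entry unchanged and adds computed entries only for keys that
// were never recorded before (append-only semantics).
func mergeSinceVersions(stored, computed map[string]string) map[string]string {
	merged := make(map[string]string, len(stored)+len(computed))
	for key, version := range computed {
		merged[key] = version
	}
	for key, version := range stored {
		merged[key] = version // a recorded version always wins over a recomputed one
	}
	return merged
}

func main() {
	// Illustrative keys and versions, not taken from the real state file.
	stored := map[string]string{"resources.Job.permissions": "v0.229.0"}
	computed := map[string]string{
		"resources.Job.permissions": "v0.250.0", // re-stamped by a history recompute
		"resources.Job.new_field":   "v0.250.0", // newly discovered field
	}

	merged := mergeSinceVersions(stored, computed)
	fmt.Println(merged["resources.Job.permissions"]) // v0.229.0
	fmt.Println(merged["resources.Job.new_field"])   // v0.250.0
}
```

Under this rule a recompute can only add keys; it can never move an already-recorded version.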
The state lives on the `docgen` branch (next to `jsonschema_for_docs.json`), not in the main source tree:
- `since_version.go` reads/writes the state at the path in `DATABRICKS_SINCE_VERSIONS_FILE`. When unset (local `task generate`, regular CI), behavior is unchanged: versions are computed from history and nothing is persisted, so main stays clean and the workflow's "only the docs schema changed" assertion still holds.
- When set, `computeSinceVersions` loads the stored map, refreshes it from git history to discover new fields, merges with stored entries winning, and writes it back. `sinceVersionAliases` lets a renamed/retyped field inherit its previous key's version.
- The `update-schema-docs` workflow checks out `docgen`, points the env var at its `bundle/schema/since_versions.json`, regenerates, and commits both the schema and the refreshed state back to `docgen`. It runs on every release tag, so the state is updated on each release; the first run seeds it from history.

The map is written deterministically (sorted, trailing newline) so it stays diff-stable. No existing annotation changes; this only prevents future drift.
Why docgen (not main)
The state is an internal, machine-managed artifact; keeping it off main avoids cluttering the source tree and avoids tripping the workflow's "assert only `jsonschema_for_docs.json` changed on main" guard. It sits beside the published docs schema it annotates.
Notes / follow-ups
Test
- `go build` / `go vet` / `gofmt` clean.
- Ran `--docs` mode with `DATABRICKS_SINCE_VERSIONS_FILE` pointing at a seeded state file: a frozen stored version won in the output schema and the state was written back.

This pull request and its description were written by Isaac.