Feature/schema ownership #5806

Draft
gabepesco wants to merge 7 commits into SQLMesh:main from gabepesco:feature/schema-ownership

gabepesco commented May 21, 2026
Description

SQLMesh creates schemas, views, and tables owned by whichever user/SPN identity executes the plan. The after_all hook is the only existing correction point, but it only fires on successful completion — a failed or partial run leaves objects owned by the executing principal. Two cleanup paths are affected:

  • Virtual layer (sushi__dev_alice.*): the janitor drops dev environment schemas on expiry. Unity Catalog rejects the DROP if the janitor doesn't own the schema.
  • Physical layer (sqlmesh__sushi.*): the janitor drops orphaned versioned tables after snapshot_ttl (default 1 week). Same ownership problem.

This PR adds an ownership config block that sets owner principals at object-creation time, so even a partially-completed run leaves objects in a manageable state.

Config API:

ownership:
  environment_owner_mapping:
    "^prod$": "svc_prod_spn"
    ".*": "group:shared-developers"
  physical_owner: "group:data-platform"
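A minimal sketch of how a mapping like the one above could be resolved, assuming patterns are tried in declaration order with `re.match` and the first hit wins. The standalone `resolve_owner` function and its signature are hypothetical stand-ins for illustration, not the PR's actual `OwnershipConfig.resolve_owner()` implementation.

```python
import re
from typing import Mapping, Optional


def resolve_owner(mapping: Mapping[str, str], environment: str) -> Optional[str]:
    """Return the owner for the first pattern matching the environment name.

    Hypothetical helper: assumes re.match semantics and that the mapping
    preserves declaration order (true for Python dicts).
    """
    for pattern, owner in mapping.items():
        if re.match(pattern, environment):
            return owner
    return None  # no pattern matched, so no owner is applied


mapping = {
    "^prod$": "svc_prod_spn",
    ".*": "group:shared-developers",
}

print(resolve_owner(mapping, "prod"))       # svc_prod_spn
print(resolve_owner(mapping, "dev_alice"))  # group:shared-developers
print(resolve_owner({}, "prod"))            # None
```

Because `".*"` matches every name, it only works as a catch-all when it is listed last; placing it first would shadow the `"^prod$"` entry.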

How it works:

  • OwnershipConfig maps environment name regex patterns to owner principals for the virtual layer (first match wins, same style as environment_catalog_mapping), plus a plain physical_owner string for the physical layer.
  • Virtual layer: the resolved owner is passed into SnapshotEvaluator.promote() and applied immediately after each CREATE SCHEMA and CREATE OR REPLACE VIEW.
  • Physical layer: physical_owner is passed into SnapshotEvaluator.create() and create_physical_schemas(), applied after each physical table and schema is created. ViewKind snapshots are skipped (their ownership is managed by the virtual layer).
  • alter_schema_owner(), alter_view_owner(), and alter_table_owner() are no-ops on the base adapter. SparkEngineAdapter (covering Databricks Unity Catalog) issues the corresponding ALTER ... OWNER TO ... DDL with properly backtick-quoted principals, handling names that contain : and @ (Unity Catalog group format).
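The DDL described in the last bullet can be sketched roughly as below. This is an illustration under assumptions, not the adapter code from this PR: `quote_ident` and `alter_owner_sql` are hypothetical helpers, the table name `orders` is made up, and doubling embedded backticks is assumed as the escaping rule.

```python
from typing import Sequence


def quote_ident(part: str) -> str:
    """Backtick-quote one identifier or principal, doubling embedded backticks."""
    return "`" + part.replace("`", "``") + "`"


def alter_owner_sql(object_type: str, name_parts: Sequence[str], owner: str) -> str:
    """Build an ALTER ... OWNER TO ... statement (hypothetical helper)."""
    if object_type not in {"SCHEMA", "VIEW", "TABLE"}:
        raise ValueError(f"unsupported object type: {object_type}")
    qualified = ".".join(quote_ident(p) for p in name_parts)
    return f"ALTER {object_type} {qualified} OWNER TO {quote_ident(owner)}"


print(alter_owner_sql("SCHEMA", ["sushi__dev_alice"], "group:shared-developers"))
# ALTER SCHEMA `sushi__dev_alice` OWNER TO `group:shared-developers`
print(alter_owner_sql("TABLE", ["sqlmesh__sushi", "orders"], "alice@example.com"))
# ALTER TABLE `sqlmesh__sushi`.`orders` OWNER TO `alice@example.com`
```

Quoting the principal unconditionally keeps names containing `:` or `@` from being parsed as separate tokens.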

Test plan:

  • Unit tests for OwnershipConfig covering resolve_owner() pattern matching, first-match-wins ordering, case sensitivity, dict deserialization, update_with merge behavior, physical_owner field, and defaults.
  • Mocked-adapter tests for SnapshotEvaluator.promote() verifying alter_schema_owner and alter_view_owner are called correctly when owner is set and skipped when omitted.
  • Mocked-adapter tests for SnapshotEvaluator.create() and create_physical_schemas() verifying alter_table_owner and alter_schema_owner fire for non-view tables and are suppressed for ViewKind snapshots and when no owner is configured.
  • SQL generation tests for SparkEngineAdapter verifying correct backtick-quoted ALTER SCHEMA/VIEW/TABLE ... OWNER TO ... DDL, including principals with : and @. Includes a base adapter no-op test confirming DuckDBEngineAdapter emits no OWNER-related SQL.
  • Integration smoke test (@pytest.mark.slow) running a full prod + dev plan/apply cycle with ownership config against DuckDB, verifying no errors and correct schema creation.
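The mocked-adapter pattern in the test plan might look like the following sketch. `promote_view` is a hypothetical, simplified stand-in for the promote path, and the adapter method names `create_schema` and `create_view` are assumed here only so the example is self-contained; the real tests exercise `SnapshotEvaluator.promote()`.

```python
from typing import Optional
from unittest.mock import MagicMock


def promote_view(adapter, schema: str, view: str, owner: Optional[str] = None) -> None:
    """Hypothetical promote path: create each object, then set its owner."""
    adapter.create_schema(schema)
    if owner:
        adapter.alter_schema_owner(schema, owner)
    adapter.create_view(view)
    if owner:
        adapter.alter_view_owner(view, owner)


def test_owner_applied_when_set() -> None:
    adapter = MagicMock()
    promote_view(adapter, "sushi__dev_alice", "sushi__dev_alice.orders",
                 owner="group:shared-developers")
    adapter.alter_schema_owner.assert_called_once_with(
        "sushi__dev_alice", "group:shared-developers")
    adapter.alter_view_owner.assert_called_once_with(
        "sushi__dev_alice.orders", "group:shared-developers")


def test_owner_skipped_when_omitted() -> None:
    adapter = MagicMock()
    promote_view(adapter, "sushi__dev_alice", "sushi__dev_alice.orders")
    adapter.alter_schema_owner.assert_not_called()
    adapter.alter_view_owner.assert_not_called()


test_owner_applied_when_set()
test_owner_skipped_when_omitted()
```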

Note: ALTER ... OWNER TO is not exercised against a live Databricks Unity Catalog connection here; the SQL generation tests only verify that the expected DDL is produced.

Checklist:

  • I have run make style and fixed any issues
  • I have added tests for my changes (if applicable)
  • All existing tests pass (make fast-test)
  • My commits are signed off (git commit -s) per the DCO
Gabe Pesco and others added 7 commits May 21, 2026 12:14
Signed-off-by: Gabe Pesco <PescoG@medinsight.milliman.com>
Add `physical_owner` field to `OwnershipConfig` so that `sqlmesh__*`
physical tables get ownership applied at creation time, not just
virtual-layer views and schemas.

* `OwnershipConfig.physical_owner` - optional plain string, no env-pattern matching needed
* `EngineAdapterBase.alter_table_owner` - no-op default
* `SparkEngineAdapter.alter_table_owner` - ALTER TABLE ... OWNER TO ... with backtick-quoting
* `SnapshotEvaluator.create_snapshot` - calls alter_table_owner after _execute_create (skips ViewKind)
* `SnapshotEvaluator.create` / `create_physical_schemas` - accept and thread owner param
* `BuiltInPlanEvaluator` - resolves physical_owner in both PhysicalLayerSchemaCreation and PhysicalLayerUpdate stages

Signed-off-by: Gabe Pesco <gabe.pesco@milliman.com>
Signed-off-by: Gabe Pesco <PescoG@medinsight.milliman.com>
Run ruff-format on changed files. Remove DuckDB attach assertions
accidentally included in test_config_ownership_defaults_to_empty —
those are already covered by test_load_duckdb_attach_config.

Signed-off-by: Gabe Pesco <gabe.pesco@milliman.com>
Signed-off-by: Gabe Pesco <PescoG@medinsight.milliman.com>
Signed-off-by: Gabe Pesco <PescoG@medinsight.milliman.com>
- OwnershipConfig gains environment_owner_resolver and
  physical_owner_resolver callable fields for cases where the owner
  principal must be resolved at plan execution time (e.g. dynamic SPN
  identity via adapter.current_user()).
- Callable resolvers take precedence over the static mapping/string
  fields when both are set.
- resolve_owner() and resolve_physical_owner() now accept the active
  EngineAdapter so callables can query the connection.
- Add current_user() to the base EngineAdapter (SELECT CURRENT_USER()).
- Replace the getattr/isinstance guard in BuiltInSchedulerConfig with
  OwnershipConfig.is_active.
- Update all resolve_owner/resolve_physical_owner call sites in
  BuiltInPlanEvaluator to thread the adapter through.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
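
The resolver precedence described in the last commit can be sketched as follows. This is a simplified illustration under assumptions: `OwnershipSketch` and `FakeAdapter` are hypothetical classes, only the physical-layer fields are modelled, and the resolver is assumed to take the adapter as its sole argument.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class FakeAdapter:
    """Hypothetical adapter; a real one would run SELECT CURRENT_USER()."""

    def current_user(self) -> str:
        return "svc_runtime_spn"


@dataclass
class OwnershipSketch:
    physical_owner: Optional[str] = None
    physical_owner_resolver: Optional[Callable[[FakeAdapter], str]] = None

    def resolve_physical_owner(self, adapter: FakeAdapter) -> Optional[str]:
        # The callable wins over the static string when both are set.
        if self.physical_owner_resolver is not None:
            return self.physical_owner_resolver(adapter)
        return self.physical_owner


adapter = FakeAdapter()

static_only = OwnershipSketch(physical_owner="group:data-platform")
print(static_only.resolve_physical_owner(adapter))  # group:data-platform

both = OwnershipSketch(
    physical_owner="group:data-platform",
    physical_owner_resolver=lambda a: a.current_user(),
)
print(both.resolve_physical_owner(adapter))  # svc_runtime_spn
```

Passing the adapter into the resolver is what lets the owner be looked up at plan execution time rather than fixed in the config.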