Enterprise capability

Data Loads

Ingest source data into governed master entities using SQL-driven definitions, merge logic, and fully traceable execution metadata.

What Data Loads are for

  • Controlled source onboarding from SQL-accessible systems into master entities.
  • Repeatable load definitions stored as YAML and executed with explicit load IDs.
  • Safer merges through key-based matching, explicit update policies, and optional soft-delete handling for rows missing from the source.
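
To make the idea of a load definition concrete, here is a minimal Python sketch of the kind of information such a definition might carry. Every field name and policy value below (load_id, identity_columns, update_policy, missing_rows, and so on) is an assumption made for illustration; the actual YAML schema is not described on this page and may differ.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical policy values, chosen for illustration only.
UPDATE_POLICIES = {"always", "never", "if_null"}
MISSING_ROW_POLICIES = {"ignore", "soft_delete"}


@dataclass(frozen=True)
class LoadDefinition:
    """Illustrative stand-in for a YAML load definition (field names are assumed)."""
    load_id: str                       # explicit ID recorded with every run
    entity: str                        # target master entity
    query: str                         # SQL executed against the source system
    identity_columns: Tuple[str, ...]  # columns used for key-based matching
    update_policy: str = "always"
    missing_rows: str = "ignore"

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the definition is usable."""
        problems = []
        if not self.load_id:
            problems.append("load_id is required")
        if not self.identity_columns:
            problems.append("at least one identity column is required")
        if self.update_policy not in UPDATE_POLICIES:
            problems.append(f"unknown update_policy: {self.update_policy}")
        if self.missing_rows not in MISSING_ROW_POLICIES:
            problems.append(f"unknown missing_rows policy: {self.missing_rows}")
        return problems


definition = LoadDefinition(
    load_id="customers_nightly",
    entity="Customer",
    query="SELECT customer_no, name, country FROM crm_customers",
    identity_columns=("customer_no",),
    update_policy="if_null",
    missing_rows="soft_delete",
)
problems = definition.validate()
```

Validating the definition before any SQL runs keeps mistakes such as a missing identity column out of the merge step entirely.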

How it works in practice

Step        | Action                                                      | Outcome
1. Define   | Set query SQL, identity columns, and update policy in YAML. | Clear and versionable load contract.
2. Preview  | Run query preview before execution.                         | Source shape and values are validated early.
3. Run      | Execute insert/update merge with run tracking.              | Inserted/updated/skipped/error counts per load.
4. Trace    | Inspect run errors and lineage fields (source/load ID).     | Audit-ready operational visibility.
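
The preview, run, and trace steps can be sketched with Python's standard sqlite3 module. This is an illustration under stated assumptions, not the product's implementation: the tables (src_customers, customer), the load_id lineage column, and the preview and run_load helpers are all hypothetical names invented for the example.

```python
import sqlite3


def preview(conn, query, limit=5):
    """Preview step: return column names and a few sample rows without writing anything."""
    cur = conn.execute(f"SELECT * FROM ({query}) LIMIT ?", (limit,))
    return [d[0] for d in cur.description], cur.fetchall()


def run_load(conn, load_id, query, target, key, columns):
    """Run step: key-based insert/update merge that tracks counts and errors."""
    counts = {"inserted": 0, "updated": 0, "skipped": 0, "errors": 0}
    errors = []
    cur = conn.execute(query)
    names = [d[0] for d in cur.description]
    col_list = ", ".join(columns)
    for raw in cur.fetchall():
        row = dict(zip(names, raw))
        try:
            if row[key] is None:
                raise ValueError("identity column is null")
            values = tuple(row[c] for c in columns)
            existing = conn.execute(
                f"SELECT {col_list} FROM {target} WHERE {key} = ?", (row[key],)
            ).fetchone()
            if existing is None:
                marks = ", ".join("?" for _ in range(len(columns) + 2))
                conn.execute(
                    f"INSERT INTO {target} ({key}, {col_list}, load_id) VALUES ({marks})",
                    (row[key], *values, load_id),
                )
                counts["inserted"] += 1
            elif tuple(existing) == values:
                counts["skipped"] += 1  # nothing changed, keep the old lineage
            else:
                assignments = ", ".join(f"{c} = ?" for c in columns)
                conn.execute(
                    f"UPDATE {target} SET {assignments}, load_id = ? WHERE {key} = ?",
                    (*values, load_id, row[key]),
                )
                counts["updated"] += 1
        except Exception as exc:
            counts["errors"] += 1
            errors.append((row.get(key), str(exc)))
    conn.commit()
    return counts, errors


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_customers (customer_no TEXT, name TEXT, country TEXT);
    CREATE TABLE customer (customer_no TEXT PRIMARY KEY, name TEXT, country TEXT, load_id TEXT);
    INSERT INTO src_customers VALUES ('C1', 'Acme', 'DE'), ('C2', 'Globex', 'US'),
                                     ('C3', 'Initech', 'US'), (NULL, 'Orphan', 'FR');
    INSERT INTO customer VALUES ('C1', 'Acme', 'DE', 'seed'), ('C2', 'Globex Corp', 'US', 'seed');
""")
query = "SELECT customer_no, name, country FROM src_customers"
preview_columns, sample = preview(conn, query)
counts, errors = run_load(conn, "customers_nightly", query, "customer",
                          "customer_no", ["name", "country"])
# Trace step: the lineage column shows which load last wrote each record.
lineage = conn.execute(
    "SELECT customer_no, load_id FROM customer ORDER BY customer_no"
).fetchall()
```

In this sketch, identifiers such as the target table and key column are interpolated into SQL strings, so they would need to come from a trusted definition rather than user input.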

Key capabilities

  • Merge identity mapping on entity columns, which can be referenced by either logical or display name.
  • Column-level overwrite control: always overwrite, never overwrite, or overwrite only when target is null.
  • Null handling policy to either ignore or apply nulls from source.
  • Missing-row policy to ignore or soft-delete records absent from the source snapshot.
  • Approval integration with optional bypass-to-approved mode for governed entities.
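
The overwrite, null-handling, and missing-row policies listed above can be read as a small set of per-column rules. The following Python sketch shows one plausible interpretation; the policy names ("always", "never", "if_null", "soft_delete"), the is_deleted flag, and both helper functions are assumptions for illustration rather than the product's actual behavior.

```python
def merge_record(target, source, overwrite, apply_nulls=False):
    """Merge one source row into an existing target record and return a new dict.

    overwrite maps column -> "always" | "never" | "if_null" (default "always").
    apply_nulls=False ignores null source values; True lets them overwrite.
    """
    merged = dict(target)
    for column, value in source.items():
        if value is None and not apply_nulls:
            continue  # null handling: ignore nulls coming from the source
        policy = overwrite.get(column, "always")
        if policy == "never":
            continue  # never overwrite the existing target value
        if policy == "if_null" and target.get(column) is not None:
            continue  # overwrite only when the target is null
        merged[column] = value
    return merged


def apply_missing_row_policy(targets, source_keys, policy="ignore"):
    """Flag target records whose key is absent from the source snapshot."""
    if policy != "soft_delete":
        return dict(targets)
    return {
        key: rec if key in source_keys else {**rec, "is_deleted": True}
        for key, rec in targets.items()
    }


target = {"name": "Acme GmbH", "country": None, "phone": "123"}
source = {"name": "ACME", "country": "DE", "phone": None}
overwrite = {"name": "never", "country": "if_null", "phone": "always"}

ignoring_nulls = merge_record(target, source, overwrite)
applying_nulls = merge_record(target, source, overwrite, apply_nulls=True)

records = {"C1": {"name": "Acme"}, "C2": {"name": "Globex"}}
after_snapshot = apply_missing_row_policy(records, {"C1"}, policy="soft_delete")
```

With nulls ignored, the phone number survives because the source value is null; with nulls applied, the "always" policy on phone clears it. In both cases name is untouched and country is filled in because the target value was null.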

When teams use Data Loads first

  • Migrating reference entities from legacy operational databases.
  • Creating repeatable nightly synchronization of customer/supplier dimensions.
  • Standardizing source-to-master enrichment before analytics and downstream integrations.
