Enterprise capability
Data Loads
Ingest source data into governed master entities using SQL-driven definitions, merge logic, and fully traceable execution metadata.
What Data Loads are for
- Controlled source onboarding from SQL-accessible systems into master entities.
- Repeatable load definitions stored as YAML and executed with explicit load IDs.
- Controlled merge behavior: key-based matching, explicit update policies, and optional soft-delete handling for rows missing from the source.
How it works in practice
| Step | Action | Outcome |
|---|---|---|
| 1. Define | Set query SQL, identity columns, and update policy in YAML. | Clear and versionable load contract. |
| 2. Preview | Run query preview before execution. | Source shape and values are validated early. |
| 3. Run | Execute insert/update merge with run tracking. | Inserted/updated/skipped/error counts per load. |
| 4. Trace | Inspect run errors and lineage fields (source/load ID). | Audit-ready operational visibility. |
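The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LOAD` dictionary stands in for a YAML load definition, and its field names (`load_id`, `query`, `identity`), the `preview` and `run` helpers, the `src_customers` table, and the in-memory `master` dictionary are all hypothetical, not the product's actual schema or API. SQLite is used purely as a stand-in for a SQL-accessible source.

```python
import sqlite3

# Hypothetical load definition. The field names are illustrative only and
# are not the product's real YAML schema.
LOAD = {
    "load_id": "customers_nightly",
    "query": "SELECT code, name FROM src_customers",
    "identity": ["code"],
}

def preview(conn, load, limit=5):
    """Step 2: return the column names and the first few rows of the source query."""
    cur = conn.execute(f"SELECT * FROM ({load['query']}) LIMIT {int(limit)}")
    columns = [d[0] for d in cur.description]
    return columns, cur.fetchall()

def run(conn, load, master):
    """Step 3: merge source rows into `master` (a dict keyed by identity tuple).

    Returns inserted/updated/skipped/errors counts for this run.
    """
    counts = {"inserted": 0, "updated": 0, "skipped": 0, "errors": 0}
    cur = conn.execute(load["query"])
    columns = [d[0] for d in cur.description]
    for raw in cur.fetchall():
        row = dict(zip(columns, raw))
        key = tuple(row[c] for c in load["identity"])
        if any(part is None for part in key):
            counts["errors"] += 1  # a row without identity values cannot be matched
            continue
        existing = master.get(key)
        if existing is None:
            master[key] = {**row, "load_id": load["load_id"]}  # lineage field
            counts["inserted"] += 1
        elif any(existing.get(c) != row[c] for c in columns):
            master[key] = {**existing, **row, "load_id": load["load_id"]}
            counts["updated"] += 1
        else:
            counts["skipped"] += 1  # source matches target, nothing to write
    return counts
```

Step 4 then amounts to reading the returned counts and the `load_id` stamped on each written row. A real run would persist both alongside per-row error details rather than keep them in memory.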
Key capabilities
- Merge identity mapped to entity columns, which can be referenced by logical or display name.
- Column-level overwrite control: always overwrite, never overwrite, or overwrite only when target is null.
- Null handling policy to either ignore or apply nulls from source.
- Missing-row policy to ignore or soft-delete records absent from the source snapshot.
- Approval integration with optional bypass-to-approved mode for governed entities.
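The overwrite, null-handling, and missing-row policies listed above can be expressed as a small pure function. The sketch below is a hypothetical illustration, not the product's implementation: the `Overwrite` enum, `merge_row`, `soft_delete_missing`, the `is_deleted` flag, and the choice to default unlisted columns to "always overwrite" are all assumptions made for the example.

```python
from enum import Enum

class Overwrite(Enum):
    ALWAYS = "always"
    NEVER = "never"
    IF_TARGET_NULL = "if_target_null"

def merge_row(target, source, policies, apply_nulls=False):
    """Return a copy of an existing `target` row updated from `source`.

    `policies` maps column name to an Overwrite value. Columns without an
    explicit policy default to ALWAYS (an assumption of this sketch).
    """
    merged = dict(target)
    for column, value in source.items():
        if value is None and not apply_nulls:
            continue  # null policy "ignore": source nulls never change the target
        policy = policies.get(column, Overwrite.ALWAYS)
        if policy is Overwrite.NEVER:
            continue  # keep whatever the target already holds
        if policy is Overwrite.IF_TARGET_NULL and target.get(column) is not None:
            continue  # target already has a value, so leave it alone
        merged[column] = value
    return merged

def soft_delete_missing(master, source_keys):
    """Flag master rows whose key is absent from the source snapshot.

    Rows are marked rather than removed. Returns the list of flagged keys.
    """
    missing = [key for key in master if key not in source_keys]
    for key in missing:
        master[key]["is_deleted"] = True
    return missing
```

For example, with `name` set to always overwrite, `phone` to overwrite only when the target is null, and `tier` to never overwrite, an incoming row can change `name`, fill an empty `phone`, and leave `tier` untouched, while a null `email` from the source is ignored unless `apply_nulls=True`.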
When teams use Data Loads first
- Migrating reference entities from legacy operational databases.
- Setting up a repeatable nightly synchronization of customer and supplier dimensions.
- Standardizing source-to-master enrichment before analytics and downstream integrations.