
## Metadata
- Author:: [[shirshanka-das|Shirshanka Das]]
- Full Title:: Data Contracts Wrapped 🎁 2022
- Category:: #🗞️Articles, [[Data contracts|Data contracts]]
- URL:: https://blog.datahubproject.io/data-contracts-wrapped-2022-470e0c43365d
- Finished date:: [[2022-12-25]]
## Highlights
- A data contract is:
    1. a data quality specification for a dataset that describes
        - schema of the dataset
        - simple column-oriented quality rules (is_not_null, values_between etc.)
    2. anchored around a dataset which is typically a physical event stream produced by a team / domain
    3. also a carrier for semantic aka business metadata
        - ownership
        - classification tags
    4. can also describe operational SLO-s
        - freshness goals (e.g. must be available for processing by 7am in the warehouse)
    5. can also be a specifier for provisioning configuration for a dataset (e.g. provision this dataset on Kafka and BigQuery) ([View Highlight](https://read.readwise.io/read/01gmw6f59td3grm71d8cpbjha4))
- whether there is a specific reason the community is hyper-focused on attaching data contracts only to this edge. One of the reasons this has happened is that a lot of the intention behind the conversation around data contracts is to shift responsibility and accountability for data to the application teams that produce it ([View Highlight](https://read.readwise.io/read/01gmw6gvxnqs932xb4e7r34d29))
- Use a git-based process for creating and managing the lifecycle of a data contract ([View Highlight](https://read.readwise.io/read/01gmw6sd8pec6nk9t5pzthy3m6))
- shift-left approaches to metadata ([View Highlight](https://read.readwise.io/read/01gmw6tvn4tw34c3krfd0xtcsh))
- Note: The practice of filling metadata as close to the source as possible
- Similar techniques for batch datasets exist using the staging → publish table pattern ([View Highlight](https://read.readwise.io/read/01gmw6xw7y23t1h748s937fcth))
- Data producers should own the specification of the contract ([View Highlight](https://read.readwise.io/read/01gmw6zr5fmxcktrh0qskk7x9t))
- They should be accountable for all the characteristics of the dataset as laid out in the contract ([View Highlight](https://read.readwise.io/read/01gmw6zw4zeh591brgap39zg8d))
- Data Contracts are currently closely aligned to the Source Aligned Data Products as laid out by Zhamak. ([View Highlight](https://read.readwise.io/read/01gmw71fj4v0qh8n0rtcycwykp))
- Note: [[Data mesh|Data mesh]]
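
Own sketch (not from the article): the five-part definition in the first highlight, combined with the staging → publish pattern, could be modelled roughly as below. Every name here (`DataContract`, `ColumnRule`, `publish_if_valid`, the `orders` dataset, the `checkout-team` owner) is hypothetical and chosen only for illustration. Only the rule names `is_not_null` and `values_between` come from the highlight. Treating nulls as passing `values_between` is an assumption, not something the article specifies.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class ColumnRule:
    """A simple column-oriented quality rule (hypothetical structure)."""
    column: str
    name: str
    check: Callable[[Any], bool]


def is_not_null(column: str) -> ColumnRule:
    return ColumnRule(column, "is_not_null", lambda v: v is not None)


def values_between(column: str, low: float, high: float) -> ColumnRule:
    # Assumption: nulls pass this rule; is_not_null is responsible for them.
    return ColumnRule(column, "values_between",
                      lambda v: v is None or low <= v <= high)


@dataclass
class DataContract:
    dataset: str                      # the dataset the contract is anchored to
    owner: str                        # business metadata: ownership
    schema: Dict[str, type]           # column name -> expected Python type
    rules: List[ColumnRule] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)          # classification tags
    freshness_deadline: str = ""      # operational SLO, e.g. "07:00"
    provision_on: List[str] = field(default_factory=list)  # e.g. ["kafka"]

    def violations(self, rows: List[Dict[str, Any]]) -> List[str]:
        """Return human-readable problems; an empty list means the rows comply."""
        problems = []
        for i, row in enumerate(rows):
            well_typed = set()
            for col, typ in self.schema.items():
                if col not in row:
                    problems.append(f"row {i}: missing column {col}")
                elif row[col] is not None and not isinstance(row[col], typ):
                    problems.append(f"row {i}: {col} is not {typ.__name__}")
                else:
                    well_typed.add(col)
            # Quality rules only run on columns that passed the schema check.
            for rule in self.rules:
                if rule.column in well_typed and not rule.check(row[rule.column]):
                    problems.append(f"row {i}: {rule.column} failed {rule.name}")
        return problems


def publish_if_valid(contract: DataContract,
                     staging_rows: List[Dict[str, Any]],
                     published: List[Dict[str, Any]]) -> List[str]:
    """Staging -> publish: promote the batch only if it has no violations."""
    problems = contract.violations(staging_rows)
    if not problems:
        published.extend(staging_rows)
    return problems


contract = DataContract(
    dataset="orders",
    owner="checkout-team",
    schema={"order_id": str, "amount": float},
    rules=[is_not_null("order_id"), values_between("amount", 0.0, 10000.0)],
    tags=["pii:none"],
    freshness_deadline="07:00",
    provision_on=["kafka", "bigquery"],
)
```

In this sketch the producing team would keep the `contract` definition in its own git repository, which matches the highlights about git-based lifecycle management and producer ownership. A batch that violates the contract stays in staging and never reaches consumers.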
