![rw-book-cover](https://miro.medium.com/max/1200/1*sUE71_GMQheL0zRnSaMl_w.png)

## Metadata
- Author:: [[shirshanka-das|Shirshanka Das]]
- Full Title:: Data Contracts Wrapped 🎁 2022
- Category:: #🗞️Articles, [[Data contracts|Data contracts]]
- URL:: https://blog.datahubproject.io/data-contracts-wrapped-2022-470e0c43365d
- Finished date:: [[2022-12-25]]

## Highlights
- A data contract is:
    1. a data quality specification for a dataset that describes
        - schema of the dataset
        - simple column-oriented quality rules (is_not_null, values_between etc.)
    2. anchored around a dataset which is typically a physical event stream produced by a team / domain
    3. also a carrier for semantic aka business metadata
        - ownership
        - classification tags
    4. can also describe operational SLO-s
        - freshness goals (e.g. must be available for processing by 7am in the warehouse)
    5. can also be a specifier for provisioning configuration for a dataset (e.g. provision this dataset on Kafka and BigQuery) ([View Highlight](https://read.readwise.io/read/01gmw6f59td3grm71d8cpbjha4))
- whether there is a specific reason the community is hyper-focused on attaching data contracts only to this edge. One of the reasons this has happened is that a lot of the intention behind the conversation around data contracts is to shift responsibility and accountability for data to the application teams that produce it ([View Highlight](https://read.readwise.io/read/01gmw6gvxnqs932xb4e7r34d29))
- Use a git-based process for creating and managing the lifecycle of a data contract ([View Highlight](https://read.readwise.io/read/01gmw6sd8pec6nk9t5pzthy3m6))
- shift-left approaches to metadata ([View Highlight](https://read.readwise.io/read/01gmw6tvn4tw34c3krfd0xtcsh))
    - Note: The practice of filling metadata as close to the source as possible
- Similar techniques for batch datasets exist using the staging → publish table pattern ([View Highlight](https://read.readwise.io/read/01gmw6xw7y23t1h748s937fcth))
- Data producers should own the specification of the contract ([View Highlight](https://read.readwise.io/read/01gmw6zr5fmxcktrh0qskk7x9t))
- They should be accountable for all the characteristics of the dataset as laid out in the contract ([View Highlight](https://read.readwise.io/read/01gmw6zw4zeh591brgap39zg8d))
- Data Contracts are currently closely aligned to the Source Aligned Data Products as laid out by Zhamak. ([View Highlight](https://read.readwise.io/read/01gmw71fj4v0qh8n0rtcycwykp))
    - Note: [[Data mesh|Data mesh]]
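
To make the five-part definition in the first highlight concrete, here is a minimal Python sketch of what such a contract could look like as a plain data structure. It is not taken from the article or from any tool's API: the `DataContract` class, its field names, the `violations` helper, and the `orders_events` example are all hypothetical, chosen only to mirror the highlighted parts (schema, column rules such as `is_not_null` and `values_between`, ownership, classification tags, a freshness goal, and provisioning targets).

```python
from dataclasses import dataclass, field
from typing import Any, Callable

# A column rule takes one value and returns True when the value passes.
Rule = Callable[[Any], bool]


def is_not_null() -> Rule:
    return lambda value: value is not None


def values_between(low: float, high: float) -> Rule:
    # Nulls are left to is_not_null, so this rule lets them through.
    return lambda value: value is None or low <= value <= high


@dataclass
class DataContract:
    """Hypothetical shape of a contract; every field name is an assumption."""
    dataset: str                                            # dataset the contract is anchored to
    schema: dict[str, type]                                 # column name -> expected Python type
    rules: dict[str, list[Rule]] = field(default_factory=dict)  # column-oriented quality rules
    owner: str = ""                                         # semantic metadata: ownership
    tags: list[str] = field(default_factory=list)           # semantic metadata: classification tags
    freshness_deadline: str = ""                            # operational SLO, e.g. "07:00"
    provision_on: list[str] = field(default_factory=list)   # provisioning targets

    def violations(self, row: dict[str, Any]) -> list[str]:
        """Return a list of schema and quality problems found in one row."""
        problems = []
        for column, expected in self.schema.items():
            if column not in row:
                problems.append(f"{column}: missing column")
                continue
            value = row[column]
            if value is not None and not isinstance(value, expected):
                problems.append(f"{column}: expected {expected.__name__}")
                continue  # skip quality rules when the type is already wrong
            for rule in self.rules.get(column, []):
                if not rule(value):
                    problems.append(f"{column}: quality rule failed")
        return problems


# Illustrative contract for a made-up event stream.
contract = DataContract(
    dataset="orders_events",
    schema={"order_id": str, "amount": float},
    rules={
        "order_id": [is_not_null()],
        "amount": [is_not_null(), values_between(0, 10_000)],
    },
    owner="checkout-team",
    tags=["financial"],
    freshness_deadline="07:00",
    provision_on=["kafka", "bigquery"],
)

print(contract.violations({"order_id": "a1", "amount": 12.5}))
print(contract.violations({"order_id": None, "amount": -1.0}))
```

Keeping the contract as declarative data like this is one way it could live in a git repository owned by the producing team, which is the lifecycle the later highlights recommend; only the schema and rule checks are executable here, while the freshness and provisioning fields are carried as metadata.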