Dr. Mario's 2nd 🧠

❯

❯

❯

Surrogate Keys in Dbt Integers or Hashes

Surrogate Keys in Dbt Integers or Hashes

2 min read

Metadata

Author: Dave Connors
Full Title:: Surrogate Keys in Dbt: Integers or Hashes?
Category:: 🗞️Articles
Document Tags:: Sql, Surrogate Keys,
URL:: https://docs.getdbt.com/blog/managing-surrogate-keys
Finished date:: 2023-08-14

Highlights

For my money, the simplicity of using hashed keys far outweighs the potential benefits of having MIISKs in your data model. Building with dbt works best when all parts of your project are idempotent, and hashed keys require close to zero maintenance. The cost of time spent rebuilding your Surrogate Keys in your data models if you can’t recreate them with a simple dbt run usually offsets any modest performance and storage gains you might be able to achieve with MIISKs. (View Highlight)

The analytical warehouses we use now no longer have the same constraints that traditional warehouses had — joins on strings aren’t notably less performant than those on integers, and storing slightly larger values in the surrogate key column is peanuts given the relative cost of storage on these platforms. This strategy also removes the need for tight coupling of transformations to propagate the surrogate key values across our project — anywhere the inputs for the surrogate keys are present, the hashing function produces the same keys, so we can take advantage of parallel processing in our warehouse and avoid the bottlenecks we had before.

Webmentions

Loading webmentions...

Unable to load webmentions. Please try again later.

❤️ Likes

🔄 Reposts

💬 Replies

🔗 Mentions

No webmentions found for this post yet. Be the first to mention it!

Graph View

Metadata
Highlights

Created with Quartz v4.5.1 © 2025