Metadata
- URL: https://blog.devgenius.io/why-is-snowflake-so-expensive-92b67203945
- Published Date: 2022-08-17
- Author: Stas Sajin
Highlights
- Snowflake has no incentive to push a code change that makes things 20% faster because that can correspond to 10–20% drop in short-term revenue. In a typical Innovator’s Dilemma, Snowflake prioritizes other things that generate an ever larger menu of compute options, like Snowpark and data apps built on Streamlit, that will bleed your organization dry.
- Note: Snowflake, Cloud costs
- For example, what person at Google thought it was a good idea to let BigQuery perform a full table scan when running this query?
- The choice to scan all the data might seem dumb, but it’s not when you realize that they charge per GB of data scanned, and it’s in their interest to leave optimization gremlins in.
- For folks with technical or engineering backgrounds, this is a red flag. Whether your query runs on a machine with SSD or hard drive, low or high RAM, slow or fast CPU, high or low network bandwidth makes a measurable impact on performance. Snowflake is very secretive about their hardware.
- Micro-partition pruning is disabled when the predicate comes from a sub-query, even if the sub-query returns a constant.
- If you use dbt, you should really be mindful of this issue because this query pattern is very common and represents north of 50% of query costs.
- The query workload manager in Snowflake is inefficient.
- you’ll generally find an interesting pattern where 5% of your internal users drive 95% of your costs.
- today I would probably pay $25 and ask users to take the online SQL classes from DataCamp for a few days and just learn the craft efficiently and more scalably.
- If your yearly costs are above the $500k mark, it’s useful to consider the benefits of an off-ramp.
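The "5% of users drive 95% of costs" highlight is easy to check against your own numbers. Below is a minimal sketch, assuming you already have per-user spend from whatever billing or query-history export is available to you. The function name `top_user_cost_share`, the user names, and the cost figures are all hypothetical and do not come from the article.

```python
import math


def top_user_cost_share(cost_by_user: dict[str, float], top_fraction: float = 0.05) -> float:
    """Return the share of total cost driven by the top `top_fraction` of users."""
    if not cost_by_user:
        raise ValueError("cost_by_user must not be empty")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")

    # Rank users from most to least expensive.
    costs = sorted(cost_by_user.values(), reverse=True)
    total = sum(costs)
    if total == 0:
        return 0.0

    # Round up so that the "top" group always contains at least one user.
    n_top = max(1, math.ceil(len(costs) * top_fraction))
    return sum(costs[:n_top]) / total


# Hypothetical data: 19 light users and one heavy user (made-up figures).
example_costs = {f"user_{i}": 10.0 for i in range(19)}
example_costs["heavy_user"] = 3610.0

share = top_user_cost_share(example_costs)
print(f"Top 5% of users drive {share:.0%} of cost")
```

If the share comes out anywhere near the article's figure, the handful of users in that top group is where query reviews or SQL training are most likely to pay off.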
The article was later critiqued by Tristan Handy and Felipe Hoffa: https://twitter.com/jthandy/status/1564359979926265861?s=20&t=S5gKxiyLRT_39xVEKigPDg