Highlights

  • (Or in some cases, decades-old SWE practices have been re-discovered by data teams.)
  • Systems Tend Towards Blind Federation
  • Systems Tend Towards Layerinitis
  • Reduce the areas where business logic can be injected,
  • “Top Items You Don’t Carry” into the production app that all retailers run on. The data team moves their output into an AWS S3 bucket,
  • the engineering team in charge of consuming inventory feeds (different from the team in charge of the retailer application) migrates to a new inventory schema. They are not aware of a single step of the “Top Items You Don’t Carry” project, the dependencies the data team has built silently on top of their work, or the dependencies others have built on top of the data team’s work. They delete the initial table. NULLs flow into Looker, Salesforce, Hubspot, and the production retailer application. The data team has broken prod.
  • The impact and career trajectory of a data professional are limited by the surface area they can influence. The business intelligence analyst of 2012 was capped at Tableau + internal presentations. The data professional of today CAN put rows into Salesforce, trigger marketing emails, and build data products for consumption in production services and applications. This is awesome news!
  • The modern data stack makes it incredibly easy to productionize data outputs, regardless of whether they should be productionized or whether the team who built the inputs knows how the outputs are being consumed.
  • a company-wide alignment that data can be deliberately created, not painfully extracted.
  • use an orchestration tool (Dagster/Prefect/Airflow) as a single control plane.
  • Merge the dependencies between tools and create a holistic DAG that runs based on prior step successes as opposed to hoping the upstream tasks succeed by a certain time of day. Rebundle.
  • Whenever possible, do not write business logic in these tools. If you follow the one-to-one mapping of data team exports to downstream use cases, your Reverse ETL query can always be a plain select * from exp_table_for_single_use_case. Changes in business logic should be applied to that dbt model, not in the last mile.
  • People default to writing business logic in the tool they are most comfortable with.
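
The highlights on using a single control plane and keeping Reverse ETL free of business logic can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Dagster, Prefect, or Airflow work internally: the run_dag helper, the step names, the raw_inventory table, and its columns are all hypothetical, and an in-memory SQLite database stands in for the warehouse. The point it tries to show is that each step runs only after its upstream steps have succeeded, and that the last-mile sync is a bare select * against a single-purpose export table.

```python
import sqlite3
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dag(steps, deps):
    """Run steps in dependency order; skip a step unless all upstream steps succeeded."""
    status = {}
    for name in TopologicalSorter(deps).static_order():
        if any(status[up] != "success" for up in deps.get(name, ())):
            status[name] = "skipped"  # never run on stale or missing inputs
            continue
        try:
            steps[name]()
            status[name] = "success"
        except Exception:
            status[name] = "failed"
    return status


def run_pipeline(source_available=True):
    """Hypothetical three-step pipeline: ingest -> build_export -> reverse_etl."""
    conn = sqlite3.connect(":memory:")
    synced = []  # stands in for rows pushed to a downstream tool

    def ingest():
        if not source_available:
            # Simulates the upstream team removing the table we depend on.
            raise RuntimeError("source table missing")
        conn.execute("CREATE TABLE raw_inventory (item TEXT, carried INTEGER)")
        conn.executemany(
            "INSERT INTO raw_inventory VALUES (?, ?)",
            [("apples", 1), ("pears", 0), ("plums", 0)],
        )

    def build_export():
        # All business logic lives here, in the model that feeds one use case.
        conn.execute(
            "CREATE TABLE exp_table_for_single_use_case AS "
            "SELECT item FROM raw_inventory WHERE carried = 0"
        )

    def reverse_etl():
        # The last mile stays a plain pass-through.
        rows = conn.execute("SELECT * FROM exp_table_for_single_use_case")
        synced.extend(rows.fetchall())

    steps = {"ingest": ingest, "build_export": build_export, "reverse_etl": reverse_etl}
    deps = {"ingest": [], "build_export": ["ingest"], "reverse_etl": ["build_export"]}
    status = run_dag(steps, deps)
    conn.close()
    return status, synced
```

Calling run_pipeline() runs all three steps in order. Calling run_pipeline(source_available=False) marks ingest as failed and skips both downstream steps, so nothing is synced rather than empty or NULL rows reaching downstream tools. A real orchestrator adds scheduling, retries, and alerting on top of this idea.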