As he sees it, data quality and usability issues arise from the conventional best practice of “dumping” data in the warehouse to be manipulated and transformed afterward to fit the needs of the business.
No data enters the warehouse without a business question, process, or problem driving it. Everything is purpose-built for the task to be done.
Note: What about data you do not yet know you will need?
What is the incentive for a data scientist who uses an ELT process and finishes their model to go back and document their work?
Note: We must create those incentives somehow.
Mapping should be handled either upstream of the warehouse through a streaming database or in the warehouse itself. This layer is where a BI engineer matches what is coming up from engineering to what a data consumer needs, a step that can be automated to produce Kimball data marts.
Note: So you are doing mostly the same thing you were doing before…
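
To make the mapping layer more concrete, here is a minimal sketch of what matching engineering output to consumer-facing tables might look like. It is only an illustration under stated assumptions: the event fields (`evt`, `uid`, `uname`, `amt_cents`), the mapping dictionaries, and the helpers `apply_mapping` and `build_mart` are all hypothetical names, not taken from the article or from any particular tool. The idea is that the mapping is declared as data, so producing a Kimball-style dimension and fact table from it can be automated.

```python
# Hypothetical raw events as emitted by engineering (field names are assumptions).
RAW_EVENTS = [
    {"evt": "order_placed", "uid": 1, "uname": "Ada", "amt_cents": 1250},
    {"evt": "order_placed", "uid": 2, "uname": "Lin", "amt_cents": 800},
    {"evt": "order_placed", "uid": 1, "uname": "Ada", "amt_cents": 300},
]

# Declarative mappings: source field -> consumer-facing column name.
CUSTOMER_DIM_MAP = {"uid": "customer_id", "uname": "customer_name"}
ORDER_FACT_MAP = {"uid": "customer_id", "amt_cents": "amount_cents"}


def apply_mapping(event, mapping):
    """Rename and select fields of one event according to a mapping."""
    return {target: event[source] for source, target in mapping.items()}


def build_mart(events):
    """Build a customer dimension (one row per customer) and an order fact table."""
    dimension = {}
    facts = []
    for event in events:
        dim_row = apply_mapping(event, CUSTOMER_DIM_MAP)
        # Keyed by customer_id so repeated customers collapse into one row.
        dimension[dim_row["customer_id"]] = dim_row
        facts.append(apply_mapping(event, ORDER_FACT_MAP))
    return list(dimension.values()), facts


customer_dim, order_facts = build_mart(RAW_EVENTS)
```

Because the mappings are plain dictionaries, adding a new consumer-facing column means changing the declaration rather than the transformation code, which is roughly what "automated" would have to mean here.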