
Metadata

Highlights

One of the small ironies of today’s SQL chatbots is that they help people do exactly a thing that data teams try to discourage. As analysts, we ask our colleagues to help us understand how our work will be used. They shouldn’t request some piece of data; they should instead tell us what they’re trying to achieve. And if they don’t tell us what they want to use some data pull for, the less tactful among us pepper them with demands to explain why they need it. (View Highlight)

We don’t recommend that analysts ask these questions just to make their jobs easier; we also recommend it because we can’t give a useful answer without it. (View Highlight)

But encoding “business context” into some YAML file sounds ridiculous, and describing every detail to a chatbot anytime you want to answer a question sounds exhausting. For this reason (among others), I’m generally skeptical that these bots will be that revolutionary. (View Highlight)

Then, just as they’d do for a junior analyst, the producer would send back feedback: This number looks off; this explanation doesn’t quite make sense; can you dig into this unexpected anomaly? The bot creates another draft, the exec gives more direction, and so on. (View Highlight)

Second, the back-and-forth could also help people ask better questions. We often don’t know what we want until we start looking for it. Just as it’s almost impossible to write a perfect product spec without testing an imperfect prototype first, it’s very hard to ask exactly the right question before seeing the answers to a few of the wrong questions. A smol analyst would encourage this sort of iterative exploration, which is good for both user and agent. (View Highlight)

There is, however, at least one very big reason why a smol analyst wouldn’t be as useful as a smol developer. In software, how code works is in some sense irrelevant; all that matters is that it works. I can test my ad blocking Chrome extension without knowing a line of JavaScript, or that JavaScript exists at all. If the tool does what I want it to, it works, no matter how “bad” its codebase. In data, black boxes don’t work. Computational process matters. You can’t validate a dashboard by testing that it produces a reasonable-looking chart; you have to make sure that the logic behind its calculations is correct. SQL is declarative, but used for imperative ends: we need to know how it works, step by step. Software is the opposite. (View Highlight)
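
To make the point about black boxes concrete, here is a minimal sketch in Python using the standard-library `sqlite3` module. It is not from the original essay: the `orders` and `shipments` tables, their rows, and both function names are hypothetical. It shows two queries that each return a plausible-looking revenue total, where only reading the join logic reveals that one of them double-counts.

```python
import sqlite3


def build_db() -> sqlite3.Connection:
    """Create a tiny in-memory database (hypothetical schema and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL);
        CREATE TABLE shipments (shipment_id INTEGER PRIMARY KEY, order_id INTEGER);
        INSERT INTO orders VALUES (1, 100.0), (2, 50.0), (3, 75.0);
        -- Order 1 shipped in two packages, so it has two shipment rows.
        INSERT INTO shipments VALUES (10, 1), (11, 1), (12, 2), (13, 3);
        """
    )
    return conn


def naive_shipped_revenue(conn: sqlite3.Connection) -> float:
    """Join orders to shipments, then sum. The join repeats order 1's
    amount once per shipment row, so the total is inflated."""
    row = conn.execute(
        """
        SELECT SUM(o.amount)
        FROM orders AS o
        JOIN shipments AS s ON s.order_id = o.order_id
        """
    ).fetchone()
    return row[0]


def correct_shipped_revenue(conn: sqlite3.Connection) -> float:
    """Count each order once if it has at least one shipment."""
    row = conn.execute(
        """
        SELECT SUM(o.amount)
        FROM orders AS o
        WHERE EXISTS (
            SELECT 1 FROM shipments AS s WHERE s.order_id = o.order_id
        )
        """
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = build_db()
    print("naive:  ", naive_shipped_revenue(conn))
    print("correct:", correct_shipped_revenue(conn))
```

Neither number would look wrong on a chart. The discrepancy only surfaces when someone checks how the join multiplies rows, which is the kind of step-by-step inspection the passage argues data work requires.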