
Metadata

Highlights

Any data is useless without an accurate description. And forget about just writing “This field shows the company’s revenue over a period of time”: if you really want to give your business users access to an LLM and then forget about it, you will have to build a truly sophisticated data governance landscape that describes not only the fields themselves but also their possible values and the business processes behind the data. Your glossary engine (whether it is an internal wiki, a folder of Google Docs, or a bunch of Markdown files) should also have a clear, LLM-friendly structure and be accessible to the LLM, which is now possible with the new MCP mechanics. Luckily, because we at ClickHouse are obsessed with data, we already had a pretty good internal data wiki, built with mdBook. This wiki covers every notable field in all our data marts, with a description of the business processes behind it. The one remaining step was to expose it to the LLM via the GitHub MCP server.
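To make "LLM-friendly structure" concrete, here is a minimal sketch of one way a glossary entry could be modeled and rendered as a Markdown page. It is not the actual ClickHouse wiki layout: the `FieldDoc` class, the `render_markdown` helper, and the example table, field, and values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FieldDoc:
    """Hypothetical glossary entry for a single data mart field."""
    table: str
    name: str
    description: str
    business_process: str
    # Maps each possible value to its business meaning (may be empty).
    allowed_values: dict[str, str] = field(default_factory=dict)


def render_markdown(doc: FieldDoc) -> str:
    """Render one entry as a Markdown section with a fixed, predictable layout."""
    lines = [
        f"## {doc.table}.{doc.name}",
        "",
        f"**Description:** {doc.description}",
        "",
        f"**Business process:** {doc.business_process}",
    ]
    if doc.allowed_values:
        lines += ["", "**Possible values:**", ""]
        lines += [
            f"- `{value}`: {meaning}"
            for value, meaning in doc.allowed_values.items()
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Illustrative data only; not a real table or field.
    example = FieldDoc(
        table="finance.invoices",
        name="status",
        description="Current state of the invoice.",
        business_process="Set by the billing system when an invoice is issued or settled.",
        allowed_values={"open": "Issued, not yet paid", "paid": "Fully settled"},
    )
    print(render_markdown(example))
```

Keeping every page in the same shape (one heading per field, then description, business process, and possible values) means the model can rely on the layout instead of guessing where each kind of information lives.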

Over the last few months, a number of open-source UI products for LLMs have made significant progress. For our DWH at ClickHouse, I chose the LibreChat project, an open-source UI that works with any popular LLM. It supports MCP servers, visual artifacts, SSO, chat sharing, forks, and many other features. It also lets you switch LLM providers easily if needed.