Scaling our data stack with Kafka and real-time stream processing

![rw-book-cover](https://miro.medium.com/max/1200/0*1wiQHz6h7-R35VzT)

## Metadata
- Author: [[whatnot-engineering|Whatnot Engineering]]
- Full Title:: Scaling Our Data Stack With Kafka and Real-Time Stream Processing
- Category:: #🗞️Articles
- Document Tags:: [[Streaming architecture for real time analytics|Streaming Architecture For Real Time Analytics]]
- URL:: https://medium.com/whatnot-engineering/scaling-our-data-stack-with-kafka-and-real-time-stream-processing-56554dcbb0fc
- Finished date:: [[2023-02-11]]

## Highlights
> the main requirement for this system was to decouple how features store their data from how they report data ([View Highlight](https://read.readwise.io/read/01gs0qnrf40qjv64zxvj5q8jea))

##### Testing event producers
> we’d run a local Kafka cluster in CI/CD using Docker. ([View Highlight](https://read.readwise.io/read/01gs0qr62xmb6agczfjdrw2kw1))

##### Schemas
> There’s a spectrum here — one extreme chooses a single schema for all event types and the other a bespoke schema for each individual event type.
>
> We landed somewhere in the middle — we looked at our use cases and came up with a few event schemas that covered the majority of the event types ([View Highlight](https://read.readwise.io/read/01gs0qta6sbbwgdryqv9amqkef))

## New highlights added [[2023-02-11]]
> ![](https://miro.medium.com/max/1400/0*1wiQHz6h7-R35VzT) ([View Highlight](https://read.readwise.io/read/01gs0wd4rzw0mymjxk292f9gt5))
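
Note to self on the schema highlight: a minimal sketch of what "a few event schemas covering most event types" might look like. The article does not show its schemas, so every name here (`ActionEvent`, `StateChangeEvent`, `serialize`, the field names, the example event `bid_placed`) is a hypothetical chosen for illustration, and JSON stands in for whatever wire format the team actually uses.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any


# Hypothetical shared schema: covers a family of "user did X" events,
# instead of one bespoke schema per event type.
@dataclass
class ActionEvent:
    event_name: str            # e.g. "bid_placed" (made-up example)
    actor_id: str
    timestamp_ms: int
    properties: dict[str, Any] = field(default_factory=dict)


# Hypothetical shared schema: covers "entity moved from state A to B" events.
@dataclass
class StateChangeEvent:
    entity_type: str
    entity_id: str
    old_state: str
    new_state: str
    timestamp_ms: int


def serialize(event: Any) -> bytes:
    """Encode an event as JSON, tagging it with its schema name so a
    consumer can pick the right parser without knowing the producer."""
    payload = {"schema": type(event).__name__, **asdict(event)}
    return json.dumps(payload, sort_keys=True).encode("utf-8")


if __name__ == "__main__":
    bid = ActionEvent("bid_placed", "user-1", 1676073600000, {"amount_cents": 500})
    print(serialize(bid))
```

The trade-off this illustrates: a free-form `properties` map keeps the number of schemas small, at the cost of weaker validation than a bespoke schema per event type would give.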