Metadata
- URL:: https://databricks.com/blog/2016/07/28/continuous-applications-evolving-streaming-in-apache-spark-2-0.html
- Author:: Matei Zaharia
- Publisher:: databricks.com
- Published Date:: 2016-07-28
- Tags::
Highlights
- The output of a Structured Streaming job is the same as running the batch job on a prefix of the input data.
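A minimal sketch of this prefix guarantee, assuming a hypothetical log schema and input/output paths: the streaming version is the same query as the batch version, swapping read/write for readStream/writeStream.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types._

val spark = SparkSession.builder.appName("prefix-demo").getOrCreate()
import spark.implicits._

// Hypothetical schema and paths, for illustration only.
val logSchema = new StructType()
  .add("time", TimestampType)
  .add("level", StringType)
  .add("message", StringType)

// Batch version: processes whatever input exists right now.
spark.read.schema(logSchema).json("/data/logs")
  .filter($"level" === "ERROR")
  .write.parquet("/data/errors")

// Streaming version of the same query: its output at any moment equals
// the batch job run on the prefix of input that has arrived so far.
spark.readStream.schema(logSchema).json("/data/logs")
  .filter($"level" === "ERROR")
  .writeStream
  .format("parquet")
  .option("checkpointLocation", "/data/checkpoints") // enables recovery on restart
  .start("/data/errors")
```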
- The engine handles ensuring each record appears in the output exactly once, and recovering the job’s state if you restart it. Finally, to serve this data interactively instead of writing it to Parquet, we could just change writeStream to use the (currently alpha) in-memory sink and connect a JDBC client to Spark to query it.
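A sketch of that swap, assuming a hypothetical schema, input path, and table name: replacing the Parquet sink with the in-memory sink keeps the results in a named in-memory table that can then be queried with SQL (for example, from a JDBC client via Spark's Thrift server).

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types._

val spark = SparkSession.builder.appName("memory-sink-demo").getOrCreate()
import spark.implicits._

// Hypothetical schema and input path, for illustration only.
val logSchema = new StructType()
  .add("time", TimestampType)
  .add("level", StringType)

// A streaming aggregation: running counts of records per log level.
val counts = spark.readStream
  .schema(logSchema)       // file sources need an explicit schema
  .json("/data/logs")
  .groupBy($"level")
  .count()

// The in-memory sink keeps the result in an in-memory table named by
// queryName, continuously updated as new data arrives.
val query = counts.writeStream
  .format("memory")
  .queryName("level_counts")
  .outputMode("complete")  // aggregations require complete/update mode
  .start()

// Interactive query against the continuously updated table; a JDBC
// client connected to Spark could issue the same SQL.
spark.sql("SELECT * FROM level_counts ORDER BY count DESC").show()
```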