Blog posts, course materials, forums, books, etc. for studying data engineering technologies.
Continuously updated.
Java
- Algorithms, Part I and II, Robert Sedgewick, Princeton University, Coursera
Scala
- Functional Programming Principles in Scala, Martin Odersky, EPFL, Coursera
- The Neophyte’s Guide to Scala, Daniel Westheide
- Learning Scala part nine – Uniform Access, Joel Abrahamsson
- Functional Programming in Scala, Paul Chiusano and Runar Bjarnason
- https://github.com/fpinscala/fpinscala
Spark
- Simplifying Big Data Applications with Apache Spark 2.0, Matei Zaharia
- A Deeper Understanding of Spark Internals, Aaron Davidson
- Structuring Apache Spark 2.0: SQL, DataFrames, Datasets And Streaming, Michael Armbrust
- Online Learning with Structured Streaming, Ram Sriharsha and Vlad Feinberg
- Apache Spark 2.0: A Deep Dive Into Structured Streaming, Tathagata Das
- ETL Is Dead, Long Live Streams: real-time streams with Apache Kafka, Neha Narkhede