Notes on Spark (Definitive Guide)

Notes from Spark Definitive Guide - by Bill and Matei

Context Map

- Low Level APIs: RDD, Distributed Variables
- Structured APIs: Datasets, DataFrames, SQL
- Applications: Analytics, Streaming, Manage Spark

Low Level APIs

- Resilient Distributed Datasets
- Advanced RDDs
- Distributed Shared Variables

Structured APIs

- Basic Operations
- Data Types
- Aggregations
- Joins
- Data Sources
- Spark SQL
- Datasets

Applications

- Advanced Analytics and Machine Learning
- Streaming
- Manage Spark