Notes on Spark (Definitive Guide)

Notes from Spark: The Definitive Guide, by Bill Chambers and Matei Zaharia

Context

Low Level APIs

Resilient Distributed Datasets

Advanced RDDs

Distributed Shared Variables

Structured APIs

Basic Operations

Data Types

Aggregations

Joins

Data Sources

Spark SQL

Datasets

Applications

Advanced Analytics and Machine Learning

Streaming

Manage Spark