Apache Spark is written in Scala. Hence, many if not most data developers adopting Spark are also adopting Scala, while Python and R remain popular with data scientists. I think that Spark shows Scala at its best and largely hides the more difficult aspects of the language. This tutorial introduces the core features of Scala you need to be productive with Spark quickly. It's designed for developers and data scientists interested in using Scala for Spark. Through hands-on exercises with the Spark APIs, you'll learn the most important Scala syntax, idioms, and APIs for Spark development.
Prerequisites
Topics covered include:
Trainer: Chaoran Yu, Software Engineer, Fast Data, Lightbend Inc.
What are the essential components of a data platform? This tutorial explains how the various parts of the Hadoop, Spark, and big data ecosystem fit together in production to create a data platform supporting batch, interactive, and real-time analytical workloads.
By tracing the flow of data from source to output, we'll explore the options and considerations for components, including:
We'll also give advice on:
Instructors: John Akred, Stephen O'Sullivan, and Andrew Ray