Advanced Spark (Day 2 Lecture)

Week 4: Batch Pipelines with Apache Spark
22 mins

Description

In this lecture, Zach covers three Spark APIs: Dataset, DataFrame, and Spark SQL. He explains how they differ and when to use each. He also discusses PySpark UDFs and when to use them, and gives recommendations for working with Spark. [Recorded on May 30th, 2024]