Apache Spark Memory Tuning, Partitioning Day 2 Lecture

Description

In this lecture, Zach dives into how to troubleshoot slow Spark jobs, focusing on common bottlenecks that occur in big tech environments. He discusses how upstream data formats like JSON and CSV can hinder performance, and shares insights on optimizing data processing and managing skewed data effectively.