Apache Spark Memory Tuning, Partitioning Day 2 Lecture

Module
39 mins

Description

In this lecture, Zach dives into troubleshooting slow Spark jobs, focusing on common bottlenecks that occur in big tech environments. He discusses how upstream data formats such as JSON and CSV can hinder performance, and shares insights on optimizing data processing and managing skewed data effectively.