Databricks and Advanced Spark Day2 Lab

Description

In this lab, Zach covers performance tuning and incremental data processing using a dataset of roughly 50 GB of tweets. He demonstrates why partitioning matters and explores how different repartitioning strategies affect performance. He encourages everyone to experiment with their own partitioning strategies and to use the Spark UI for performance analysis.
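To make the partitioning point concrete, here is a minimal sketch in plain Python (no Spark dependency) of how hash partitioning spreads rows across partitions. It is not the lab's code: the tweet keys, the partition count of 8, and the helper `partition_sizes` are all hypothetical, and CRC32 stands in for whatever hash function the engine actually uses. The idea it illustrates is that the choice of partitioning key decides how evenly the work is divided.

```python
import zlib
from collections import Counter


def partition_sizes(keys, num_partitions):
    """Count how many rows land in each partition under hash partitioning.

    Hypothetical helper: CRC32 is used only because it is deterministic
    across runs; a real engine uses its own hash function.
    """
    counts = Counter(
        zlib.crc32(key.encode("utf-8")) % num_partitions for key in keys
    )
    return [counts.get(p, 0) for p in range(num_partitions)]


# Assumed data: 1,000 tweets where a single author wrote 900 of them.
skewed_keys = ["celebrity"] * 900 + [f"user_{i}" for i in range(100)]
# Assumed alternative key: a unique tweet id per row.
unique_keys = [f"tweet_{i}" for i in range(1000)]

by_author = partition_sizes(skewed_keys, 8)
by_tweet = partition_sizes(unique_keys, 8)

# Partitioning by author puts at least 900 rows in one partition,
# while partitioning by tweet id spreads the rows far more evenly.
print("by author:  ", by_author, "largest:", max(by_author))
print("by tweet id:", by_tweet, "largest:", max(by_tweet))
```

The largest partition is a rough proxy for the slowest task in a stage, which is the kind of imbalance the Spark UI makes visible when comparing partitioning strategies.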