Dimensional Data Modeling - Complex Data Types Arrays and Structs with Iceberg (Day 1 Lab)

Week 1: Dimensional Data Modeling in Iceberg
55 mins
SQL, Data Modeling, Apache Iceberg

Description

In this lab, we explore Slowly Changing Dimensions (SCDs) and design cumulative tables using the `nba_players` dataset. This is the first stress test of our web-based interface, which is linked to a Presto/Trino database running in Starburst. Trino (formerly PrestoSQL) and Presto are both query engines that let users query and analyze data in a data lake using SQL, while Starburst offers a distribution built on Trino. During this session, attendees executed 1742 queries in one hour, and Starburst handled the workload without any issues. The lab demonstrates how to design cumulative tables and also showcases running analytics directly in the data lake. [Recorded Nov 7, 2023]