Spring Data Engineering Boot Camp Starting May 26th, 2025

This boot camp covers the essentials of Spark, Databricks, Delta, Iceberg, and Delta Live Tables that you need to become an amazing data engineer!

Taught by Zach Wilson

Founder at DataExpert.io

What you'll learn

✓ 10 Industry Expert Speaker Sessions
✓ Lifetime access to content
✓ One year access to Databricks, AWS
✓ Get real cloud experience
✓ Weekly Q&A with Zach
✓ Join referral network

Learn directly from the experts

Zach Wilson

Founder at DataExpert.io

I have led teams of data engineers and software engineers at Airbnb, Facebook, and Netflix. My next goal is to upskill as many data knowledge workers as I can!

Course syllabus

235 lessons • 149+ hours of content • 29 assignments

Data Lakes with Delta Table

1. Delta Table Day 1 Lecture
2. Delta Table Day 1 Lab
3. Delta Lake Bonus
4. Delta Table Day 2 Lecture
5. Delta Table Day 2 Lab
6. Test Again

Databricks and Advanced Spark

1. Apache Spark Shuffle Joins Day 1 Lecture
2. Apache Spark Shuffle Joins Day 1 Lab
3. Apache Spark Memory Tuning, Partitioning Day 2 Lecture
4. Apache Spark Memory Tuning, Partitioning Day 2 Lab
5. Apache Spark Unit Testing Day 3 Lecture
6. Apache Spark Unit Testing Day 3 Lab
7. Setting Up CI/CD and Unit Testing in Databricks for Reliable Data Pipelines

Structured Streaming Kafka to Delta Live Table

1. Apache Spark programming with Databricks Day 1 Lecture
2. Apache Spark programming with Databricks Day 1 Lab
3. Apache Spark programming with Databricks Day 2 Lecture
4. Apache Spark programming with Databricks Day 2 Lab

Managing Unstructured Data

1. Managing Unstructured Data - Day 1 Lecture
2. Managing Unstructured Data - Day 1 Lab
3. Managing Unstructured Data - Day 2 Lecture
4. Managing Unstructured Data - Day 2 Lab

Databricks and Spark Fundamentals (preread)

1. Spark Basics

Q&A with Zach May 2025

1. Q&A with Zach Week 1
2. Q&A with Zach Week 2
3. Q&A with Zach Week 3
4. Q&A with Zach Week 4
5. Q&A with Zach Week 5
6. Navigating Data Engineering - Tips, Tools, and Career Insights

Bonus - Azure

1. Azure - Week 1
2. Azure - Week 2
3. Azure - Week 3
4. Azure - Week 4
5. Azure - Week 5

Tech Talks May 2025

1. Tech Talk - Alex
2. Tech Talk - Joe
3. Tech Talk - Shubham
4. Tech Talk - Vaishali

TA Office Hours

1. TA Office Hour 1
2. TA Office Hour 2
3. TA Office Hour 3
4. TA Office Hour 4
5. TA Office Hour 5
6. TA Office Hour 6
7. TA Office Hour 7
8. TA Office Hour 8

Capstone DE Bootcamp May 2025

1. Capstone May 2025

Also included

Included: Data Engineer Interview Skills
16 lessons

Interview Skills

1. The SQL Interview
2. The Data Modeling Interview
3. The Data Architecture Interview
4. The Behavioral Interview
5. The Data Structures and Algorithms Interview
6. Data Structures and Algorithms Interview
7. Product Sense Interview
8. Behavioral Interview
9. Data Modeling Interview
10. Live SQL Training January 25th, Window Functions and Common Table Expressions
11. Live SQL Training January 15th
12. Scala Dataset vs DataFrame API [Dec 15, 2023]

AI and LLM

1. LLM-Driven Data Engineering Day 1 Lab
2. LLM-Driven Data Engineering Day 1 Lecture
3. LLM-Driven Data Engineering Day 2 Lab
4. LLM-Driven Data Engineering Day 2 Lecture
Included: Analytics Engineering Boot Camp starting October 2024
49 lessons

Bootcamp Kickoff

1. Analytics Engineering Bootcamp Kickoff
2. Capstone Requirements
3. Understanding Capstone Requirements

Pipeline Spec Building + Airflow Fundamentals

1. Write a comprehensive pipeline spec document
2. Write a comprehensive pipeline spec document Lab
3. Orchestration and Airflow Fundamentals Lecture
4. Orchestration and Airflow Fundamentals Lab

Change Data Capture (CDC) and Analytical Patterns

1. Change Data Capture (CDC) and Snowflake basics Lecture
2. Change Data Capture (CDC) and Snowflake basics Lab
3. How to use Snowpark and Snowflake optimization Lecture
4. How to use Snowpark and Snowflake optimization Lab
5. Growth Accounting Models Lecture
6. Growth Accounting Models Lab
7. Grouping sets and Window functions Lecture
8. Grouping sets and Window functions Lab

dbt basics

1. dbt basics: How dbt works: sources, models, packages Lecture
2. dbt basics: How dbt works: sources, models, packages Lab
3. dbt basics: creating a full dbt project Lecture
4. dbt basics: creating a full dbt project Lab

Advanced dbt

1. Advanced dbt: Understanding incremental models Lecture
2. Advanced dbt: Understanding incremental models Lab
3. Advanced pipelines: Blue-green and WAP with Airflow Lecture
4. Advanced pipelines: Blue-green and WAP with Airflow Lab

Advanced Snowflake

1. Advanced Snowflake: Deploying the Capstone DAG Lecture
2. Advanced Snowflake: Deploying the Capstone DAG Lab
3. Advanced Snowflake: Data Modeling and UDFs Lecture
4. Advanced Snowflake: Data Modeling and UDFs Lab

Passing Analytics Engineer Interviews

1. SQL

Bonus

1. Complex Data Types Lecture
2. Complex Data Types Lab
3. Slowly Changing Dimensions Lecture
4. Slowly Changing Dimensions Lab
5. Graph Data Modeling Lecture
6. Graph Data Modeling Lab
7. Fact Data Modeling Lecture
8. Fact Data Modeling Lab
9. Data modeling with Date lists Lecture
10. Data modeling with Date lists Lab

Tech Talk

1. Tech talk with Andrew Madson
2. Tech talk with Gourav Khanijoe
3. Tech talk with Carly Taylor
4. Tech talk with Jess Ramos
5. Tech talk with Madison Schott
6. Tech talk with Vaishali Macwan
7. Tech talk with Tim Chan

TA Office Hours

1. Office Hour 1
2. Office Hour 2
3. Office Hour 3
4. Office Hour 4
Included: 2025 New Year Data Engineering Boot Camp starting January 6th
72 lessons

January 2025 Bootcamp Kickoff

1. January 2025 Bootcamp Kickoff

Airflow + Trino

1. Orchestration and Airflow Fundamentals Day 1 Lecture
2. Orchestration and Airflow Fundamentals Day 1 Lab
3. Hard Orchestration Lessons Day 2 Lecture
4. Hard Orchestration Lessons Day 2 Lab
5. Cumulative DAGs in Production Day 3 Lecture
6. Cumulative DAGs in Production Day 3 Lab

Snowflake + dbt Basics

1. Snowflake Basics Day 1 Lecture
2. Snowflake Basics Day 1 Lab
3. dbt Basics Day 2 Lecture
4. dbt Basics Day 2 Lab
5. dbt Basics Day 3 Lecture
6. dbt Basics Day 3 Lab

Databricks Basics

1. Databricks Platform Overview Day 1 Lecture
2. Databricks Platform Overview Day 1 Lab
3. Introduction to Spark Day 2 Lecture
4. Introduction to Spark Day 2 Lab
5. Apache Spark Core Day 3 Lecture
6. Apache Spark Core Day 3 Lab

Advanced Spark on Databricks

1. Apache Spark Shuffle Joins Day 1 Lecture
2. Apache Spark Shuffle Joins Day 1 Lab
3. Apache Spark Memory Tuning, Partitioning Day 2 Lecture
4. Apache Spark Memory Tuning, Partitioning Day 2 Lab
5. Apache Spark Unit Testing Day 3 Lecture
6. Apache Spark Unit Testing Day 3 Lab

Snowflake + Advanced dbt

1. Snowflake Lecture
2. Snowflake Lab
3. Advanced dbt Day 2 Lecture
4. Advanced dbt Day 2 Lab
5. Advanced dbt Day 3 Lecture
6. Advanced dbt Day 3 Lab

Analytical Patterns and Advanced SQL

1. Applying Analytical Patterns Day 1 Lecture
2. Applying Analytical Patterns Day 1 Lab
3. Advanced SQL Patterns Day 2 Lecture
4. Advanced SQL Patterns Day 2 Lab
5. Analytical Patterns Recognizing Business Value Day 3 Lecture
6. Analytical Patterns Recognizing Business Value Day 3 Lab

Real Time Data (Spark and Kafka Streaming)

1. Advanced Spark Optimization Techniques Day 1 Lecture
2. Advanced Spark Optimization Techniques Day 1 Lab
3. Spark Structured Streaming Day 2 Lecture
4. Spark Structured Streaming Day 2 Lab
5. Deep Dive On Workflows Day 3 Lecture
6. Deep Dive On Workflows Day 3 Lab

Bonus - LLMs

1. RAG and LLMs Day 1 Lecture
2. RAG and LLMs Day 1 Lab
3. RAG and LLMs Day 2 Lecture
4. RAG and LLMs Day 2 Lab
5. LLMs Day 3 Part 1
6. LLMs Day 3 Part 2

Career Development Sessions

1. Career Development - LinkedIn Optimization
2. Career Development - Resume Review
3. Career Development - Interview Help
4. Career Development - Data Modeling Interview
5. Career Development - Strategic Networking

Q&A with Zach Wilson

1. Q&A Week 1
2. Q&A Week 2
3. Q&A Week 3
4. Q&A Week 4

Guest Speaker Sessions

1. Jason Reid (cofounder of Tabular)
2. Shachar Meir
3. Brian Pulliam
4. Sundas Khalid
5. YZ
6. Prasad Rao
7. Joe Reis

TA Office Hours

1. TA Office Hour 1
2. TA Office Hour 2
3. TA Office Hour 3
4. TA Office Hour 4
5. TA Office Hour 5
6. TA Office Hour 6

Capstone Showcase Jan 2025

1. Capstone Showcase Jan 2025
Included: Boot Camp: Community Edition
41 lessons

Bootcamp Orientation

1. Bootcamp Kickoff
2. Boot Camp Database Setup

Dimensional Data Modeling

1. Dimensional Data Modeling Complex Data Type and Cumulation Day 1 Lecture
2. Dimensional Data Modeling Complex Data Type and Cumulation Day 1 Lab
3. Dimensional Data Modeling: Building Slowly Changing Dimensions Day 2 Lecture
4. Dimensional Data Modeling: Building Slowly Changing Dimensions Day 2 Lab
5. Dimensional Data Modeling: Graph Data Modeling Day 3 Lecture
6. Dimensional Data Modeling: Graph Data Modeling Day 3 Lab

Fact Data Modeling

1. Fact Data Modeling: Core Concepts, Deduplication Day 1 Lecture
2. Fact Data Modeling: Practical Insights into Data Modeling Day 1 Lab
3. Fact Data Modeling: Core Elements in Data Modeling Day 2 Lecture
4. Fact Data Modeling: Compact Tables for Efficient Data Representation Day 2 Lab
5. Fact Data Modeling: Minimizing Shuffle and Reducing Facts Day 3 Lecture
6. Fact Data Modeling: Practical Guide to Formatting and Aggregating Data Day 3 Lab

Apache Spark Fundamentals

1. Apache Spark: Architecture, Optimization, and Best Practices Day 1 Lecture
2. Apache Spark: Hands-On for Broadcast and Hash Joins Day 1 Lab
3. Apache Spark: Managing Spark Jobs and Notebooks Day 2 Lecture
4. Apache Spark: User-Defined Functions and Broadcast Join Day 2 Lab
5. Unit Testing Spark Jobs: Importance, Challenges, and Leadership Perspectives Lecture
6. Unit Testing Spark Jobs: Mastering Spark and PySpark Testing Lab

Applying Analytical Patterns

1. Applying Analytical Patterns: Exploring SQL, Scaling Projects and Aggregation Analysis Day 1 Lecture
2. Applying Analytical Patterns: Mastering Growth Accounting and Retention Analysis Day 1 Lab
3. Applying Analytical Patterns: Recursive CTEs and Window Functions Day 2 Lecture
4. Applying Analytical Patterns: Aggregations and Cardinality Reduction Day 2 Lab

Real-time pipelines with Flink and Kafka

1. Flink Lab Setup
2. Streaming Pipelines: Mastering Streaming and Real-time Pipelines Day 1 Lecture
3. Streaming Pipelines: Setting up Streaming Pipelines Day 1 Lab
4. Streaming Pipelines: Exploring Data Collection and Processing Day 2 Lecture
5. Streaming Pipelines: Kafka, Postgres, Spark Integrations and Parallelism Day 2 Lab

Data Visualization and Impact

1. Data Visualization and Impact: Mastering Data Engineering Day 1 Lecture
2. Data Visualization and Impact: Hands-On with the CSV files Day 1 Lab
3. Data Visualization and Impact: Insights and Best Practices Day 2 Lecture
4. Data Visualization and Impact: Exploring Data Visualization and Aggregation Techniques Day 2 Lab

Data Pipeline Maintenance

1. Data Pipeline Maintenance: Navigating the Complexities of Data Engineering Day 1 Lecture
2. Data Pipeline Maintenance: Strategies for Maintenance and Dock Building Day 2 Lecture

KPIs and Experimentation

1. KPIs and Experimentation: Decoding Business Success: Metrics, Growth Strategies and Collaborative Approaches Day 1 Lecture
2. KPIs and Experimentation: Setting up and Analysing Experiments Day 1 Lab
3. KPIs and Experimentation: Leading and Lagging Metrics Day 2 Lecture

Data Quality Patterns

1. Data Quality Patterns: MIDAS Process from Airbnb Day 1 Lecture
2. Data Quality Patterns: Spec-Building Document Day 1 Lab
3. Data Quality Patterns: WAP Patterns Day 2 Lecture
Included: Building Pipelines with Iceberg and Airflow
11 lessons

Data Modeling with Iceberg and Trino

1. The History of Data Lakes Lecture
2. Iceberg Partitioning and Metadata Exploration Lab
3. Mastering Data Lake Architectures Lecture
4. Apache Iceberg Day 2 Lab
5. Apache Iceberg Day 3 Lecture
6. Apache Iceberg Day 3 Lab

Airflow Pipelines with Iceberg

1. Setting Up Airflow for Week 1 for Mac
2. Orchestration and Airflow Fundamentals Lecture
3. Orchestration and Airflow Fundamentals Lab
4. Apache Iceberg Data Contracts Lecture
5. Apache Iceberg Data Contracts Lab

Program Resources

See what you'll be working with

Program overview

Weekly schedule

Before you join

Prerequisites

One year of Python and SQL experience
Exposure to Spark and Docker

Platform Access Included

AWS
Databricks