Winter 2026 Data Engineering Boot Camp

Master Databricks on AWS

Taught by Eumar Assis & Zach Wilson

Solutions Architect Manager at Databricks & Founder at DataExpert.io

What you'll learn

✓ Lifetime access to content
✓ 5 industry-leading guest speakers
✓ Join a vibrant learning community
✓ Join our referral network for jobs
✓ Build a standout capstone project
✓ Best for engineers and analysts looking to switch to data engineering, or data engineers looking to upskill

Learn directly from the experts

Eumar Assis

Solutions Architect Manager at Databricks

Zach Wilson

Founder at DataExpert.io

I have led teams of data engineers and software engineers at Airbnb, Facebook, and Netflix. My next goal is to upskill as many data knowledge workers as I can!

Course syllabus

284 lessons • 185+ hours of content • 34 assignments

Kickoff

1. Databricks Boot Camp Kickoff
2. Capstone Project Brainstorming

Data Lakes with Delta Table

1. Data Lakes with Delta Table Day 1 Lecture
2. Data Lakes with Delta Table Day 1 Lab
3. Data Lakes with Delta Table Day 2 Lecture
4. Data Lakes with Delta Table Day 2 Lab

Databricks and Advanced Spark

1. Databricks and Advanced Spark Day 1 Lecture
2. Databricks and Advanced Spark Day 1 Lab
3. Databricks and Advanced Spark Day 2 Lecture
4. Databricks and Advanced Spark Day 2 Lab

Managing Unstructured Data

1. Managing Unstructured Data Day 1 Lecture
2. Managing Unstructured Data Day 1 Lab
3. Managing Unstructured Data Day 2 Lecture
4. Managing Unstructured Data Day 2 Lab

Structured Streaming Kafka to Delta Live Table

1. Structured Streaming Kafka to Delta Live Table Day 1 Lecture
2. Structured Streaming Kafka to Delta Live Table Day 1 Lab
3. Structured Streaming Kafka to Delta Live Table Day 2 Lecture
4. Structured Streaming Kafka to Delta Live Table Day 2 Lab
5. Exploring UDFs and SQL Benchmarks in Spark Streaming

Building AI Agents with Databricks

1. Building AI Agents with Databricks Day 1 Lecture
2. Building AI Agents with Databricks Day 1 Lab
3. Building AI Agents with Databricks Day 2 Lecture
4. Building AI Agents with Databricks Day 2 Lab
5. Enhancements and Implementations in MLflow

Q&A with Zach

1. Q&A with Zach Week 1
2. Q&A with Zach Week 2
3. Q&A with Zach Week 3
4. Q&A with Zach Week 4
5. Q&A with Zach Week 5

Tech Talks

1. Tech Talk - Shubham
2. Tech Talk - Brian
3. Tech Talks 3
4. Tech Talk - Xinran
5. Tech Talk - Jason

TA Office hours

1. TA Office hour
2. TA Office hour 1
3. TA Office hour 2
4. TA Office hour 3
5. TA Office hour 4
6. TA Office hour 5
7. TA Office hour 6
8. TA Office hour 7
9. TA Office hour 8

Book Club

1. Book Club 1
2. Book Club 2
3. Book Club 3
4. Book Club 4
5. Book Club 5

Capstone Project

1. Lesson 1

Also included

Included: Data Engineer Interview Skills
16 lessons

Interview Skills

1. The SQL Interview
2. The Data Modeling Interview
3. The Data Architecture Interview
4. The Behavioral Interview
5. The Data Structures and Algorithms Interview
6. Data Structures and Algorithms Interview
7. Product Sense Interview
8. Behavioral Interview
9. Data Modeling Interview
10. Live SQL Training January 25th, Window Functions and Common Table Expressions
11. Live SQL Training January 15th
12. Scala Dataset vs Dataframe API [Dec 15, 2023]

AI and LLM

1. LLM-Driven Data Engineering Day 1 Lab
2. LLM-Driven Data Engineering Day 1 Lecture
3. LLM-Driven Data Engineering Day 2 Lab
4. LLM-Driven Data Engineering Day 2 Lecture
Included: Analytics Engineering Boot Camp starting October 2024
49 lessons

Bootcamp Kickoff

1. Analytics Engineering Bootcamp Kickoff
2. Capstone Requirements
3. Understanding Capstone Requirements

Pipeline Spec Building + Airflow Fundamentals

1. Write a comprehensive pipeline spec document
2. Write a comprehensive pipeline spec document Lab
3. Orchestration and Airflow Fundamentals Lecture
4. Orchestration and Airflow Fundamentals Lab

Change Data Capture (CDC) and Analytical Patterns

1. Change Data Capture (CDC) and Snowflake basics Lecture
2. Change Data Capture (CDC) and Snowflake basics Lab
3. How to use Snowpark and Snowflake optimization Lecture
4. How to use Snowpark and Snowflake optimization Lab
5. Growth Accounting Models Lecture
6. Growth Accounting Models Lab
7. Grouping sets and Window functions Lecture
8. Grouping sets and Window functions Lab

dbt basics

1. dbt basics: How dbt works: sources, models, packages Lecture
2. dbt basics: How dbt works: sources, models, packages Lab
3. dbt basics: creating a full dbt project Lecture
4. dbt basics: creating a full dbt project Lab

Advanced dbt

1. Advanced dbt: Understanding incremental models Lecture
2. Advanced dbt: Understanding incremental models Lab
3. Advanced pipelines: Blue-green and WAP with Airflow Lecture
4. Advanced pipelines: Blue-green and WAP with Airflow Lab

Advanced Snowflake

1. Advanced Snowflake: Deploying the Capstone DAG Lecture
2. Advanced Snowflake: Deploying the Capstone DAG Lab
3. Advanced Snowflake: Data Modeling and UDFs Lecture
4. Advanced Snowflake: Data Modeling and UDFs Lab

Passing Analytics Engineer Interviews

1. SQL

Bonus

1. Complex Data Types Lecture
2. Complex Data Types Lab
3. Slowly Changing Dimensions Lecture
4. Slowly Changing Dimensions Lab
5. Graph Data Modeling Lecture
6. Graph Data Modeling Lab
7. Fact Data Modeling Lecture
8. Fact Data Modeling Lab
9. Data modeling with Date lists Lecture
10. Data modeling with Date lists Lab

Tech Talk

1. Tech talk with Andrew Madson
2. Tech talk with Gourav Khanijoe
3. Tech talk with Carly Taylor
4. Tech talk with Jess Ramos
5. Tech talk with Madison Schott
6. Tech talk with Vaishali Macwan
7. Tech talk with Tim Chan

TA Office Hours

1. Office Hour 1
2. Office Hour 2
3. Office Hour 3
4. Office Hour 4
Included: 2025 New Year Data Engineering Boot Camp starting January 6th
72 lessons

January 2025 Bootcamp Kickoff

1. January 2025 Bootcamp Kickoff

Airflow + Trino

1. Orchestration and Airflow Fundamentals Day 1 Lecture
2. Orchestration and Airflow Fundamentals Day 1 Lab
3. Hard Orchestration Lessons Day 2 Lecture
4. Hard Orchestration Lessons Day 2 Lab
5. Cumulative DAGs in Production Day 3 Lecture
6. Cumulative DAGs in Production Day 3 Lab

Snowflake + dbt Basics

1. Snowflake Basics Day 1 Lecture
2. Snowflake Basics Day 1 Lab
3. dbt Basics Day 2 Lecture
4. dbt Basics Day 2 Lab
5. dbt Basics Day 3 Lecture
6. dbt Basics Day 3 Lab

Databricks Basics

1. Databricks Platform Overview Day 1 Lecture
2. Databricks Platform Overview Day 1 Lab
3. Introduction to Spark Day 2 Lecture
4. Introduction to Spark Day 2 Lab
5. Apache Spark Core Day 3 Lecture
6. Apache Spark Core Day 3 Lab

Advanced Spark on Databricks

1. Apache Spark Shuffle Joins Day 1 Lecture
2. Apache Spark Shuffle Joins Day 1 Lab
3. Apache Spark Memory Tuning, Partitioning Day 2 Lecture
4. Apache Spark Memory Tuning, Partitioning Day 2 Lab
5. Apache Spark Unit Testing Day 3 Lecture
6. Apache Spark Unit Testing Day 3 Lab

Snowflake + Advanced dbt

1. Snowflake Lecture
2. Snowflake Lab
3. Advanced dbt Day 2 Lecture
4. Advanced dbt Day 2 Lab
5. Advanced dbt Day 3 Lecture
6. Advanced dbt Day 3 Lab

Analytical Patterns and Advanced SQL

1. Applying Analytical Patterns Day 1 Lecture
2. Applying Analytical Patterns Day 1 Lab
3. Advanced SQL Patterns Day 2 Lecture
4. Advanced SQL Patterns Day 2 Lab
5. Analytical Patterns Recognizing Business Value Day 3 Lecture
6. Analytical Patterns Recognizing Business Value Day 3 Lab

Real Time Data (Spark and Kafka Streaming)

1. Advanced Spark Optimization Techniques Day 1 Lecture
2. Advanced Spark Optimization Techniques Day 1 Lab
3. Spark Structured Streaming Day 2 Lecture
4. Spark Structured Streaming Day 2 Lab
5. Deep Dive On Workflows Day 3 Lecture
6. Deep Dive On Workflows Day 3 Lab

Bonus - LLMs

1. RAG and LLMs Day 1 Lecture
2. RAG and LLMs Day 1 Lab
3. RAG and LLMs Day 2 Lecture
4. RAG and LLMs Day 2 Lab
5. LLMs Day 3 Part 1
6. LLMs Day 3 Part 2

Career Development Sessions

1. Career Development - LinkedIn Optimization
2. Career Development - Resume Review
3. Career Development - Interview Help
4. Career Development - Data Modeling Interview
5. Career Development - Strategic Networking

Q&A with Zach Wilson

1. Q&A Week 1
2. Q&A Week 2
3. Q&A Week 3
4. Q&A Week 4

Guest Speaker Sessions

1. Jason Reid (cofounder of Tabular)
2. Shachar Meir
3. Brian Pulliam
4. Sundas Khalid
5. YZ
6. Prasad Rao
7. Joe Reis

TA Office Hours

1. TA Office Hour 1
2. TA Office Hour 2
3. TA Office Hour 3
4. TA Office Hour 4
5. TA Office Hour 5
6. TA Office Hour 6

Capstone Showcase Jan 2025

1. Capstone Showcase Jan 2025
Included: Boot Camp: Community Edition
41 lessons

Bootcamp Orientation

1. Bootcamp Kickoff
2. Boot Camp Database Setup

Dimensional Data Modeling

1. Dimensional Data Modeling Complex Data Type and Cumulation Day 1 Lecture
2. Dimensional Data Modeling Complex Data Type and Cumulation Day 1 Lab
3. Dimensional Data Modeling: Building Slowly Changing Dimensions Day 2 Lecture
4. Dimensional Data Modeling: Building Slowly Changing Dimensions Day 2 Lab
5. Dimensional Data Modeling: Graph Data Modeling Day 3 Lecture
6. Dimensional Data Modeling: Graph Data Modeling Day 3 Lab

Fact Data Modeling

1. Fact Data Modeling: Core Concepts, Deduplication Day 1 Lecture
2. Fact Data Modeling: Practical Insights into Data Modeling Day 1 Lab
3. Fact Data Modeling: Core Elements in Data Modeling Day 2 Lecture
4. Fact Data Modeling: Compact Tables for Efficient Data Representation Day 2 Lab
5. Fact Data Modeling: Minimizing Shuffle and Reducing Facts Day 3 Lecture
6. Fact Data Modeling: Practical Guide to Formatting and Aggregating Data Day 3 Lab

Apache Spark Fundamentals

1. Apache Spark: Architecture, Optimization, and Best Practices Day 1 Lecture
2. Apache Spark: Hands-On for Broadcast and Hash Joins Day 1 Lab
3. Apache Spark: Managing Spark Jobs and Notebooks Day 2 Lecture
4. Apache Spark: User-Defined Functions and Broadcast Join Day 2 Lab
5. Unit Testing Spark Jobs: Importance, Challenges, and Leadership Perspectives Lecture
6. Unit Testing Spark Jobs: Mastering Spark and PySpark Testing Lab

Applying Analytical Patterns

1. Applying Analytical Patterns: Exploring SQL, Scaling Projects and Aggregation Analysis Day 1 Lecture
2. Applying Analytical Patterns: Mastering Growth Accounting and Retention Analysis Day 1 Lab
3. Applying Analytical Patterns: Recursive CTEs and Window Functions Day 2 Lecture
4. Applying Analytical Patterns: Aggregations and Cardinality Reduction Day 2 Lab

Real-time pipelines with Flink and Kafka

1. Flink Lab Setup
2. Streaming Pipelines: Mastering Streaming and Real-time Pipelines Day 1 Lecture
3. Streaming Pipelines: Setting up Streaming Pipelines Day 1 Lab
4. Streaming Pipelines: Exploring Data Collection and Processing Day 2 Lecture
5. Streaming Pipelines: Kafka, Postgres, Spark Integrations and Parallelism Day 2 Lab

Data Visualization and Impact

1. Data Visualization and Impact: Mastering Data Engineering Day 1 Lecture
2. Data Visualization and Impact: Hands-On with the CSV files Day 1 Lab
3. Data Visualization and Impact: Insights and Best Practices Day 2 Lecture
4. Data Visualization and Impact: Exploring Data Visualization and Aggregation Techniques Day 2 Lab

Data Pipeline Maintenance

1. Data Pipeline Maintenance: Navigating the Complexities of Data Engineering Day 1 Lecture
2. Data Pipeline Maintenance: Strategies for Maintenance and Dock Building Day 2 Lecture

KPIs and Experimentation

1. KPIs and Experimentation: Decoding Business Success: Metrics, Growth Strategies and Collaborative Approaches Day 1 Lecture
2. KPIs and Experimentation: Setting up and Analysing Experiments Day 1 Lab
3. KPIs and Experimentation: Leading and Lagging Metrics Day 2 Lecture

Data Quality Patterns

1. Data Quality Patterns: MIDAS Process from Airbnb Day 1 Lecture
2. Data Quality Patterns: Spec-Building Document Day 1 Lab
3. Data Quality Patterns: WAP Patterns Day 2 Lecture
Included: Spring Data Engineering Boot Camp Starting May 26th, 2025
46 lessons

Data Lakes with Delta Table

1. Delta Table Day 1 Lecture
2. Delta Table Day 1 Lab
3. Delta Lake Bonus
4. Delta Table Day 2 Lecture
5. Delta Table Day 2 Lab
6. Test Again

Databricks and Advanced Spark

1. Apache Spark Shuffle Joins Day 1 Lecture
2. Apache Spark Shuffle Joins Day 1 Lab
3. Apache Spark Memory Tuning, Partitioning Day 2 Lecture
4. Apache Spark Memory Tuning, Partitioning Day 2 Lab
5. Apache Spark Unit Testing Day 3 Lecture
6. Apache Spark Unit Testing Day 3 Lab
7. Setting Up CI/CD and Unit Testing in Databricks for Reliable Data Pipelines

Structured Streaming Kafka to Delta Live Table

1. Apache Spark programming with Databricks Day 1 Lecture
2. Apache Spark programming with Databricks Day 1 Lab
3. Apache Spark programming with Databricks Day 2 Lecture
4. Apache Spark programming with Databricks Day 2 Lab

Managing Unstructured Data

1. Managing Unstructured Data - Day 1 Lecture
2. Managing Unstructured Data - Day 1 Lab
3. Managing Unstructured Data - Day 2 Lecture
4. Managing Unstructured Data - Day 2 Lab

Databricks and Spark Fundamentals (preread)

1. Spark Basics

Q&A with Zach May 2025

1. Q&A with Zach Week 1
2. Q&A with Zach Week 2
3. Q&A with Zach Week 3
4. Q&A with Zach Week 4
5. Q&A with Zach Week 5
6. Navigating Data Engineering - Tips, Tools, and Career Insights

Bonus - Azure

1. Azure - Week 1
2. Azure - Week 2
3. Azure - Week 3
4. Azure - Week 4
5. Azure - Week 5

Tech Talks May 2025

1. Tech Talk - Alex
2. Tech Talk - Joe
3. Tech Talk - Shubham
4. Tech Talk - Vaishali

TA Office Hours

1. TA Office Hour 1
2. TA Office Hour 2
3. TA Office Hour 3
4. TA Office Hour 4
5. TA Office Hour 5
6. TA Office Hour 6
7. TA Office Hour 7
8. TA Office Hour 8

Capstone DE Bootcamp May 2025

1. Capstone May 2025
Included: Building Pipelines with Iceberg and Airflow
11 lessons

Data Modeling with Iceberg and Trino

1. The History of Data Lakes Lecture
2. Iceberg Partitioning and Metadata Exploration Lab
3. Mastering Data Lake Architectures Lecture
4. Apache Iceberg Day 2 Lab
5. Apache Iceberg Day 3 Lecture
6. Apache Iceberg Day 3 Lab

Airflow Pipelines with Iceberg

1. Setting Up Airflow for Week 1 for Mac
2. Orchestration and Airflow Fundamentals Lecture
3. Orchestration and Airflow Fundamentals Lab
4. Apache Iceberg Data Contracts Lecture
5. Apache Iceberg Data Contracts Lab

Program Resources

See what you'll be working with

Program overview

Homework overview

Weekly schedule

Before you join

Prerequisites

Intermediate Python skills
Some Spark exposure

Platform Access Included

Discord
GitHub
Databricks
AWS

Scholarship Available

We believe great learning shouldn't be limited by finances. Apply for a partial or full scholarship to access this program.

Social proof

What students say