Data Engineering Mastery Course

Access all of the content and courses at your own pace!

Taught by Zach Wilson

Founder at DataExpert.io

What you'll learn

1 year of access
100+ Hours of Content
Free Cloud Tooling
AWS, Astronomer, Spark, Trino
Manage data lake tables with Iceberg
Manage real-time data with Kafka
Build an awesome portfolio!

Learn directly from the experts

Zach Wilson

Founder at DataExpert.io

I have led teams of data engineers and software engineers at Airbnb, Facebook, and Netflix. My next goal is to upskill as many data knowledge workers as I can!

Course syllabus

49 lessons • 32+ hours of content • 6 assignments

Dimensional Data Modeling in Iceberg
1. Dimensional Data Modeling - Complex Data Types Arrays and Structs with Iceberg (Day 1 Lecture) - Pre-recorded from V3
2. Dimensional Data Modeling - Complex Data Types Arrays and Structs with Iceberg (Day 1 Lab) - Pre-recorded from V3
3. Dimensional Data Modeling - Slowly-changing Dimensions and Idempotent Queries in Iceberg (Day 2 Lecture) - Pre-recorded from V3
4. Dimensional Data Modeling - Slowly-changing Dimensions and Idempotent Queries with Trino (Day 2 Lab) - Pre-recorded from V3
5. Conceptual Data Modeling (Day 3 Lecture)
6. Conceptual Data Modeling (Day 3 Lab)
Boot Camp Orientation
1. V3 Boot Camp Kick-Off
2. V4 Boot Camp Kick-Off
V4 Capstone Project
1. Lesson 1
Fact Data Modeling with Apache Iceberg
1. Fact Data Modeling - Core Concepts, Deduplication Techniques, and Retention Considerations with Iceberg (Day 1 Lab) - Pre-recorded from V3
2. Data Accessibility and Privacy (Day 4 Lecture)
3. Fact Data Modeling - Core Concepts, Deduplication Techniques, and Retention Considerations with Iceberg (Day 1 Lecture) - Pre-recorded from V3
4. Fact Data Modeling - Core Elements in Data Modeling with Iceberg (Day 2 Lecture) - Pre-recorded from V3
5. Fact Data Modeling - Minimizing Shuffle and Reducing Facts with Iceberg (Day 3 Lecture) - Pre-recorded from V3
6. Fact Data Modeling - Compact Tables for Efficient Data Representation with Iceberg (Day 2 Lab) - Pre-recorded from V3
7. Fact Data Modeling - Practical Guide to Formatting and Aggregating Data with Iceberg (Day 3 Lab) - Pre-recorded from V3
8. Data Accessibility and Privacy (Day 4 Lab)
Data Quality Patterns in Apache Airflow
1. Airflow Data Quality (Day 1 Lecture)
2. Airflow Data Quality (Day 2 Lab)
3. Airflow Data Quality (Day 1 Lab)
4. Airflow Data Quality (Day 2 Lecture)
Unit and Integration Testing Spark Pipelines
1. Spark Fundamentals - Spark Lab on REST API Consumption (Day 1 Lab)
2. Advanced Spark (Day 2 Lab)
3. Spark Data Quality (Day 3 Lecture)
4. Advanced Spark (Day 2 Lecture)
5. Spark Data Quality (Day 3 Lab)
6. Spark Fundamentals (Day 1 Lecture)
Streaming Pipelines with Kafka and Spark Streaming
1. Processing Data with Apache Flink (Day 2 Lecture)
2. Kafka Fundamentals, Consumer APIs with Confluent (Day 1 Lecture)
3. Kafka Fundamentals, Consumer APIs with Confluent (Day 1 Lab)
4. Processing Data with Apache Flink (Day 2 Lab)
Analytical Patterns & Analysis with Trino
1. Analytical Patterns (Day 2 Lecture)
2. Analytical Patterns (Day 2 Lab)
3. Advanced SQL and Applying Analytical Patterns (Day 1 Lecture)
4. Advanced SQL and Applying Analytical Patterns (Day 1 Lab)
Large Language Models
1. LLMs - Using LLMs to Provide Business Value (Day 1 Lecture)
2. LLMs - Vector Database for RAG Applications (Day 2 Lab)
3. LLMs - Vector Database for RAG Applications (Day 2 Lecture)
4. LLMs - Using LLMs to Provide Business Value (Day 1 Lab)
Data Impact Communication & Visualization
1. Data Communication and Impact (Day 1 Lecture)
2. Data Visualization with Tableau (Day 2 Lab)
3. Data Visualization with Tableau (Day 2 Lecture)
4. Data Communication and Impact (Day 1 Lab)
dbt
1. dbt (Day 1 Lecture)
2. dbt (Day 1 Lab)
3. dbt (Day 2 Lecture)
4. dbt (Day 2 Lab)
Data Pipeline Maintenance (V4)
1. Data Pipeline Maintenance (Day 1 Lecture)
2. Data Pipeline Maintenance (Day 1 Lab)

Program Resources

See what you'll be working with

Homework overview

Weekly schedule

Before you join

Prerequisites

Proficiency in Python and SQL, with at least 6 months of experience in each
Basic understanding of Docker, Flink, and Kafka
Basic understanding of SQL Window Functions

Platform Access Included

Snowflake