All-Access DataExpert.io Subscription

This gets you access to all self-paced content and the Discord community of highly motivated learners!

Taught by Bruno Souza de Lima & Zach Wilson

Data Engineer at phData & Founder at DataExpert.io

What you'll learn

✓ 250+ Hours of Content
✓ Build an awesome portfolio!
✓ Join the dedicated community

Learn directly from the experts

Bruno Souza de Lima

Data Engineer at phData

Zach Wilson

Founder at DataExpert.io

Course syllabus

437 lessons • 292+ hours of content • 50 assignments

Onboarding

1. Onboarding Lesson
2. TA Office Hour

Speaker Events Archive

1. Data Engineering Interviews with Swetha at OpenAI
2. Future of Data Engineering with Archita

Also included

Included: Data Engineer Interview Skills
16 lessons

Interview Skills

1. The SQL Interview
2. The Data Modeling Interview
3. The Data Architecture Interview
4. The Behavioral Interview
5. The Data Structures and Algorithms Interview
6. Data Structures and Algorithms Interview
7. Product Sense Interview
8. Behavioral Interview
9. Data Modeling Interview
10. Live SQL Training January 25th, Window Functions and Common Table Expressions
11. Live SQL Training January 15th
12. Scala Dataset vs Dataframe API [Dec 15, 2023]

AI and LLM

1. LLM-Driven Data Engineering Day 1 Lab
2. LLM-Driven Data Engineering Day 1 Lecture
3. LLM-Driven Data Engineering Day 2 Lab
4. LLM-Driven Data Engineering Day 2 Lecture
Included: LinkedIn Mastery Course
6 lessons

LinkedIn Masterclass

1. How to write non-boring content by Carly Taylor
2. Your story matters and how to tell it by Stephanie Nuesi
3. Monetizing Part 3: Products & Funnels by Zach Wilson
4. Monetizing Part 2: Sponsorship by Carly Taylor
5. Build community, not content by Zach Wilson
6. Monetizing Part 1: Brands & Coaching by Stephanie Nuesi
Included: FullStackExpert.io All-access Subscription
34 lessons

Fullstack Bootcamp Orientation

1. Full Stack Bootcamp Orientation

Boot camp Kickoff

1. Lesson 1

Web Dev Fundamentals

1. Introduction to IDEs
2. Structure of a Program
3. Intro to HTML & CSS
4. HTML & CSS Lab
5. Intro to JavaScript
6. Advanced JavaScript Concepts
7. JavaScript Lab Part 2

Intermediate Frontend Development

1. Introduction to ReactJS
2. Advanced ReactJS and Tailwind

Fundamentals of Backend Development

1. Introduction to databases and relational SQL
2. Data Modeling and intro to backend

Intermediate Backend Development

1. Intro to NodeJS & Express and event driven programming
2. Connecting the database to the service

Advanced Backend Development

1. Authentication and Advanced techniques
2. Lab 1
3. Event driven microservices
4. Monitoring, scaling, experimenting
5. week-5-Lab 2

Advanced Frontend Development

1. NextJS Introduction
2. Week 6 lab part 1
3. Advanced NextJS Features
4. Week 6 lab part 2
5. Docker Vercel and Optimization

Office Hours

1. Unit Testing
2. Integration Testing
3. E2E Testing

Q&A with Dan

1. Intro to CI/CD
2. Intro to System Design
3. Introduction to Containers
4. Intro to Infrastructure as Code

Soft Skills

1. Joining a new team
2. Communicating Ideas and Getting Alignment
Included: Analytics Engineering Boot Camp starting October 2024
49 lessons

Bootcamp Kickoff

1. Analytics Engineering Bootcamp Kickoff
2. Capstone Requirements
3. Understanding Capstone Requirements

Pipeline Spec Building + Airflow Fundamentals

1. Write a comprehensive pipeline spec document
2. Write a comprehensive pipeline spec document Lab
3. Orchestration and Airflow Fundamentals Lecture
4. Orchestration and Airflow Fundamentals Lab

Change Data Capture (CDC) and Analytical Patterns

1. Change Data Capture (CDC) and Snowflake basics Lecture
2. Change Data Capture (CDC) and Snowflake basics Lab
3. How to use Snowpark and Snowflake optimization Lecture
4. How to use Snowpark and Snowflake optimization Lab
5. Growth Accounting Models Lecture
6. Growth Accounting Models Lab
7. Grouping sets and Window functions Lecture
8. Grouping sets and Window functions Lab

dbt basics

1. dbt basics: How dbt works: sources, models, packages Lecture
2. dbt basics: How dbt works: sources, models, packages Lab
3. dbt basics: creating a full dbt project Lecture
4. dbt basics: creating a full dbt project Lab

Advanced dbt

1. Advanced dbt: Understanding incremental models Lecture
2. Advanced dbt: Understanding incremental models Lab
3. Advanced pipelines: Blue-green and WAP with Airflow Lecture
4. Advanced pipelines: Blue-green and WAP with Airflow Lab

Advanced Snowflake

1. Advanced Snowflake: Deploying the Capstone DAG Lecture
2. Advanced Snowflake: Deploying the Capstone DAG Lab
3. Advanced Snowflake: Data Modeling and UDFs Lecture
4. Advanced Snowflake: Data Modeling and UDFs Lab

Passing Analytics Engineer Interviews

1. SQL

Bonus

1. Complex Data Types Lecture
2. Complex Data Types Lab
3. Slowly Changing Dimensions Lecture
4. Slowly Changing Dimensions Lab
5. Graph Data Modeling Lecture
6. Graph Data Modeling Lab
7. Fact Data Modeling Lecture
8. Fact Data Modeling Lab
9. Data modeling with Date lists Lecture
10. Data modeling with Date lists Lab

Tech Talk

1. Tech talk with Andrew Madson
2. Tech talk with Gourav Khanijoe
3. Tech talk with Carly Taylor
4. Tech talk with Jess Ramos
5. Tech talk with Madison Schott
6. Tech talk with Vaishali Macwan
7. Tech talk with Tim Chan

TA Office Hours

1. Office Hour 1
2. Office Hour 2
3. Office Hour 3
4. Office Hour 4
Included: 2025 New Year Data Engineering Boot Camp starting January 6th
72 lessons

January 2025 Bootcamp Kickoff

1. January 2025 Bootcamp Kickoff

Airflow + Trino

1. Orchestration and Airflow Fundamentals Day 1 Lecture
2. Orchestration and Airflow Fundamentals Day 1 Lab
3. Hard Orchestration Lessons Day 2 Lecture
4. Hard Orchestration Lessons Day 2 Lab
5. Cumulative DAGs in Production Day 3 Lecture
6. Cumulative DAGs in Production Day 3 Lab

Snowflake + dbt Basics

1. Snowflake Basics Day 1 Lecture
2. Snowflake Basics Day 1 Lab
3. dbt Basics Day 2 Lecture
4. dbt Basics Day 2 Lab
5. dbt Basics Day 3 Lecture
6. dbt Basics Day 3 Lab

Databricks Basics

1. Databricks Platform Overview Day 1 Lecture
2. Databricks Platform Overview Day 1 Lab
3. Introduction to Spark Day 2 Lecture
4. Introduction to Spark Day 2 Lab
5. Apache Spark Core Day 3 Lecture
6. Apache Spark Core Day 3 Lab

Advanced Spark on Databricks

1. Apache Spark Shuffle Joins Day 1 Lecture
2. Apache Spark Shuffle Joins Day 1 Lab
3. Apache Spark Memory Tuning, Partitioning Day 2 Lecture
4. Apache Spark Memory Tuning, Partitioning Day 2 Lab
5. Apache Spark Unit Testing Day 3 Lecture
6. Apache Spark Unit Testing Day 3 Lab

Snowflake + Advanced dbt

1. Snowflake Lecture
2. Snowflake Lab
3. Advanced dbt Day 2 Lecture
4. Advanced dbt Day 2 Lab
5. Advanced dbt Day 3 Lecture
6. Advanced dbt Day 3 Lab

Analytical Patterns and Advanced SQL

1. Applying Analytical Patterns Day 1 Lecture
2. Applying Analytical Patterns Day 1 Lab
3. Advanced SQL Patterns Day 2 Lecture
4. Advanced SQL Patterns Day 2 Lab
5. Analytical Patterns Recognizing Business Value Day 3 Lecture
6. Analytical Patterns Recognizing Business Value Day 3 Lab

Real Time Data (Spark and Kafka Streaming)

1. Advanced Spark Optimization Techniques Day 1 Lecture
2. Advanced Spark Optimization Techniques Day 1 Lab
3. Spark Structured Streaming Day 2 Lecture
4. Spark Structured Streaming Day 2 Lab
5. Deep Dive On Workflows Day 3 Lecture
6. Deep Dive On Workflows Day 3 Lab

Bonus - LLMs

1. RAG and LLMs Day 1 Lecture
2. RAG and LLMs Day 1 Lab
3. RAG and LLMs Day 2 Lecture
4. RAG and LLMs Day 2 Lab
5. LLMs Day 3 Part 1
6. LLMs Day 3 Part 2

Career Development Sessions

1. Career Development - LinkedIn Optimization
2. Career Development - Resume Review
3. Career Development - Interview Help
4. Career Development - Data Modeling Interview
5. Career Development - Strategic Networking

Q&A with Zach Wilson

1. Q&A Week 1
2. Q&A Week 2
3. Q&A Week 3
4. Q&A Week 4

Guest Speaker Sessions

1. Jason Reid (cofounder of Tabular)
2. Shachar Meir
3. Brian Pulliam
4. Sundas Khalid
5. YZ
6. Prasad Rao
7. Joe Reis

TA Office Hours

1. TA Office Hour 1
2. TA Office Hour 2
3. TA Office Hour 3
4. TA Office Hour 4
5. TA Office Hour 5
6. TA Office Hour 6

Capstone Showcase Jan 2025

1. Capstone Showcase Jan 2025
Included: Boot Camp: Community Edition
41 lessons

Bootcamp Orientation

1. Bootcamp Kickoff
2. Boot Camp Database Setup

Dimensional Data Modeling

1. Dimensional Data Modeling Complex Data Type and Cumulation Day 1 Lecture
2. Dimensional Data Modeling Complex Data Type and Cumulation Day 1 Lab
3. Dimensional Data Modeling: Building Slowly Changing Dimensions Day 2 Lecture
4. Dimensional Data Modeling: Building Slowly Changing Dimensions Day 2 Lab
5. Dimensional Data Modeling: Graph Data Modeling Day 3 Lecture
6. Dimensional Data Modeling: Graph Data Modeling Day 3 Lab

Fact Data Modeling

1. Fact Data Modeling: Core Concepts, Deduplication Day 1 Lecture
2. Fact Data Modeling: Practical Insights into Data Modeling Day 1 Lab
3. Fact Data Modeling: Core Elements in Data Modeling Day 2 Lecture
4. Fact Data Modeling: Compact Tables for Efficient Data Representation Day 2 Lab
5. Fact Data Modeling: Minimizing Shuffle and Reducing Facts Day 3 Lecture
6. Fact Data Modeling: Practical Guide to Formatting and Aggregating Data Day 3 Lab

Apache Spark Fundamentals

1. Apache Spark: Architecture, Optimization, and Best Practices Day 1 Lecture
2. Apache Spark: Hands-On for Broadcast and Hash Joins Day 1 Lab
3. Apache Spark: Managing Spark Jobs and Notebooks Day 2 Lecture
4. Apache Spark: User-Defined Functions and Broadcast Join Day 2 Lab
5. Unit Testing Spark Jobs: Importance, Challenges, and Leadership Perspectives Lecture
6. Unit Testing Spark Jobs: Mastering Spark and PySpark Testing Lab

Applying Analytical Patterns

1. Applying Analytical Patterns: Exploring SQL, Scaling Projects and Aggregation Analysis Day 1 Lecture
2. Applying Analytical Patterns: Mastering Growth Accounting and Retention Analysis Day 1 Lab
3. Applying Analytical Patterns: Recursive CTEs and Window Functions Day 2 Lecture
4. Applying Analytical Patterns: Aggregations and Cardinality Reduction Day 2 Lab

Real-time pipelines with Flink and Kafka

1. Flink Lab Setup
2. Streaming Pipelines: Mastering Streaming and Real-time Pipelines Day 1 Lecture
3. Streaming Pipelines: Setting up Streaming Pipelines Day 1 Lab
4. Streaming Pipelines: Exploring Data Collection and Processing Day 2 Lecture
5. Streaming Pipelines: Kafka, Postgres, Spark Integrations and Parallelism Day 2 Lab

Data Visualization and Impact

1. Data Visualization and Impact: Mastering Data Engineering Day 1 Lecture
2. Data Visualization and Impact: Hands-On with the CSV files Day 1 Lab
3. Data Visualization and Impact: Insights and Best Practices Day 2 Lecture
4. Data Visualization and Impact: Exploring Data Visualization and Aggregation Techniques Day 2 Lab

Data Pipeline Maintenance

1. Data Pipeline Maintenance: Navigating the Complexities of Data Engineering Day 1 Lecture
2. Data Pipeline Maintenance: Strategies for Maintenance and Dock Building Day 2 Lecture

KPIs and Experimentation

1. KPIs and Experimentation: Decoding Business Success: Metrics, Growth Strategies and Collaborative Approaches Day 1 Lecture
2. KPIs and Experimentation: Setting up and Analysing Experiments Day 1 Lab
3. KPIs and Experimentation: Leading and Lagging Metrics Day 2 Lecture

Data Quality Patterns

1. Data Quality Patterns: MIDAS Process from Airbnb Day 1 Lecture
2. Data Quality Patterns: Spec-Building Document Day 1 Lab
3. Data Quality Patterns: WAP Patterns Day 2 Lecture
Included: Spring Data Engineering Boot Camp Starting May 26th, 2025
46 lessons

Data Lakes with Delta Table

1. Delta Table Day 1 Lecture
2. Delta Table Day 1 Lab
3. Delta Lake Bonus
4. Delta Table Day 2 Lecture
5. Delta Table Day 2 Lab
6. Test Again

Databricks and Advanced Spark

1. Apache Spark Shuffle Joins Day 1 Lecture
2. Apache Spark Shuffle Joins Day 1 Lab
3. Apache Spark Memory Tuning, Partitioning Day 2 Lecture
4. Apache Spark Memory Tuning, Partitioning Day 2 Lab
5. Apache Spark Unit Testing Day 3 Lecture
6. Apache Spark Unit Testing Day 3 Lab
7. Setting Up CI/CD and Unit Testing in Databricks for Reliable Data Pipelines

Structured Streaming Kafka to Delta Live Table

1. Apache Spark programming with Databricks Day 1 Lecture
2. Apache Spark programming with Databricks Day 1 Lab
3. Apache Spark programming with Databricks Day 2 Lecture
4. Apache Spark programming with Databricks Day 2 Lab

Managing Unstructured Data

1. Managing Unstructured Data - Day 1 Lecture
2. Managing Unstructured Data - Day 1 Lab
3. Managing Unstructured Data - Day 2 Lecture
4. Managing Unstructured Data - Day 2 Lab

Databricks and Spark Fundamentals (preread)

1. Spark Basics

Q&A with Zach May 2025

1. Q&A with Zach Week 1
2. Q&A with Zach Week 2
3. Q&A with Zach Week 3
4. Q&A with Zach Week 4
5. Q&A with Zach Week 5
6. Navigating Data Engineering - Tips, Tools, and Career Insights

Bonus - Azure

1. Azure - Week 1
2. Azure - Week 2
3. Azure - Week 3
4. Azure - Week 4
5. Azure - Week 5

Tech Talks May 2025

1. Tech Talk - Alex
2. Tech Talk - Joe
3. Tech Talk - Shubham
4. Tech Talk - Vaishali

TA Office Hours

1. TA Office Hour 1
2. TA Office Hour 2
3. TA Office Hour 3
4. TA Office Hour 4
5. TA Office Hour 5
6. TA Office Hour 6
7. TA Office Hour 7
8. TA Office Hour 8

Capstone DE Bootcamp May 2025

1. Capstone May 2025
Included: Analytics Engineering Boot Camp starting April 14th, 2025
44 lessons

Data Modeling on Iceberg

1. Apache Iceberg Lecture Day 1
2. Apache Iceberg Lab Day 1
3. Apache Iceberg Lecture Day 2
4. Apache Iceberg Lab Day 2
5. Apache Iceberg Lecture Day 3
6. Apache Iceberg Lab Day 3

Data Modeling on Snowflake

1. Conceptual Data Modeling Lecture Day 1
2. Conceptual Data Modeling Lab Day 1
3. How to use Snowpark and Snowflake Optimizations Lecture Day 2
4. How to use Snowpark and Snowflake Optimizations Lab Day 2
5. Advanced Snowflake Data Modeling and UDFs Lecture Day 3
6. Advanced Snowflake Data Modeling and UDFs Lab Day 3

Snowflake + dbt basics

1. dbt basics Day 1 Lecture
2. dbt basics Day 1 Lab
3. dbt basics Day 2 Lecture
4. dbt basics Day 2 Lab

Advanced dbt

1. Advanced dbt Lecture Day 1
2. Advanced dbt Lab Day 1
3. Advanced dbt Lecture Day 2
4. Advanced dbt Lab Day 2

End-to-End Project Hacking

1. Capstone Project Day 1 Lecture
2. Capstone Project Day 1 Lab
3. Capstone Project Day 2 Class
4. Capstone Project Day 3 Class

Q&A With Zach Wilson

1. Q&A with Zach Week 1
2. Q&A with Zach Week 2
3. Q&A with Zach Week 3
4. Q&A with Zach Week 4

TA Office Hours

1. TA Office Hour 1
2. TA Office Hour 2
3. TA Office Hour 3
4. TA Office Hour 4
5. TA Office Hour 5
6. TA Office Hour 6

Tech Talks

1. Tech Talk 1 - Jess Ramos
2. Tech Talk 2 - Madison
3. Tech Talk 3 - Ben Rogojan
4. Tech Talk 4 - John Kutay
5. Tech Talk 5 - Carly Taylor
6. Tech Talk 6 - Vin Vashishtha
7. Tech Talk 7 - Parth Parekh
8. Tech Talk 8 - Prasad Rao

Career Development

1. Build-a-Resume!
2. LinkedIn Like a Pro!
Included: AI Engineering Boot Camp Starting on July 7th, 2025
47 lessons

Prompt Engineering

1. Day 1 Lecture: Prompt Engineering Theory
2. Day 1 Lab: Auto Prompt Optimization with DSPy
3. Day 2 Lecture: Optimizing Dev Workflows with Cursor and Windsurf
4. Day 2 Lab: Vibe Coding Full Stack application in 45 minutes
5. Supplemental Session on Auto-prompt Optimization

Understanding RAG

1. Day 1 Lecture: RAG Fundamentals and types of RAG
2. Day 1 Lab: Vectorizing Inputs into Milvus
3. Day 2 Lecture: RAG Tradeoffs
4. Day 2 Lab: Integrating RAG into your full stack app

Advanced RAG and Agentic AI

1. Day 1 Lecture: Graph RAG vs Vector RAG
2. Day 1 Lab: Building Graph RAG in production
3. Day 2 Lecture: Types of Agents (reflection, plan-and-execute, ReWOO)
4. Day 2 Lab: Enhancing RAG with Agentic AI

MLOps

1. Day 1 Lecture: Trusting your workflows (evaluation metrics and guardrails)
2. Day 1 Lab: Setting up CI/CD for an AI application
3. Day 2 Lecture: Understanding LLM Reasoning traceability
4. Day 2 Lab: Tracing LLM reasoning with LangChain + LangSmith
5. Supplemental lab session on GuardrailsAI

Building end-to-end AI Application

1. Day 1 Lecture: Leveraging MCP and tools to integrate effectively
2. Day 1 Lab: Using MCP to build more powerful agents
3. Day 2 Lecture: How to link your AI agent to value
4. Day 2 Lab: Building an asynchronous vs synchronous agent

Tech Talks July 2025

1. Tech Talk - Li
2. Tech Talk - Conor
3. Tech Talk - Vikram
4. Tech Talk - Ryan
5. Tech Talk - Chip

Q&A with Zach July 2025

1. Q&A with Zach Week 1
2. Q&A with Zach Week 2
3. Q&A with Zach Week 3
4. Q&A with Zach Week 4
5. Q&A with Zach Week 5

TA Office Hours

1. TA Office Hour 1
2. TA Office Hour 2
3. TA Office Hour 3
4. TA Office Hour 4
5. TA Office Hour 5
6. TA Office Hour 6
7. TA Office Hour 7
8. TA Office Hour 8

Career Dev Sessions

1. Career Dev Session 2
2. Test Lesson
3. Career Dev Session 4

Book Club

1. Book Club 1
2. Book Club 2
3. Book Club 3
4. Book Club 4
Included: Building Pipelines with Iceberg and Airflow
11 lessons

Data Modeling with Iceberg and Trino

1. The History of Data Lakes Lecture
2. Iceberg Partitioning and Metadata Exploration Lab
3. Mastering Data Lake Architectures Lecture
4. Apache Iceberg Day 2 Lab
5. Apache Iceberg Day 3 Lecture
6. Apache Iceberg Day 3 Lab

Airflow Pipelines with Iceberg

1. Setting Up Airflow for Week 1 for Mac
2. Orchestration and Airflow Fundamentals Lecture
3. Orchestration and Airflow Fundamentals Lab
4. Apache Iceberg Data Contracts Lecture
5. Apache Iceberg Data Contracts Lab
Included: Winter 2026 Analytics Engineering Camp
67 lessons

Bootcamp Kickoff

1. Analytics Engineering Bootcamp Kickoff
2. DataExpert.io Architecture
3. Onboarding Guide for the Data and AI Engineering Bootcamp

Airflow

1. How to build job orchestration that matters
2. Injecting quality into your Airflow DAG Lab
3. How to load data into Iceberg with Airflow
4. Airflow Day 2 Lab
5. Backfilling with Airflow

Mastering Analytical Patterns

1. How to do Growth Accounting like Meta
2. Growth Accounting Lab
3. Mastering Change Data Capture (CDC)
4. Change Data Capture Lab
5. Snowflake and Glue Catalog Integration Lab
6. Understanding Read Streams and SCD Implementation Lab

dbt basics

1. Build your first pipeline with dbt
2. dbt basics Day 1 Lab
3. dbt basics Day 2 Lecture
4. dbt basics Day 2 Lab

Advanced dbt

1. Advanced dbt Day 1 Lecture
2. Advanced dbt Day 1 Lab
3. Advanced dbt Day 2 Lecture
4. Advanced dbt Day 2 Lab

Advanced Snowflake

1. Master Advanced Snowflake Data Modeling with the VARIANT data type
2. How to connect Iceberg and Snowflake with Catalogs
3. Advanced Snowflake Day 2 Lecture
4. Advanced Snowflake Day 2 Lab

Q&A with Zach

1. Q&A with Zach Week 1
2. Q&A with Zach Week 2
3. Q&A with Zach Week 3
4. Q&A with Zach Week 4
5. Q&A with Zach Week 5

Tech Talks

1. Tech Talk - Joe
2. Tech Talk - Madison
3. Tech Talk - Carly
4. Tech Talk - Raaid
5. Tech Talk - Parth

TA Office Hours

1. TA Office Hour 1
2. TA Office Hour 2
3. TA Office Hour 3
4. TA Office Hour 4
5. TA Office Hour 5
6. TA Office Hour 6
7. TA Office Hour 7
8. TA Office Hour 8

Career Dev Sessions

1. Career Dev Session 1
2. Career Dev Session 2
3. Career Dev Session 3
4. Career Dev Session 4
5. Career Dev Session 5

Book Club

1. Book Club 1
2. Book Club 2
3. Book Club 3
4. Book Club 4
5. Book Club 5

Data Modeling on Iceberg

1. Apache Iceberg Lecture Day 1
2. Apache Iceberg Lab Day 1
3. Apache Iceberg Lecture Day 2
4. Apache Iceberg Lab Day 2
5. Apache Iceberg Lecture Day 3
6. Apache Iceberg Lab Day 3

Data Modeling on Snowflake

1. Conceptual Data Modeling Lecture Day 1
2. Conceptual Data Modeling Lab Day 1
3. How to use Snowpark and Snowflake Optimizations Lecture Day 2
4. How to use Snowpark and Snowflake Optimizations Lab Day 2
5. Advanced Snowflake Data Modeling and UDFs Lecture Day 3
6. Advanced Snowflake Data Modeling and UDFs Lab Day 3

Capstone Project

1. Lesson 1

Before you join

Prerequisites

Proficiency in Python and SQL
Basic understanding of Docker, Flink, and Kafka

Platform Access Included

Databricks
Astronomer
Discord
GitHub
AWS
Snowflake
OpenAI
Anthropic

Scholarship Available

We believe great learning shouldn't be limited by finances. Apply for a partial or full scholarship to access this program.
