Injecting quality into your Airflow DAG Lab

Module
54 mins

Description

In this video, Zach walks you through the setup and execution of a simple data pipeline using Airflow, focusing on reading data from Kafka and storing it in a production table. He covers key components like execution_timeout and max_active_runs, and the importance of data quality checks. He demonstrates how to handle missing data and ensure the pipeline is idempotent, meaning it won't create duplicates when rerun. He also highlights the significance of staging tables and the write-audit-publish pattern for maintaining data integrity.
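As a rough sketch of the write-audit-publish pattern the lesson covers: data lands in a staging table first, quality checks run against staging, and only then is the production partition overwritten so a rerun never creates duplicates. The function names, table shapes, and quality checks below are illustrative assumptions, not the lesson's actual code (tables are modeled as plain lists for simplicity).

```python
# Hypothetical sketch of write-audit-publish; names and checks are
# illustrative assumptions, not the lesson's actual pipeline code.

def write_to_staging(rows):
    """Write: land the raw batch (e.g. from Kafka) in a staging table."""
    return list(rows)

def audit(staging):
    """Audit: run data quality checks before anything reaches production."""
    assert len(staging) > 0, "quality check failed: staging is empty"
    assert all(r.get("event_id") is not None for r in staging), \
        "quality check failed: missing event_id"
    return True

def publish(production, staging):
    """Publish: overwrite the production partition rather than appending,
    so rerunning the pipeline is idempotent (no duplicate rows)."""
    production.clear()
    production.extend(staging)
    return production

production_table = []
kafka_batch = [{"event_id": 1, "value": "a"},
               {"event_id": 2, "value": "b"}]

staging_table = write_to_staging(kafka_batch)
audit(staging_table)
publish(production_table, staging_table)

# Rerunning the same run overwrites rather than appends: still 2 rows.
publish(production_table, write_to_staging(kafka_batch))
print(len(production_table))  # prints 2
```

In a real Airflow DAG, the timeouts and concurrency limits mentioned above would be set via the task's `execution_timeout` and the DAG's `max_active_runs` parameters.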