Data & AI | 90 days | 2-3 hours/day | updated 2026-06-01
DataOps 90-Day Learning Path
Build DataOps skills in 90 days: data pipeline automation, data quality frameworks, metadata management, observability for data, and DataOps culture for analytics teams.
What DataOps means
DataOps applies the principles of DevOps and agile to the analytics and data engineering lifecycle. It encompasses automated data pipeline testing, data quality monitoring, metadata and lineage management, and collaboration practices that reduce the time from raw data to trusted business insight. DataOps treats data as a product with SLAs, owners, and continuous improvement loops.
Who should follow this path
- Data engineers who want to add operational maturity
- Analytics engineers building dbt-driven data warehouses
- Data platform engineers supporting self-serve analytics
- DevOps engineers moving into data infrastructure
- Data scientists frustrated by unreliable data pipelines
Prerequisites
- SQL proficiency and data warehouse concepts
- Basic Python or Scala for data scripting
- Familiarity with at least one data pipeline tool (Airflow or Spark)
- Understanding of cloud object storage (S3, GCS)
- Basic understanding of DevOps and CI/CD concepts
The 90-day plan
Daily study recommendation: 2-3 hours/day, six days a week. Consistency beats intensity — block the time in your calendar like a meeting.
Days 1–15: Foundation
- DataOps principles and manifesto
- Data pipeline architecture patterns
- Data warehousing vs data lakehouse concepts
- Medallion architecture (bronze/silver/gold layers)
- Data product thinking and data contracts
Outcome: Design a DataOps architecture with clear pipeline stages and data product ownership.
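To make data contracts and the medallion layers concrete, here is a minimal sketch of a contract check that gates promotion from bronze to silver. The `orders` contract, its column names, and the `promote_to_silver` helper are all hypothetical and exist only for illustration. In practice a contract usually lives in a schema registry or a versioned YAML file rather than in code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    dtype: type
    nullable: bool = False


# Hypothetical contract for an "orders" data product (illustrative only).
ORDERS_CONTRACT = [
    Column("order_id", int),
    Column("customer_id", int),
    Column("amount", float),
    Column("coupon_code", str, nullable=True),
]


def violations(row: dict, contract: list[Column]) -> list[str]:
    """Return a list of human-readable contract violations for one row."""
    problems = []
    for col in contract:
        if col.name not in row:
            problems.append(f"missing column: {col.name}")
            continue
        value = row[col.name]
        if value is None:
            if not col.nullable:
                problems.append(f"null in non-nullable column: {col.name}")
        elif not isinstance(value, col.dtype):
            problems.append(
                f"wrong type for {col.name}: expected {col.dtype.__name__}, "
                f"got {type(value).__name__}"
            )
    return problems


def promote_to_silver(bronze_rows: list[dict], contract: list[Column]):
    """Split bronze rows into contract-compliant rows and quarantined rows."""
    silver, quarantined = [], []
    for row in bronze_rows:
        issues = violations(row, contract)
        if issues:
            quarantined.append((row, issues))
        else:
            silver.append(row)
    return silver, quarantined
```

Quarantining bad rows instead of failing the whole load is one design choice among several. A stricter pipeline might stop at the first violation so that the owner of the data product is alerted immediately.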
Days 16–30: Core concepts
- Apache Airflow: DAG authoring and scheduling
- dbt (data build tool) fundamentals
- Data quality with Great Expectations and dbt tests
- Pipeline testing: unit, integration, and contract tests
- CI/CD for data pipelines with GitHub Actions
Outcome: Build a tested, CI/CD-enabled data pipeline using Airflow and dbt.
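As a sketch of what a unit test for a pipeline step can look like, the example below tests a small deduplication transform and then applies two quality checks that mirror the intent of dbt's built-in `unique` and `not_null` tests. The function names and sample rows are hypothetical. A real project would more likely run such tests with pytest inside a GitHub Actions job, and run dbt's own tests against the warehouse.

```python
from datetime import date


def dedupe_latest(rows: list[dict], key: str, version: str) -> list[dict]:
    """Keep only the row with the highest `version` value for each `key`."""
    latest = {}
    for row in rows:
        k = row[key]
        if k not in latest or row[version] > latest[k][version]:
            latest[k] = row
    return list(latest.values())


def check_unique(rows: list[dict], column: str) -> bool:
    """True when no value in `column` appears more than once."""
    values = [r[column] for r in rows]
    return len(values) == len(set(values))


def check_not_null(rows: list[dict], column: str) -> bool:
    """True when every row has a non-null value in `column`."""
    return all(r.get(column) is not None for r in rows)


def test_dedupe_latest_keeps_newest_row():
    rows = [
        {"id": 1, "updated_at": date(2024, 1, 1), "status": "new"},
        {"id": 1, "updated_at": date(2024, 1, 3), "status": "paid"},
        {"id": 2, "updated_at": date(2024, 1, 2), "status": "new"},
    ]
    result = dedupe_latest(rows, key="id", version="updated_at")
    assert check_unique(result, "id")
    assert check_not_null(result, "status")
    assert {r["id"]: r["status"] for r in result} == {1: "paid", 2: "new"}


if __name__ == "__main__":
    test_dedupe_latest_keeps_newest_row()
    print("all tests passed")
```

Keeping the transform a pure function of its input rows is what makes it testable without a scheduler or a warehouse connection.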
Days 31–45: Tools and workflows
- Data observability with Monte Carlo or Soda
- Data lineage with Apache Atlas or OpenMetadata
- Metadata management and data catalogs
- Schema evolution and backward compatibility
- Alerting on data freshness, volume, and distribution anomalies
Outcome: Implement data observability across a pipeline with automated anomaly detection and lineage tracking.
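The freshness and volume alerts listed above can be approximated with very little code. The sketch below flags a table as stale when its last load is older than an allowed age, and flags a row count as anomalous when it sits more than a chosen number of standard deviations from recent history. The threshold of 3.0 and the function names are assumptions for illustration. Commercial observability tools use richer models that account for seasonality.

```python
from datetime import datetime, timedelta
from statistics import mean, stdev


def is_stale(last_loaded_at: datetime, now: datetime, max_age: timedelta) -> bool:
    """True when the most recent load is older than the allowed age."""
    return now - last_loaded_at > max_age


def volume_zscore(history: list[int], today: int) -> float:
    """Z-score of today's row count against recent daily counts.

    `history` needs at least two values, because the sample standard
    deviation is undefined for fewer.
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any deviation at all is unusual.
        return 0.0 if today == mu else float("inf")
    return (today - mu) / sigma


def is_volume_anomaly(history: list[int], today: int, threshold: float = 3.0) -> bool:
    """True when today's count deviates from history by more than `threshold` sigmas."""
    return abs(volume_zscore(history, today)) > threshold
```

A usage example: with daily counts of `[1000, 1020, 980, 1010, 990]`, a load of 400 rows is flagged while a load of 1005 rows is not.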
Days 46–60: Hands-on projects
- Delta Lake / Apache Iceberg table formats
- Change data capture (CDC) with Debezium
- Streaming data pipelines with Apache Kafka
- Data versioning with DVC
- Data access control and column-level security
Outcome: Implement streaming ingestion with CDC and versioned data lake storage.
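To illustrate what consuming CDC events involves, here is a minimal sketch that replays Debezium-style change events into an in-memory table keyed by primary key. It assumes each event has already been unwrapped to the fields `op`, `before`, and `after`, that the primary key column is named `id`, and that delete events carry at least the key in `before`. Real events also include source metadata and may be wrapped in a schema/payload envelope depending on converter settings. A production sink would write to a table format such as Delta Lake rather than a dict.

```python
def apply_change(table: dict, event: dict, key: str = "id") -> None:
    """Apply one change event to `table`, which maps primary key -> row.

    Op codes follow Debezium's convention:
    "c" = create, "u" = update, "d" = delete, "r" = snapshot read.
    """
    op = event["op"]
    if op in ("c", "r", "u"):
        row = event["after"]
        table[row[key]] = row  # upsert: insert new rows, overwrite existing ones
    elif op == "d":
        table.pop(event["before"][key], None)  # ignore deletes for unknown keys
    else:
        raise ValueError(f"unsupported op: {op!r}")


def replay(events: list[dict], key: str = "id") -> dict:
    """Build the current table state by applying events in order."""
    table = {}
    for event in events:
        apply_change(table, event, key)
    return table
```

Because inserts and updates are both handled as upserts, replaying the same events twice yields the same state, which helps when a consumer restarts and re-reads part of a Kafka topic.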
Days 61–75: Advanced practices
- Data SLA definition and monitoring
- Incident response for data pipeline failures
- Cost optimization for cloud data warehouses
- Data mesh architecture principles
- DataOps team topologies and collaboration patterns
Outcome: Define and enforce data SLAs and design a data mesh architecture for a multi-domain organization.
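One way to make a data SLA enforceable is to express it as a deadline plus a target attainment rate and then measure against it. The sketch below assumes one load per day, timestamps in a single time zone, and a hypothetical `FreshnessSla` structure. The dataset name and the 95% target in the usage note are illustrative, not recommendations.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class FreshnessSla:
    dataset: str
    deadline: time   # data must land by this time of day
    target: float    # required fraction of on-time days, e.g. 0.95


def attainment(sla: FreshnessSla, landed_at: list[datetime]) -> float:
    """Fraction of loads that landed at or before the daily deadline.

    Assumes `landed_at` holds exactly one timestamp per day and is not empty.
    """
    on_time = sum(1 for ts in landed_at if ts.time() <= sla.deadline)
    return on_time / len(landed_at)


def is_breached(sla: FreshnessSla, landed_at: list[datetime]) -> bool:
    """True when measured attainment falls below the SLA target."""
    return attainment(sla, landed_at) < sla.target
```

For example, `FreshnessSla("orders_gold", time(6, 0), 0.95)` with three on-time loads out of four gives an attainment of 0.75, which counts as a breach and would trigger the incident response process.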
Days 76–90: Portfolio, interview & certification prep
- DataOps portfolio project: end-to-end data product
- Preparing for dbt Analytics Engineering certification
- DataOps interview questions
- Metrics: pipeline reliability, data freshness, quality scores
- Emerging topics: lakehouse governance and data contracts
Outcome: Ship a portfolio DataOps project with full observability and be ready for data engineering interviews.
Phase outcomes at a glance
| Phase | Outcome |
|---|---|
| Days 1–15 | Design a DataOps architecture with clear pipeline stages and data product ownership. |
| Days 16–30 | Build a tested, CI/CD-enabled data pipeline using Airflow and dbt. |
| Days 31–45 | Implement data observability across a pipeline with automated anomaly detection and lineage tracking. |
| Days 46–60 | Implement streaming ingestion with CDC and versioned data lake storage. |
| Days 61–75 | Define and enforce data SLAs and design a data mesh architecture for a multi-domain organization. |
| Days 76–90 | Ship a portfolio DataOps project with full observability and be ready for data engineering interviews. |
Tools to learn
- Apache Airflow
- dbt (data build tool)
- Great Expectations
- Apache Kafka
- Debezium
- Delta Lake
- OpenMetadata
- Monte Carlo
- Soda
- Snowflake
- Apache Spark
- DVC
Labs to practice
Mini projects
- Build an end-to-end dbt + Airflow pipeline with Great Expectations data quality gates and OpenMetadata lineage
- Implement a CDC pipeline using Debezium + Kafka to stream Postgres changes into a Delta Lake lakehouse
- Create a data observability dashboard with Monte Carlo tracking freshness, volume, and schema changes
Interview questions to prepare
- What are the differences between a data warehouse, a data lake, and a data lakehouse?
- How do you implement data quality checks in a production pipeline?
- Explain the medallion architecture and when you would use each layer.
- What is change data capture and how does Debezium work?
- How do you handle schema evolution without breaking downstream consumers?
- What is a data contract and why is it important for DataOps?
- How would you implement CI/CD for a dbt project?
- What metrics would you use to measure data pipeline health?
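For the schema evolution question, it can help to show how a breaking change might be detected automatically in CI. The sketch below compares two schema versions, each represented as a mapping from column name to type name, and reports removals and type changes as breaking. It treats added columns as compatible, which assumes that consumers select columns by name. The representation and the rules are simplified assumptions, and real compatibility checkers also consider nullability and type widening.

```python
def breaking_changes(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List the changes from `old` to `new` that would break existing readers."""
    problems = []
    for column, dtype in old.items():
        if column not in new:
            problems.append(f"removed column: {column}")
        elif new[column] != dtype:
            problems.append(f"type changed for {column}: {dtype} -> {new[column]}")
    return problems


def is_backward_compatible(old: dict[str, str], new: dict[str, str]) -> bool:
    """True when every column readers rely on still exists with the same type."""
    return not breaking_changes(old, new)
```

A CI job could run this check against the schema on the main branch and fail the pull request when the list is non-empty, which forces a conversation with downstream consumers before the change ships.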
Certification suggestions
- dbt Analytics Engineering Certification — dbt Labs
- Databricks Certified Data Engineer Associate — Databricks
- Snowflake SnowPro Core Certification — Snowflake
- Google Professional Data Engineer — Google Cloud
Browse the full certification registry for exam details and official links.
Free resources
- dbt Learn — Free Courses
- Apache Airflow Documentation
- Great Expectations Documentation
- DataOps Manifesto
- Databricks Delta Lake Documentation
Related tool categories
- DataOps Tools
- Workflow Orchestration Tools
- Monitoring Tools
- Database DevOps Tools
- Observability Tools