
Format: Live online sessions on Google Meet or MS Teams, with hands-on coding, mini-projects, and a capstone project, led by an industry expert.
Target Audience: College students; professionals in Finance, HR, Marketing, and Operations; analysts; and entrepreneurs
Tools Required: Laptop with an internet connection
Trainer: Industry professional with hands-on expertise
Live Online Docker Course for Data Engineering
Week 1: Introduction to Containers & Docker Basics (Beginner)
Sessions: 2 × 3–4 hours
- Introduction to Containerization
- Virtualization vs. containerization
- Benefits for data engineering pipelines
- Docker Architecture Overview
- Installing Docker
- Docker Desktop / Docker Engine
- Basic commands:
docker run, docker ps, docker stop, docker rm
- Running Your First Container
- Hands-on Lab:
- Run multiple containers, inspect logs, and clean up containers
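The Week 1 lab above can be sketched as a short terminal session (this assumes a local Docker daemon; `nginx:alpine` is just a convenient public image, not one mandated by the course):

```shell
# Start a container in the background and give it a name
docker run -d --name web nginx:alpine

# List running containers
docker ps

# Inspect the container's logs
docker logs web

# Stop the container, then remove it to clean up
docker stop web
docker rm web
```

Running `docker ps` again after the cleanup should show an empty list, confirming the container was removed.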
Week 2: Docker Images, Dockerfile & Basic Pipelines (Beginner → Intermediate)
Sessions: 2 × 3–4 hours
- Docker Images Basics
- Pull, tag, inspect, remove images
- Docker Hub and public/private images
- Dockerfile Fundamentals
- Commands: FROM, RUN, COPY, CMD, EXPOSE
- Build reproducible environments
- Building Custom Images
- Image Optimization
- Layering, caching, reducing image size
- Hands-on Lab:
- Build a custom Python ETL image
- Run data ingestion script inside container
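A minimal Dockerfile for the Python ETL lab above might look like the following sketch (the script name `ingest.py` and a `requirements.txt` file are illustrative assumptions, not fixed by the course):

```dockerfile
# Small official Python base image keeps the final image lean
FROM python:3.12-slim

WORKDIR /app

# Copy and install dependencies first, so this layer is cached
# when only the script changes (layer-caching optimization)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the ingestion script last
COPY ingest.py .

# Run the ETL script when the container starts
CMD ["python", "ingest.py"]
```

Build and run it with `docker build -t etl-demo .` followed by `docker run etl-demo`. Ordering the `COPY` of dependencies before the script is what makes rebuilds fast: editing `ingest.py` invalidates only the last two layers.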
Week 3: Networking, Volumes & Docker Compose (Intermediate)
Sessions: 2 × 3–4 hours
- Docker Networking Basics
- Bridge, Host, None networks
- Container-to-container communication
- Persistent Storage
- Volumes vs. bind mounts
- Sharing and persisting data across containers
- Docker Compose Fundamentals
- Multi-container orchestration with docker-compose.yml
- Environment variables & secrets management
- Data Engineering Pipelines with Compose
- Example: Kafka → Spark → PostgreSQL
- Scaling services
- Hands-on Lab:
- Deploy a mini pipeline using Docker Compose
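A minimal `docker-compose.yml` for a mini pipeline like the lab above could wire an ETL container to PostgreSQL with a named volume; service names, credentials, and the image built in Week 2 are placeholders for illustration:

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: etl
      POSTGRES_PASSWORD: example   # use secrets in real deployments
      POSTGRES_DB: warehouse
    volumes:
      - pgdata:/var/lib/postgresql/data   # named volume persists data

  etl:
    build: .                # the custom ETL image from Week 2
    depends_on:
      - db
    environment:
      DATABASE_URL: postgres://etl:example@db:5432/warehouse

volumes:
  pgdata:
```

Containers on the default Compose network reach each other by service name, so the ETL container connects to `db:5432` rather than an IP address. Bring the stack up with `docker compose up -d`, and scale a service with `docker compose up -d --scale etl=3`.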
Week 4: Logging, Monitoring, Security & Private Registries (Intermediate → Advanced)
Sessions: 2 × 3–4 hours
- Container Logging
- Log drivers, logging best practices
- Collecting logs for ETL processes
- Monitoring Containers
- Introduction to Prometheus and Grafana
- Monitoring resource usage of containers
- Security Best Practices
- Secure images, scan vulnerabilities
- User permissions, secrets, and environment management
- Private Registries
- Push/pull images to AWS ECR, Azure ACR, Docker Hub private
- Hands-on Lab:
- Secure and monitor Spark + PostgreSQL container setup
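One of the security practices above, running containers as a non-root user, can be sketched as a Dockerfile fragment (the `etl` user/group and `ingest.py` script are illustrative):

```dockerfile
FROM python:3.12-slim

# Create an unprivileged system user and group for the ETL process
RUN groupadd --system etl && useradd --system --gid etl etl

WORKDIR /app
COPY --chown=etl:etl ingest.py .

# Drop root privileges before the container starts
USER etl

CMD ["python", "ingest.py"]
```

For the private-registry step, the image is retagged with the registry's hostname before pushing, e.g. `docker tag etl-demo <account-id>.dkr.ecr.<region>.amazonaws.com/etl-demo:latest` for AWS ECR (the account ID and region here are placeholders).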
Week 5: CI/CD, Kubernetes Intro & Capstone Project (Advanced)
Sessions: 2 × 3–4 hours
- Docker in CI/CD Pipelines
- Integrate Docker with Jenkins, GitHub Actions, Airflow
- Introduction to Kubernetes for Data Engineers
- Pods, Deployments, Scaling containers
- When to move from Docker Compose to Kubernetes
- Capstone Project: Containerized ETL Pipeline
- Airflow + Spark + PostgreSQL + MinIO
- Multi-stage deployment using Docker images
- Project Review & Presentations
- Peer review and instructor feedback
- Best practices recap, Q&A
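As one example of the CI/CD integration covered in this week, a minimal GitHub Actions workflow could build an image and push it to Docker Hub on every push to `main`; the repository name `myorg/etl-demo` and the secret names are assumptions:

```yaml
name: build-and-push

on:
  push:
    branches: [main]

jobs:
  docker:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Log in to Docker Hub using repository secrets
      - uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}

      # Build the image from the repo's Dockerfile and push it
      - uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: myorg/etl-demo:latest
```

The same pattern extends to Jenkins or Airflow: each pipeline run builds a tagged image, pushes it to a registry, and downstream jobs pull that exact tag for reproducible runs.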
Key Learning Outcomes After 5 Weeks
- Master Docker architecture, containers, images, and Dockerfiles.
- Build and manage multi-container data pipelines using Docker Compose.
- Implement persistent storage, networking, logging, and monitoring.
- Apply container security best practices.
- Integrate Docker with CI/CD pipelines.
- Gain a foundational understanding of Kubernetes for scaling data workflows.
- Deploy a real-world containerized data engineering pipeline as a capstone project.