IDEAS, Indian Statistical Institute Kolkata
Certificate programme recommended for
Final Year Students and Freshers:
Freshers and individuals pursuing a Bachelor's or Master's degree who aspire to a career in Data Engineering, and individuals with basic programming knowledge looking to build a career in the IT industry
Working Professionals:
Junior and mid-career professionals looking for accelerated career growth and a salary hike
Prerequisites:
Basic knowledge of Programming and RDBMS (Relational Database Management System)
About the programme
This online certificate programme has been created by IDEAS - Technology Innovation Hub (TIH) of ISI Kolkata in collaboration with TCS iON. IDEAS - Technology Innovation Hub was established by the Indian Statistical Institute, Kolkata, under the aegis of the National Mission on Interdisciplinary Cyber-Physical Systems (NM-ICPS), Department of Science and Technology (DST), Government of India (Bharat).
Applied Agentic AI for Modern Data Engineering bridges the transformative power of Artificial Intelligence with the dynamic world of Data Engineering. In this era of autonomous and intelligent systems, Agentic AI represents the next frontier, where AI agents independently reason, plan, and act across complex data ecosystems.
This programme provides a strong foundation in modern Data Engineering principles while integrating Agentic AI approaches that automate and optimise data workflows, pipelines, and decision-making processes. Learners will explore how Data Engineering fuels intelligent systems, from data ingestion and transformation to the deployment and orchestration of AI-driven data platforms.
Key highlights
50 hours of overall experiential learning
Live Lectures from TCS Experts and IDEAS, ISI Kolkata faculty
Industry projects and use cases
Live doubt clearing sessions by the experts
The participants are expected to have developed the following skills by the end of the programme:
- Understand and apply the full Data Engineering lifecycle, including data ingestion, storage, processing, orchestration, and management, using real-world business cases, and learn to run the end-to-end process with AI Agents
- Gain proficiency in modern data storage technologies, including relational and NoSQL databases, data warehouses, distributed file systems, and lakehouse architectures, and work with an open data lakehouse using Apache Iceberg, Presto, and more
- Be industry-ready with hands-on experience in DataOps, containerization (Docker, Kubernetes), and orchestration tools (Airflow, Prefect, Dagster) for managing distributed data workflows. Learn to use AI Agents to carry out and automate these tasks
- Master data ingestion technologies and pipeline management, including batch and real-time ingestion using tools like Kafka, Flume, Airbyte, and Fivetran, with a focus on data quality and security
- Learn the fundamentals of Machine Learning automation using MLOps platforms and pipelines, including SparkML, MLflow, and Kubeflow, with applications in LLM and RAG systems
- Leverage Agentic AI to automate and optimise Data Engineering tasks, enabling intelligent, self-correcting data systems and proactive data strategy development
Learning Outcomes
Develop hands-on expertise in:
n8n and Microsoft AutoGen
Code Interpreter, Databricks, Dataiku
LangChain, Matillion, Python
Talend, Trifacta, dbt
Airbyte, Fivetran, MongoDB
MySQL, PostgreSQL
AWS EC2, AWS S3, Delta Lake
HDFS, Snowflake
Airbyte, Apache Iceberg
Fivetran, Presto, Snowflake
CDC, Kafka
Airflow, HDFS, S3, Spark
Airflow, Dagster, Prefect
Dataiku, Dify
Kubeflow, MLOps
MLflow, RAG
RAGFlow, SparkML
AgentOps
The field generates a variety of job roles, including but not limited to:
Data Intelligence Engineer
Agentic Data Engineer
AI Data Architect
Agentic MLOps Engineer
Business Intelligence Engineer
Agentic DataOps Engineer
Programme pedagogy
Expert-led live sessions
Engage in dynamic training sessions conducted by distinguished faculty from IDEAS, ISI Kolkata along with seasoned industry professionals.
Recorded session videos
Access educational content on-demand, available for learning on any device, at any time, ensuring flexibility and convenience for revisiting the material.
Industry use cases and simulations
Build your understanding of complex business environments through simulations and case studies that mirror them, providing deep, experiential learning.
Peer networking and expert connect
Expand your professional network and engage with experts through our interactive community platforms, facilitating enhanced learning and problem-solving.
Hands-on learning experiences
Participate in practical sessions where you will work with real-world data and public datasets, fostering a deep, experiential understanding of the subject.
Live doubt solving sessions
Address your queries in real-time with direct access to experts during live sessions, ensuring clarity and immediate assistance.
Dedicated learning management team
Receive continual guidance and support from our committed learning management team, tailored to meet your educational needs and enhance your learning journey.
Programme Syllabus Overview
This comprehensive syllabus, covering the latest industry practices and techniques in Data Engineering, is structured into 11 modules spanning 12 weeks. The course guides learners through the concepts and the latest industry-standard tools and technologies, with hands-on implementation on case studies and scenarios. Each module builds on the previous one, ensuring a cohesive and thorough understanding from basic to advanced topics.
- Introduction to Generative AI and LLMs
- What is Agentic AI?
- Agentic AI Use Cases
- Introduction to Data Engineering, Understand Data Engineering Lifecycle
- Data Engineering Tools/Technologies/Use Cases
- Python Programming Overview for Data Engineering
- Using Agentic AI as a Tool for Data Engineering
- Understand ETL vs ELT
- Basic SQL Transformations and Database Concepts
- Relational Databases/Object Relational Databases and their Implementation
- AI-assisted ETL Script Generation
- Data Warehouses: Concepts, Star and Snowflake Schemas, OLAP, Data Marts, OLAP vs OLTP
- Distributed File Systems: Hadoop HDFS, Google Cloud Storage, Amazon S3. Implementation of Hadoop Cluster and AWS S3
- AI-assisted Schema Design
- Working with open Data Lakehouse with Presto and Apache Iceberg. Scalable query handling using Presto
- Ingest Data from Multiple Sources
- Automate Ingestion Scripts Using AI
- Stream Processing Fundamentals and Pipelines Using Apache Kafka
- Real-Time Data Ingestion with CDC
- AI-guided Kafka Pipeline Creation
- Essentials of Apache Spark Programming
- Working with Spark SQL
- Working with DAGs Using Apache Airflow
- Ingesting Data Using Apache Spark from Multiple Storage Types and Formats
- Role of Agentic AI in Data Pipelines
- Use an AI Agent to Draft a Simple ETL Pipeline
- Working with SparkML
- Working with MLOps-enabled RAG Pipeline
- The AgenticOps Components - Development, Integration, Deployment, Operations, Feedback/Evaluation, Governance
- The AgenticOps Lifecycle
- End-to-End RAG with Data Lakehouse Integration
- Streaming RAG for Real-Time Chatbots
- LLM + RAG Data Lineage Tracking System
- Bias and Hallucination Reduction in RAG via Data Engineering
- AI Agents in ETL
- Agentic RAG
- Evaluating Data Freshness in RAG Pipelines
This syllabus is designed not only to impart essential knowledge of advanced Data Engineering with Agentic AI concepts, but also to enable learners to apply it through case studies, simulations, and hands-on projects, preparing participants for high-demand job roles.
Programme Structure
The programme covers the latest industry practices and techniques on Data Engineering with Agentic AI. The programme is structured into 11 modules, including one project module, spanning 12 weeks.
Digital Certificate
Learners will be awarded a co-branded digital certificate upon successful completion of the programme.
FAQs
Please take a look at the most frequently asked questions; you might have your query answered here.
- Click on the "Buy Now" button.
- Login with your TCS iON Digital Learning Hub credentials or sign up as a new user.
- After login/sign up, you will be asked to share the details required to complete your purchase, including your name, email ID, phone number and other details.
- After successfully submitting the form, proceed to payment by clicking on "Click to Pay".
The course has been designed for both engineering students and working professionals, with an emphasis on hands-on experience and practical exposure to cutting-edge tools and technologies. Through live lectures by industry and academic experts and hands-on work with tools used in the industry, learners will gain a comprehensive understanding of the essential building blocks and working principles of Data Engineering technologies.