IDEAS, Indian Statistical Institute Kolkata
Certificate programme recommended for
Final Year Students and Freshers:
Freshers and individuals pursuing a Bachelor's or Master's degree who aspire to a career in Data Engineering, and individuals with basic programming knowledge looking to build their career in the IT industry
Working Professionals:
Junior and mid-career professionals looking for accelerated career growth and a salary hike
Prerequisites:
Basic knowledge of Programming and RDBMS (Relational Database Management System)
About the programme
This online certificate programme has been created by IDEAS - Technology Innovation Hub (TIH) of ISI Kolkata in collaboration with TCS iON. IDEAS - Technology Innovation Hub was established by the Indian Statistical Institute, Kolkata, under the aegis of the National Mission on Interdisciplinary Cyber-Physical Systems (NM-ICPS), Department of Science and Technology (DST), Government of India (Bharat).
Applied Agentic AI for Modern Data Engineering bridges the transformative power of Artificial Intelligence with the dynamic world of Data Engineering. In this era of autonomous and intelligent systems, Agentic AI represents the next frontier, where AI agents independently reason, plan, and act across complex data ecosystems.
This programme provides a strong foundation in modern Data Engineering principles while integrating Agentic AI approaches that automate and optimise data workflows, pipelines, and decision-making processes. Learners will explore how Data Engineering fuels intelligent systems, from data ingestion and transformation to deployment and orchestration of AI-driven data platforms.
Key highlights
50 hours of overall experiential learning
Live Lectures from TCS Experts and IDEAS, ISI Kolkata faculty
Industry projects and use cases
Live doubt clearing sessions by the experts
The participants are expected to have developed the following skills by the end of the programme:
- Understand and apply the full lifecycle of Data Engineering, including data ingestion, storage, processing, orchestration, and management, using real-world business cases, and learn to automate the end-to-end process with AI Agents
- Gain proficiency in modern data storage technologies, including relational and NoSQL databases, data warehouses, distributed file systems, and lakehouse architectures, such as open data lakehouses built with Apache Iceberg, Presto, and more
- Be industry-ready with hands-on experience in DataOps, containerization (Docker, Kubernetes), and orchestration tools (Airflow, Prefect, Dagster) for managing distributed data workflows, and learn to use AI Agents to automate these tasks
- Master data ingestion technologies and pipeline management, including batch and real-time ingestion using tools like Kafka, Flume, Airbyte, and Fivetran, with a focus on data quality and security
- Learn the fundamentals of Machine Learning automation using MLOps platforms and pipelines, including SparkML, MLflow, and Kubeflow, with applications in LLM and RAG systems
- Leverage Agentic AI to automate and optimise Data Engineering tasks, enabling intelligent, self-correcting data systems and proactive data strategy development
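As a flavour of the pipeline skills listed above, here is a minimal sketch of a batch extract-transform-load step in Python. All data, file contents, and table names are purely illustrative and not part of the programme material; real pipelines would read from files, APIs, or streaming sources.

```python
import csv
import io
import sqlite3

# Illustrative source data; in practice this would come from files, APIs, or Kafka.
RAW_CSV = """order_id,customer,amount
1,alice,120.50
2,bob,75.00
3,carol,310.25
"""

def extract(raw: str) -> list[dict]:
    """Extract: parse raw CSV into Python dictionaries."""
    return list(csv.DictReader(io.StringIO(raw)))

def transform(rows: list[dict]) -> list[tuple]:
    """Transform: normalise customer names and keep orders above a threshold."""
    return [
        (int(r["order_id"]), r["customer"].title(), float(r["amount"]))
        for r in rows
        if float(r["amount"]) >= 100.0
    ]

def load(rows: list[tuple]) -> sqlite3.Connection:
    """Load: write transformed rows into a relational table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return conn

conn = load(transform(extract(RAW_CSV)))
print(conn.execute("SELECT customer, amount FROM orders ORDER BY order_id").fetchall())
# → [('Alice', 120.5), ('Carol', 310.25)]
```

The same extract/transform/load shape scales up to the distributed tools covered in the programme, where Spark or Airflow replaces the plain function calls.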
Learning Outcomes
Develop hands-on expertise in:
Python and SQL (Through refresher sessions)
Code Interpreter
LangChain
Airbyte, Apache Iceberg
MLflow, RAG
HDFS, Snowflake
Cloud (AWS), Data Lake
CDC, Kafka
RAGFlow, SparkML
Kubeflow, MLOps
The field generates a variety of job roles, including but not limited to:
Agentic AI Engineer
Data Scientist: Agentic AI and MLOps
Agentic Data Specialist
Data Engineer (Agentic AI)
Data Pipeline Engineer
Agentic AI, AI and Data Specialist
Programme pedagogy
Expert-led live sessions
Engage in dynamic training sessions conducted by distinguished faculty from IDEAS, ISI Kolkata, along with seasoned industry professionals.
Recorded session videos
Access educational content on-demand, available for learning on any device, at any time, ensuring flexibility and convenience for revisiting the material.
Industry use cases and simulations
Enhance your understanding by participating in simulations and case studies that mirror complex business environments, providing deep, experiential learning.
Peer networking and expert connect
Expand your professional network and engage with experts through our interactive community platforms, facilitating enhanced learning and problem-solving.
Hands-on learning experiences
Participate in practical sessions where you will work with real-world data and public datasets, fostering a deep, experiential understanding of the subject.
Live doubt solving sessions
Address your queries in real-time with direct access to experts during live sessions, ensuring clarity and immediate assistance.
Dedicated learning management team
Receive continual guidance and support from our committed learning management team, tailored to meet your educational needs and enhance your learning journey.
Programme Syllabus Overview
This comprehensive syllabus, covering the latest industry practices and techniques in Data Engineering, is structured into 11 modules spanning 12 weeks. The course guides learners through the concepts and the latest industry-standard tools and technologies, with extensive hands-on implementation of case studies and scenarios. Each module builds on the previous one, ensuring a cohesive progression from basic to advanced topics.
- Introduction to Generative AI and LLMs
- What is Agentic AI?
- Agentic AI Use Cases
- Introduction to Data Engineering, Understand Data Engineering Lifecycle
- Data Engineering Tools/Technologies/Use Cases
- Python Programming Overview for Data Engineering
- Using Agentic AI as a Tool for Data Engineering
- Understand ETL vs ELT
- Basic SQL Transformations and Database Concepts
- Relational Databases/Object Relational Databases and its Implementation
- AI-assisted ETL Script Generation
- Data Warehouses: Concepts, Star and Snowflake Schemas, OLAP, Data Marts, OLAP vs OLTP
- Distributed File Systems: Hadoop HDFS, Google Cloud Storage, Amazon S3. Implementation of Hadoop Cluster and AWS S3
- AI-assisted Schema Design
- Working with open Data Lakehouse with Presto and Apache Iceberg. Scalable query handling using Presto
- Ingest Data from Multiple Sources
- Automate Ingestion Scripts Using AI
- Stream Processing Fundamentals and Pipelines Using Apache Kafka
- Real-Time Data Ingestion with CDC
- AI-guided Kafka Pipeline Creation
- Essentials of Apache Spark Programming
- Working with Spark SQL
- Working with DAGs Using Apache Airflow
- Ingesting Data Using Apache Spark from Multiple Storage Types and Formats
- Role of Agentic AI in Data Pipelines
- Use AI Agent to Draft a Simple ETL pipeline
- Working with SparkML
- Working with MLOps-enabled RAG Pipeline
- The AgenticOps Components - Development, Integration, Deployment, Operations, Feedback/Evaluation, Governance
- The AgenticOps Lifecycle
- End-to-End RAG with Data Lakehouse Integration
- Streaming RAG for Real-Time Chatbots
- LLM + RAG Data Lineage Tracking System
- Bias and Hallucination Reduction in RAG via Data Engineering
- AI Agents in ETL
- Agentic RAG
- Evaluating Data Freshness in RAG Pipelines
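Several syllabus items above (role of Agentic AI in data pipelines, using an AI agent to draft an ETL pipeline, AI agents in ETL) revolve around a plan-act-observe loop. The following toy sketch shows only that control flow: the planner is a hard-coded stub standing in for an LLM call (for example via LangChain), and all step names and data are hypothetical, not programme material.

```python
# A toy "plan -> act -> observe" agent loop for a data task.
# The planner is a stub standing in for an LLM; a real agent would
# generate and revise these steps dynamically based on observations.

def stub_planner(goal: str) -> list[str]:
    """Stand-in for an LLM planner: returns a fixed plan for the demo goal."""
    return ["extract", "validate", "load"]

# Each "tool" takes the pipeline state and returns an updated state.
TOOLS = {
    "extract": lambda state: {**state, "rows": [{"id": 1, "ok": True}, {"id": 2, "ok": False}]},
    "validate": lambda state: {**state, "rows": [r for r in state["rows"] if r["ok"]]},
    "load": lambda state: {**state, "loaded": len(state["rows"])},
}

def run_agent(goal: str) -> dict:
    """Execute the planned steps, observing state after each action."""
    state: dict = {"goal": goal}
    for step in stub_planner(goal):
        state = TOOLS[step](state)                    # act
        print(f"after {step}: keys={sorted(state)}")  # observe
    return state

final = run_agent("ingest and clean the orders feed")
print(final["loaded"])  # → 1
```

In the programme, frameworks such as LangChain supply the planner and tool-calling machinery that this stub only gestures at.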
This syllabus is designed not only to impart essential knowledge and hands-on implementation of advanced Data Engineering with Agentic AI concepts, but also to enable learners to apply them through case studies, simulations, and hands-on projects, making participants industry-ready for high-demand roles.
Programme Structure
The programme covers the latest industry practices and techniques on Data Engineering with Agentic AI. The programme is structured into 11 modules, including one project module, spanning 12 weeks. Each module will contain a Quiz and an Assignment. The participants will be graded on their performance in Quizzes, Assignments and Project. The Quizzes will have an aggregate weightage of 40%, Assignments will have an aggregate weightage of 30% and the Project will have a weightage of 30%.
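The stated weightages (Quizzes 40%, Assignments 30%, Project 30%) amount to a weighted average; a minimal sketch, with illustrative scores that are not actual grading data:

```python
# Aggregate weightages from the programme structure:
# Quizzes 40%, Assignments 30%, Project 30%.
WEIGHTS = {"quizzes": 40, "assignments": 30, "project": 30}

def final_score(scores: dict[str, float]) -> float:
    """Weighted average of component scores, each out of 100."""
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS) / 100

# Illustrative component scores, not real grading data.
print(final_score({"quizzes": 80, "assignments": 90, "project": 70}))  # → 80.0
```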
Digital Certificate
Learners will be awarded a co-branded digital certificate upon successful completion of the programme.
FAQs
Please take a look at the most frequently asked questions; your query may already be answered here.
- Click on the "Buy Now" button.
- Login with your TCS iON Digital Learning Hub credentials, or sign up as a new user.
- Proceed to make the payment by clicking on "Click to Pay".
- You will receive a purchase confirmation message on your registered email ID/mobile number.
This programme builds strong learner proficiency by offering hands-on training in modern data engineering while integrating cutting-edge Agentic AI concepts that automate and optimise real-world data workflows. It equips learners with practical, industry-ready skills through live sessions, real-world case studies and guided practice using current enterprise tools.
Live lectures will be held every Wednesday and Saturday from 7 PM to 9 PM, starting 25th February.