Skill Level: Beginner

Master of Computer Application (MCA) Part I Semester II

Course Code: MMT-204
Title of Course: Data Engineering

Internal Marks: 20
External Marks: 30
Theory: 02 hours/week
Course Outcomes: Upon successful completion of this course, the student will be able to:
1. Describe the data storage systems and technologies commonly used in
data engineering.
2. Design and implement databases for efficient data storage and
retrieval.
3. Optimize data storage and access patterns for performance.
4. Identify data security and privacy considerations in data engineering.
5. Use cloud-based storage and database services.
UNIT I (15 Hours)
Data Pipeline and Data Flow: the flow of data through the stages of processing,
transformation, and storage within a data engineering ecosystem.
Data Storage and Retrieval: data storage technologies, such as relational (SQL)
databases and NoSQL databases.
Data Processing and Transformation: data processing frameworks and tools, such as
Apache Spark, for large-scale data processing and transformation tasks.
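Illustrative example for Unit I: the stages named above (processing, transformation, storage, retrieval) can be sketched as a very small pipeline. This is a minimal sketch for orientation only, not part of the prescribed syllabus. It uses Python's standard csv and sqlite3 modules in place of a large-scale tool such as Apache Spark, and the sample data, the table name results, and the function names are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw input standing in for an export from a source system.
RAW_CSV = """id,name,marks
1,Asha,42
2,Ravi,
3,Meena,37
"""


def extract(text):
    """Extract stage: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform stage: drop rows with missing marks and convert types."""
    return [
        (int(r["id"]), r["name"].strip(), int(r["marks"]))
        for r in rows
        if r["marks"].strip()
    ]


def load(records, conn):
    """Load stage: store the cleaned records in a relational (SQL) table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results "
        "(id INTEGER PRIMARY KEY, name TEXT, marks INTEGER)"
    )
    conn.executemany("INSERT INTO results VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline(text, conn):
    """Run extract -> transform -> load, then retrieve the stored rows."""
    load(transform(extract(text)), conn)
    query = "SELECT name, marks FROM results ORDER BY marks DESC"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    # An in-memory database keeps the sketch self-contained.
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(RAW_CSV, connection))
```

Each function corresponds to one stage of the data flow, so a stage can be swapped out (for example, a different storage technology in load) without changing the others.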
UNIT II (15 Hours)
Data Integration and ETL (Extract, Transform, Load).
Data Warehousing.
Big Data Technologies: Hadoop ecosystem components (e.g., HDFS, MapReduce) and Apache
Kafka for handling real-time data streams.
Cloud Computing for Data Engineering: the benefits of cloud-based data engineering, and
working with cloud data storage and processing platforms such as AWS, Azure, and
Google Cloud Platform.
References:
1. "Data Engineering" by Pramod J. Sadalage and Martin Fowler
2. "Data Engineering Cookbook" by Andreas Kretz
