PG Program in Data Engineering
Program Highlights
The Data Engineering program from Praxis is designed to fulfil the knowledge needs of the field and create professionals who become immediately productive for their organizations. The program equips participants with the know-how of existing tools and technologies for Data Management & Data Modelling, and introduces the paradigms of Distributed Systems and Cloud Computing. Participants finally work on a Capstone project that has them migrate data from a legacy system to a big-data platform hosted on the Cloud.
- Basics of Data Engineering
- Scripting Language Requirement: Unix / Linux
- Basic Language Requirement: Python, Java & Scala
- Working Knowledge of Operating Systems
- Deep Database Knowledge – SQL and NoSQL
- Data Warehousing – Hadoop, MapReduce, Hive, HBase, Pig
- Apache Spark, Kafka
- Familiarity with basic Data Mining Methodologies
- Stream Data processing from IoT
Learning Outcomes of the Data Engineering course
- Re-engineer Enterprise Data Architecture without hampering BAU
- Work with relational and NoSQL data models
- Create scalable and efficient data warehouses
- Work efficiently with massive datasets
- Build and interact with a Cloud-based data warehouse
- Automate and monitor data pipelines
- Develop proficiency in Stream Processing using a Cloud Data Lake
- Solve the appropriate use cases using big data technologies
The choice of tools depends on the volume of the data, the speed at which it arrives and its heterogeneity.
Companies that innovate and compete on data, such as Netflix, LinkedIn, Amazon and Google, expect candidates to have strong knowledge of coding, data structures and algorithmic complexity.
Overview
A data engineer is responsible for providing reliable data infrastructure. A Data Engineering group takes end-to-end ownership of the data: its acquisition, storage, permissions, delivery and processing. Data engineers ensure a smooth flow of data between systems and processes.
ETL (Extract, Transform, Load) describes the steps a data engineer follows to build data pipelines. ETL is essentially a blueprint for how collected raw data is processed and transformed into data ready for analysis.
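As a rough illustration of the three steps, the sketch below extracts records from a CSV export, cleans them, and loads them into a target table. The file name, column names and the SQLite target are assumptions made for the example, not part of the program's toolset.

```python
import csv
import sqlite3


def extract(path):
    # Extract: read raw records from a CSV export (hypothetical file name)
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def transform(rows):
    # Transform: clean and reshape records into the analysis-ready schema
    cleaned = []
    for row in rows:
        cleaned.append((
            row["order_id"].strip(),
            row["customer"].strip().title(),
            float(row["amount"]),  # cast text to a numeric type
        ))
    return cleaned


def load(rows, db_path="warehouse.db"):
    # Load: write the transformed rows into the target warehouse table
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id TEXT, customer TEXT, amount REAL)"
    )
    con.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    con.commit()
    con.close()


if __name__ == "__main__":
    load(transform(extract("raw_orders.csv")))
```

In production the same pattern is typically expressed with dedicated tooling (for example an orchestrator scheduling the extract, transform and load tasks), but the blueprint remains the same.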
Data engineers are expected to know a fair bit of programming and be familiar with scripting. An engineering background is desirable, though not mandatory. An aptitude for technology is a necessary requirement to succeed in the job.
The Different Roles in Data Engineering (source: AnalyticsVidhya)
- ETL Engineer
The ETL engineer is responsible for maintaining the veracity of the data in the source and target systems. They ensure that the right tools, permissions and system pipelines are in place for a smooth transfer of data.
- Database Administrator
This role requires extensive knowledge of traditional databases as well as new-age NoSQL and Cloud databases. Database administrators ensure that the data-generating and data-ingesting systems are up and running in a live business scenario.
- Data Engineer
A data engineer lays down the foundation for data management systems to ingest, integrate and maintain all the data sources. The person needs working database knowledge and must also understand the needs of the business and its long-term data scalability requirements. This role requires knowledge of tools such as SQL, XML, Hive, Pig and Spark.
- Enterprise Data Architect
The master of the lot. An architect needs knowledge of database tools, languages like Python, Java and Scala, and distributed systems like Hadoop, among other things. The role combines the tasks of the Database Administrator and the Data Engineer into a single role.
Responding to Industry needs
Diagram adapted from Monica Rogati’s excellent article, ‘The AI Hierarchy of Needs’
Program Coverage
- Comprehensive 550 hours of full-time classroom-based and lab-based training
- Specially designed workshops for Corporate Communication skills and Storytelling with Data
- Agile, DevOps and Design Thinking as part of the learning process
Curriculum
Working with traditional Data (Trim I) | Engineering Platforms for Big Data (Trim II) | Running Enterprise business on Cloud (Trim III) |
---|---|---|
Concepts of Data Modeling | Introduction to Hadoop Ecosystem | Understand how enterprise businesses generate data |
Java Programming | Working with NoSQL Databases | Automate Data Pipelines with Apache Airflow |
Algorithms & Data Structures | Big Data middleware (Sqoop) | Data Stream Analytics with Apache Storm |
Python functional programming (self-paced) | Fundamentals of Scala & Spark | IoT & Sensor Data – Acquisition, Management & Application |
Operating Systems basics (Unix) | Server management with Zookeeper | Building a Data Lake with Apache Spark |
Introduction to RDBMS | Data Pipelines using Apache Kafka | Data Security & Privacy |
SQL programming | Data Visualization using D3JS | Running a Data Lake from the Cloud using Devops |
ETL Concepts | Application orchestration – Docker & Kubernetes | |
Data Warehousing | | |
Capstone Project
The Capstone project is a multifaceted assignment that serves as a culminating academic and intellectual experience for students. It gives students the opportunity to apply their classroom learning to real-world challenges faced by businesses. Examples of capstone projects include migrating a traditional data warehouse (PostgreSQL-based) to a Cloud-hosted big data warehouse, building an orchestrated application on the cloud for managing sensor data, analysing real-time streaming feeds using Kafka & Storm, and real-time visualization using D3JS.
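For illustration only, the sketch below shows the kind of producer code a streaming capstone might start from: it publishes simulated sensor readings to a Kafka topic, which a downstream consumer (such as a Storm topology) would analyse in real time. The broker address, topic name and payload structure are assumptions for the example, and the consuming topology is not shown.

```python
import json
import random
import time

from kafka import KafkaProducer  # pip install kafka-python

# Hypothetical broker address and topic name, used here only for illustration.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Publish a small stream of simulated sensor readings, one per second.
for _ in range(10):
    reading = {"sensor_id": "s-01", "temperature": round(random.uniform(20, 30), 2)}
    producer.send("sensor-readings", value=reading)
    time.sleep(1)

producer.flush()
```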
Program Fee
All figures in Indian Rupees (INR)
Particulars | Academic Fee | GST | Refundable Deposit | Installment |
---|---|---|---|---|
Admission Confirmation | 40000 | 7200 | | 47200 |
First Installment | 130000 | 20800 | 10000 | 160800 |
Second Installment | 130000 | 20800 | | 150800 |
Total Fee | 300000 | 48800 | 10000 | 358800 |
Please note:
*Academic Fee includes Tuition, Library and Academic activity fees.
Refunds: After admission the entire amount paid by the student shall stand forfeited except the refundable deposit mentioned above. This applies to dismissals as well as withdrawals, voluntary or otherwise, from the institute’s rolls.
Laptop: Each student must have a Wi-Fi-enabled personal laptop before the start of classes. For the Data Engineering program, the laptop has to be a 64-bit machine equipped with a minimum of 6 GB of RAM.
Educational Loans: Praxis has tied up with Credila and Avanse Financial Services for student loans, details of which would be furnished on selection.
Disputes: Any disputes are subject to the jurisdiction of Kolkata courts only.
Eligibility
- B.Tech, B.Sc/M.Sc in Computer Science/IT, BCA or MCA
- Professionals trying to switch careers
- Familiarity with Computer systems
Campus
The 9-month full-time Post Graduate Program in Data Engineering (PGP DE) is offered from the Bangalore campus of Praxis Business School.