1. What is Big Data
Introduction to Data Lake
Why do companies care about Big Data
What do we do with Big Data
Where does this data come from
Social Media – A way to get true customer insights
2. History of Hadoop
Hadoop Timeline
Why Hadoop
Hadoop 1.X Architecture
Hadoop 1.X Core Components
Hadoop 1.X Job Process
3. Importance of HDFS
HDFS Daemons
NameNode
DataNode
Secondary NameNode
Node Level Failure Handling in Hadoop 1.X
4. Different Phases in MapReduce
Input and Output formats in each phase
Modeling Real World applications into MapReduce
Understanding MapReduce Program Execution
Problems in MapReduce
5. Hadoop 1.X labs
Setting up a Pseudo-Distributed Mode Hadoop Cluster
Executing a sample MapReduce Program
Writing and Understanding a Basic MapReduce Program
6. Apache HIVE
Introduction to Hive Metastore
SQL vs. Hive
Hive Query language
Managed and External tables
Querying data
Hive Thrift server
Working with HIVE Beeline
Joins, Sub Queries and Aggregations
7. Apache PIG
Introduction to PIG
MapReduce vs. PIG
PIG in local mode
PIG in MapReduce mode
Local mode vs. Hadoop mode
Execution mechanism and data processing
Writing PIG scripts
User defined functions in PIG
8. SQOOP
Introduction to SQOOP framework
SQOOP flavors of Import
SQOOP flavors of Export
SQOOP CLI Options
9. FLUME
Introduction to Messaging Service
Applications of a Messaging Service
FLUME Architecture Framework
Working of a FLUME Agent
Understanding FLUME Configurations
10. Hadoop Ecosystem Labs
Importing data from MySQL and querying it using HIVE
Configuring a FLUME agent to listen to local log files