Big data

LIVE

40 Hours

Course offered by Meghasyam

1. What is Big Data
 Introduction to Data Lake
 Why companies care about Big Data
 What we do with Big Data
 Where this data comes from
 Social Media – A way to get true customer insights
2. History of Hadoop
 Hadoop Timeline
 Why Hadoop
 Hadoop 1.X Architecture
 Hadoop 1.X Core Components
 Hadoop 1.X Job Process
3. Importance of HDFS
 HDFS Daemons
 Name Node
 Data Node
 Secondary Name Node
 Node Level Failure Handling in Hadoop 1.X
4. Different Phases in Map Reduce
 Input – Output formats in each phase
 Modeling Real World applications into Map Reduce
 Understanding Map Reduce Program Execution
 Problems in Map Reduce
5. Hadoop 1.X labs
 Setting up a Pseudo Mode Hadoop Cluster
 Executing a sample Map Reduce Program
 Writing and Understanding Basic Map Reduce Program
6. Apache HIVE
 Introduction to Hive Metastore
 SQL vs. Hive
 Hive Query Language (HiveQL)
 Managed and External tables
 Querying data
 Hive Thrift Server
 Working on HIVE Beeline
 Joins, Subqueries and Aggregations
7. Apache PIG
 Introduction to PIG
 Map Reduce vs. PIG
 PIG in local mode
 PIG in Map Reduce mode
 Local mode vs. Hadoop mode
 Execution mechanism and data processing
 Writing PIG scripts
 User defined functions in PIG
8. SQOOP
 Introduction to SQOOP framework
 SQOOP flavors of Import
 SQOOP flavors of Export
 SQOOP CLI Options
9. FLUME
 Introduction to Messaging Service
 Applications of a Messaging Service
 FLUME Architecture Framework
 Working of a FLUME Agent
 Understanding FLUME Configurations
10. Hadoop Ecosystem Labs
 Importing data from MySQL and querying it using HIVE
 Configuring a FLUME agent to listen to local log files