BIG DATA ANALYTICS WITH HADOOP. 40 Hour Course


OVERVIEW
Learning Objectives:
Understanding Big Data
Understanding the various types of data that can be stored in Hadoop
Setting up and configuring Hadoop in pseudo-distributed mode and distributed mode
Understanding how Big Data and Hadoop fit into the current environment and infrastructure
WWW.WISDOMSPROUTS.COM

Working with MapReduce programs
Coding the ecosystem projects
Performing data analytics using Pig and Hive
Understanding and working on real-time use cases
Implementing a Hadoop project
Working on a live, real-life project on big data analytics using the Hadoop ecosystem
And much more.

COURSE HIGHLIGHTS
Detailed explanation of every topic with real-world examples.
Examples are drawn both from industry and from day-to-day activities.
Labs, assignments, and a test for every topic.
Labs are performed on a virtual machine that will be shared with you.
Project: a real use-case-based project will be assigned with proper documentation and all required files.

COURSE FEES
Simple fee structure. No hidden costs.
25,000 INR 20,000 INR

SUPPORT
At any time during the course, and even after course completion, you can email your queries to helpdesk@wisdomsprouts.com and we will get back to you within 24 hours.

CONTENT PROVIDED
A Hadoop reference guide covering all the topics from the sessions.
Lab manual with all labs covered.
Exercise manual with some assignments.
Question bank.
Government-recognised certificate.

MODULE 1 Introduction to Big Data

INTRODUCTION TO BIG DATA ANALYTICS
What is Big Data?
Its characteristics.
Some facts and figures
Importance of Big Data
The need to understand and analyze Big Data.
Basics of data analytics
Problems with existing systems.

MODULE 2 Introduction to Hadoop

INTRODUCTION TO HADOOP
What is Hadoop
Architecture
Hadoop Job Process
File Anatomy:
  Read Operations
  Write Operations
Useful Configurations:
  core-site.xml
  hdfs-site.xml
  mapred-site.xml

HDFS (HADOOP DISTRIBUTED FILE SYSTEM)
Significance of HDFS in Hadoop
HDFS features
Daemons of Hadoop and their functions:
  NameNode
  DataNode
  JobTracker
  TaskTracker
  Secondary NameNode

Data Storage in HDFS:
  Blocks
  Heartbeats
  Data Replication
Accessing HDFS:
  CLI (Command Line Interface)
  Unix and Hadoop commands
  Java-based approach
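As a rough illustration of the Blocks and Data Replication topics, the sketch below estimates how many blocks a file occupies and how much raw disk space its replicas consume. The function name `hdfs_storage` is hypothetical, and the defaults are assumptions: a 128 MB block size (the Hadoop 2 default; Hadoop 1 used 64 MB) and a replication factor of 3. Both values are configurable on a real cluster.

```python
import math

def hdfs_storage(file_size_mb, block_size_mb=128, replication=3):
    """Return (number of blocks, total block replicas, raw storage in MB)."""
    # A file is split into fixed-size blocks; the last block may be partial.
    blocks = math.ceil(file_size_mb / block_size_mb)
    # Every block is stored `replication` times on different DataNodes.
    replicas = blocks * replication
    # A partial last block only uses the space it needs, so raw usage
    # is simply the file size multiplied by the replication factor.
    raw_mb = file_size_mb * replication
    return blocks, replicas, raw_mb

blocks, replicas, raw_mb = hdfs_storage(1000)
print(blocks, replicas, raw_mb)  # 8 24 3000
```

Under these assumptions, a 1000 MB file becomes 8 blocks, stored as 24 block replicas that take up 3000 MB across the cluster.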

MAPREDUCE
Introduction to MapReduce
MapReduce architecture
MapReduce programming model
MapReduce algorithm and phases
Basic MapReduce program:
  Driver code
  Mapper code
  Reducer code
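The driver, mapper, and reducer roles listed above can be previewed with a minimal word count sketch. It runs in plain Python on one machine and only imitates the map, shuffle, and reduce phases; it is not Hadoop code, and the function names `mapper`, `shuffle`, `reducer`, and `run_job` are illustrative assumptions rather than part of any Hadoop API.

```python
from collections import defaultdict

def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle phase: group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reducer(word, counts):
    """Reduce phase: sum the counts collected for one word."""
    return word, sum(counts)

def run_job(lines):
    """Driver: wire the phases together and return the final word counts."""
    mapped = (pair for line in lines for pair in mapper(line))
    grouped = shuffle(mapped)
    return dict(reducer(word, counts) for word, counts in sorted(grouped.items()))

result = run_job(["big data with hadoop", "hadoop stores big data"])
print(result)
# {'big': 2, 'data': 2, 'hadoop': 2, 'stores': 1, 'with': 1}
```

In a real cluster the mapper and reducer run in parallel on many nodes and the framework performs the shuffle; the driver only configures and submits the job.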

LABS
Configuring a pseudo-distributed Hadoop cluster.
Working with HDFS command-line options.
Running a Word Count program.

MODULE 3 Hadoop Ecosystem

HADOOP ECOSYSTEM
What is the ecosystem
Different ecosystem projects:
  Sqoop
  Hive
  Pig
  Flume
  Ambari
  Hue

Revision Test: 25 questions, 20 minutes

LABS
Importing data using Sqoop and querying it using Hive.
Configuring a Flume agent.
Mini project: using all the ecosystem projects on one sample weblogs data set.

MODULE 4 Brief Walk with HDFS and MapReduce

DEEPER DIVE
Advanced HDFS:
  Secondary NameNode
  Federation
  High Availability
Advanced MapReduce:
  Demo of precedence levels
  Partitioners
  Combiners
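To give a feel for the Partitioners and Combiners topics, here is a small single-machine sketch. A combiner pre-aggregates map output before it is sent over the network, and a partitioner decides which reducer receives each key. The names `combine` and `partition` are hypothetical, and the CRC32 hash is an assumption chosen to keep the example deterministic; Hadoop's default partitioner uses the key's Java hash code modulo the number of reducers.

```python
from collections import Counter
import zlib

def combine(pairs):
    """Combiner: pre-aggregate (word, count) pairs on the map side."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return list(totals.items())

def partition(key, num_reducers):
    """Partitioner: pick a reducer by hashing the key, so equal keys always meet."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers

map_output = [("hadoop", 1), ("hive", 1), ("hadoop", 1), ("pig", 1), ("hadoop", 1)]
combined = combine(map_output)

# Route each combined pair to one of two hypothetical reducers.
buckets = {r: [] for r in range(2)}
for word, count in combined:
    buckets[partition(word, 2)].append((word, count))

print(len(map_output), "pairs before combining,", len(combined), "after")
# 5 pairs before combining, 3 after
```

The combiner reduces the number of pairs that must be shuffled, while the partitioner guarantees that every pair with the same key lands on the same reducer.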

MODULE 5 Hadoop Administration

HADOOP ADMINISTRATION
Cluster planning
Understanding the hardware and software requirements of a Hadoop cluster
Different modes of operation of Hadoop
Precedence levels
Some dos and don'ts

LABS
Hive:
  Creating different types of tables
  Executing different queries
Pig:
  Working with different data types
  Modes of execution
  Running some Pig application-driven commands
Sqoop:
  Import
  Export

MODULE 6 Data Visualization

DATA VISUALIZATION
Working with the Hadoop ODBC connector.
Data visualization using Excel.
Exporting Hive data.
Creating graphs and interactive charts for your Hive data.
Analysing Hive data using Power View in Excel.

MODULE 7 Use Cases Project Work

PROJECT
Twitter Use Case:
  Why companies use Twitter data
  How to analyze Twitter data
  How to create a Twitter API
  Loading some tweets into HDFS
  Querying Twitter logs.
Revision Test: 50 questions, 30 minutes

HOPE TO SEE YOU ON BOARD SOON.