Hadoop for Developers (4 days) Training Course

Course Code

hadoopdev

Duration

28 hours (usually 4 days including breaks)

Requirements

  • comfortable with the Java programming language (most programming exercises are in Java)
  • comfortable in a Linux environment (able to navigate the Linux command line and edit files using vi / nano)

Lab environment

Zero Install: There is no need to install Hadoop software on students' machines! A working Hadoop cluster will be provided for students.

Students will need the following:

  • an SSH client (Linux and Mac already have SSH clients; for Windows, PuTTY is recommended)
  • a browser to access the cluster (we recommend Firefox)

Overview

Apache Hadoop is the most popular framework for processing Big Data on clusters of servers. This course introduces developers to the main components of the Hadoop ecosystem: HDFS, MapReduce, Pig, Hive and HBase.

 

    Course Outline

    Section 1: Introduction to Hadoop

    • Hadoop history, concepts
    • ecosystem
    • distributions
    • high level architecture
    • Hadoop myths
    • Hadoop challenges
    • hardware / software
    • lab : first look at Hadoop

    Section 2: HDFS

    • Design and architecture
    • concepts (horizontal scaling, replication, data locality, rack awareness)
    • Daemons : NameNode, Secondary NameNode, DataNode
    • communications / heart-beats
    • data integrity
    • read / write path
    • Namenode High Availability (HA), Federation
    • labs : Interacting with HDFS

    Section 3: MapReduce

    • concepts and architecture
    • daemons (MRv1) : JobTracker / TaskTracker
    • phases : driver, mapper, shuffle/sort, reducer
    • MapReduce Version 1 and Version 2 (YARN)
    • internals of MapReduce
    • introduction to Java MapReduce programs
    • labs : Running a sample MapReduce program
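
As a rough illustration of the phases listed above (driver, mapper, shuffle/sort, reducer), here is a minimal word-count sketch in plain Java. It deliberately does not use the Hadoop API and runs in a single JVM; the class name `WordCountPhases` and its method names are hypothetical and chosen only for this example. A real Hadoop job distributes these steps across the cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java illustration of the MapReduce phases; no Hadoop classes are used. */
public class WordCountPhases {

    // Map phase: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle/sort phase: group values by key; the TreeMap keeps keys sorted.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum all values collected for one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Driver: wires the phases together for a small in-memory input.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (String line : lines) {
            mapped.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(mapped).entrySet()) {
            counts.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints {data=2, hadoop=2, processes=1, stores=1}
        System.out.println(run(List.of("hadoop stores data", "hadoop processes data")));
    }
}
```

Keeping each phase in its own method mirrors how the work is divided in a MapReduce job: the mapper sees one record at a time, and the reducer sees one key together with all of its values.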

    Section 4: Pig

    • Pig vs Java MapReduce
    • Pig job flow
    • Pig Latin language
    • ETL with Pig
    • Transformations & Joins
    • User defined functions (UDF)
    • labs : writing Pig scripts to analyze data

    Section 5: Hive

    • architecture and design
    • data types
    • SQL support in Hive
    • Creating Hive tables and querying
    • partitions
    • joins
    • text processing
    • labs : various labs on processing data with Hive

    Section 6: HBase

    • concepts and architecture
    • HBase vs RDBMS vs Cassandra
    • HBase Java API
    • Time series data on HBase
    • schema design
    • labs : interacting with HBase using the shell; programming with the HBase Java API; schema design exercise

    Guaranteed to run even with a single delegate!
    Public Classroom
    Participants from multiple organisations. Topics usually cannot be customised.
    From $11260
    Private Classroom
    Participants are from one organisation only. No external participants are allowed. Usually customised to a specific group; course topics are agreed between the client and the trainer.
    Private Remote
    The instructor and the participants are in two different physical locations and communicate via the Internet.
    From $8500

    The more delegates, the greater the savings per delegate. The table shows the price per delegate and is for illustration purposes only; actual prices may differ.

    Number of Delegates   Public Classroom   Private Remote
    1                     $11260             $8500
    2                     $6305              $4875
    3                     $4653              $3667
    4                     $3828              $3063


    Upcoming Courses

    Venue                                    Course Date                 Course Price [Remote / Classroom]
    CO, Denver - Colorado Boulevard Center   Mon, Apr 10 2017, 9:30 am   $8500 / $11300
    AZ, Phoenix - 24th and Camelback         Mon, Jul 3 2017, 9:30 am    $8500 / $11260
