Course description

  • What is this course about?

    ‘Introduction to Big Data and Hadoop’ is an ideal course for individuals who want to understand the basic concepts of Big Data and Hadoop. On completing this course, learners will be able to explain what goes on behind the processing of huge volumes of data as the industry shifts from Excel-based analytics to real-time analytics.

    The course focuses on the basics of Big Data and Hadoop. It further provides an overview of the commercial distributions of Hadoop as well as the components of the Hadoop ecosystem. 

  • Why is the course most sought after?

    Big Data analytics is widely used to analyze large volumes of data. The growing need for professionals with knowledge of Big Data and Hadoop has increased opportunities for those who want to make a career in this field. Knowing the basics of Big Data and Hadoop makes it easier for such professionals to pursue advanced-level courses in this subject and acquire the skills to become experts in Big Data analytics.
    Knowledge of Big Data and Hadoop enables you to install and configure Hadoop components, and to manage and integrate large sets of unstructured data. The following examples show why this knowledge is worth acquiring.
    • Facebook, a $5.1 billion company, has over 1 billion active users and relies on Hadoop to manage data of that magnitude.
    • LinkedIn manages over 1 billion personalized recommendations every week with the help of Hadoop’s MapReduce and HDFS components.
    • The Yahoo! Search Webmap is a Hadoop application that runs on a Linux cluster with more than 10,000 cores and generates data used in every Yahoo! Web search query.

  • What learning benefits do you get from Simplilearn’s training?

    At the end of Simplilearn’s training in the basics of Big Data and Hadoop, the participants will:
    • Understand the characteristics of Big Data
    • Describe the basics of Hadoop and HDFS architecture
    • List the features and processes of MapReduce
    • Learn the basics of Pig, Hive and HBase
    • Explore the commercial distributions of Hadoop
    • Understand the key components of the Hadoop ecosystem
    • Get introduced to Sqoop & ZooKeeper

  • What are the career benefits in store for you?

    A good understanding of the basics of Big Data and Hadoop makes it easier to improve your analytic skills, thus increasing your career prospects in the Big Data analytics industry. According to Robert Half Technology, the average salary for a Hadoop-certified professional is around $154,250.

    Top companies like Microsoft, Software AG, IBM, Oracle, HP, SAP, EMC2 and Dell have invested $15 billion in data management and analytics, thereby increasing the number of opportunities for Big Data and Hadoop-certified professionals.

  • Who should do this course?

    Simplilearn’s ‘Introduction to Big Data and Hadoop’ course is meant for professionals who want to gain a basic understanding of Big Data and Hadoop. It is ideal for professionals in senior management who require a theoretical understanding of how Hadoop can solve their Big Data problems.

Course preview

    • Lesson 1.0 - Introduction to Big Data and Hadoop 29:37
      • 1 Introduction to Big Data and Hadoop 00:21
      • 2 Objectives 00:26
      • 3 Need for Big Data 01:42
      • 4 Three Characteristics of Big Data 00:35
      • 5 Characteristics of Big Data Technology 01:52
      • 6 Appeal of Big Data Technology 00:50
      • 7 Handling Limitations of Big Data 00:49
      • 8 Introduction to Hadoop 01:00
      • 9 Hadoop Configuration 00:53
      • 10 Apache Hadoop Core Components 00:36
      • 11 Hadoop Core Components—HDFS 01:07
      • 12 Hadoop Core Components—MapReduce 00:45
      • 13 HDFS Architecture 01:13
      • 14 Ubuntu Server—Introduction 00:51
      • 15 Hadoop Installation—Prerequisites 00:26
      • 16 Hadoop Multi-Node Installation—Prerequisites 00:29
      • 17 Single-Node Cluster vs. Multi-Node Cluster 00:49
      • 18 MapReduce 01:09
      • 19 Characteristics of MapReduce 00:56
      • 20 Real-Time Uses of MapReduce 01:01
      • 21 Prerequisites for Hadoop Installation in Ubuntu Desktop 12.04 00:20
      • 22 Hadoop MapReduce—Features 00:52
      • 23 Hadoop MapReduce—Processes 00:48
      • 24 Advanced HDFS–Introduction 00:47
      • 25 Advanced MapReduce 00:55
      • 26 Data Types in Hadoop 01:15
      • 27 Distributed Cache 00:41
      • 28 Distributed Cache (contd.) 00:40
      • 29 Joins in MapReduce 00:44
      • 30 Introduction to Pig 00:40
      • 31 Components of Pig 01:00
      • 32 Data Model 00:43
      • 33 Pig vs. SQL 01:07
      • 34 Prerequisites to Set the Environment for Pig Latin 00:20
      • 35 Summary 00:55
    • Lesson 1.1 - Hive, HBase and Hadoop Ecosystem Components 29:59
      • 1 Hive, HBase and Hadoop Ecosystem Components 00:22
      • 2 Objectives 00:23
      • 3 Hive—Introduction 00:55
      • 4 Hive—Characteristics 01:20
      • 5 System Architecture and Components of Hive 00:18
      • 6 Basics of Hive Query Language 00:38
      • 7 Data Model—Tables 00:32
      • 8 Data Types in Hive 00:16
      • 9 Serialization and Deserialization 01:19
      • 10 UDF/UDAF vs. MapReduce Scripts 00:47
      • 11 HBase—Introduction 01:15
      • 12 Characteristics of HBase 00:42
      • 13 HBase Architecture 01:04
      • 14 HBase vs. RDBMS 01:08
      • 15 Cloudera—Introduction 00:44
      • 16 Cloudera Distribution 01:07
      • 17 Cloudera Manager 00:34
      • 18 Hortonworks Data Platform 00:42
      • 19 MapR Data Platform 00:43
      • 20 Pivotal HD 00:53
      • 21 Introduction to ZooKeeper 00:23
      • 22 Features of ZooKeeper 01:12
      • 23 Goals of ZooKeeper 00:38
      • 24 Uses of ZooKeeper 00:49
      • 25 Sqoop—Reasons to Use It 01:26
      • 26 Sqoop—Reasons to Use It (contd.) 01:09
      • 27 Benefits of Sqoop 00:42
      • 28 Apache Hadoop Ecosystem 00:59
      • 29 Apache Oozie 00:43
      • 30 Introduction to Mahout 00:22
      • 31 Usage of Mahout 00:28
      • 32 Apache Cassandra 00:54
      • 33 Apache Spark 01:28
      • 34 Apache Ambari 00:32
      • 35 Key Features of Apache Ambari 00:51
      • 36 Hadoop Security—Kerberos 00:53
      • 37 Summary 00:48
    • Lesson 1.2 - Quiz
      • Quiz
    • Lesson 1.3 - Thank You 00:09
      • Thank You 00:09

Exam & certification

  • What qualifications do you need?

    There are no prerequisites for this course.

  • What do I need to do to unlock my Simplilearn certificate?

    • Complete 85% of the course.
    • Complete 1 simulation test with a minimum score of 60%.


    Cadence Serna Customer Lifecycle Management at AT&T

    For an introduction, this is still very dense. There's a lot to take in; it's a very broad and detailed top-down look at big data. I'm very glad to have taken the time to view this intro.

    Shubham Das SCM Analyst at Tata Consultancy Services

    The course is very informative and detailed.

    Venkat Nagender Solution Architect @ Ericsson

    Very nice and easily understandable. All important topics are covered.


    • I want to know more about the training program. Whom do I contact?

      Please join our Live Chat for instant support, call us, or Request a Call Back to have your query resolved.

    • What does it mean to be a GSA-approved course?

      The course is covered by Simplilearn’s contract with the GSA (US only), with special pricing for GSA-approved agencies and organizations.

    • How do I know if I am eligible to buy this course at the GSA price?

      You must be employed by a GSA-approved agency or organization.
