Hadoop & Spark Course Overview

The Big Data Hadoop Certification course is designed to give you in-depth knowledge of the Big Data framework using Hadoop and Spark. In this hands-on Big Data course, you will work on real-life, industry-based projects using an integrated lab.

Hadoop & Spark Course Key Features

  • 48 hours of instructor-led training
  • 10 hours of self-paced video
  • 4 real-life industry projects using Hadoop, Hive, and the Big Data stack
  • Training on Yarn, MapReduce, Pig, Hive, HBase, and Apache Spark
  • Lifetime access to self-paced learning
  • Aligned to Cloudera CCA175 certification exam

Skills Covered

  • Real-time data processing
  • Functional programming
  • Spark applications
  • Parallel processing
  • Spark RDD optimization techniques
  • Spark SQL
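
As a purely illustrative taste of the RDD-style operations listed above (not part of the course materials), the map/filter/reduce pattern that Spark's transformations and actions build on can be sketched in plain Python:

```python
from functools import reduce

# Hypothetical in-memory "dataset" of (category, amount) records.
orders = [("electronics", 120.0), ("books", 15.5),
          ("electronics", 80.0), ("toys", 40.0)]

# Transformation: keep only electronics orders (like rdd.filter).
electronics = filter(lambda kv: kv[0] == "electronics", orders)

# Transformation: extract the amounts (like rdd.map).
amounts = map(lambda kv: kv[1], electronics)

# Action: total the amounts (like rdd.reduce).
total = reduce(lambda a, b: a + b, amounts)
print(total)  # 200.0
```

In Spark the same chain would run lazily and in parallel across partitions; here it simply illustrates the functional style the course teaches.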

Benefits

Upskilling in the Big Data and Analytics field is a smart career decision. According to Allied Market Research, the global Hadoop market will reach $84.6 billion by 2021, and there is a shortage of 1.4–1.9 million Hadoop data analysts in the U.S. alone. Here is a selection of Hadoop specialist opportunities in your area:
  • Annual Salary: $93K min / $124K average / $165K max (Source: Glassdoor)
    Hiring Companies: Amazon, Hewlett-Packard, Wipro, Cognizant, Spotify (Source: Indeed)
  • Annual Salary: $81K min / $117K average / $160K max (Source: Glassdoor)
    Hiring Companies: Amazon, Hewlett-Packard, Facebook, KPMG, Verizon (Source: Indeed)
  • Annual Salary: $58K min / $88.5K average / $128K max (Source: Glassdoor)
    Hiring Companies: Cisco, Target Corp, GE, IBM (Source: Indeed)

Training Options

Self-Paced Learning

A$ 1,049

    • Learn at your own pace
    • Get lifetime access to 10 hours of world-class on-demand video content
    • Work on 4 live projects using Hadoop, Hive, and the Big Data stack
    • Learn 10+ Big Data tools for hands-on training
    • Get Simplilearn certificate upon course completion
    • 24x7 learner assistance and platform support

Blended Learning

A$ 1,199

  • Everything in Self-Paced Learning, plus
  • 90 days of flexible access to online classes
    • Learn in an instructor-led online training class
    • 48 hours of instructor-led training in a flexible class schedule
    • One to one mentorship for doubt resolution
  • Classes starting in Brisbane from:
    15th Dec: Weekday Class
    21st Dec: Weekend Class

Corporate Training

Customized to your team's needs

    • Customized learning delivery model (self-paced and/or instructor-led)
    • Flexible pricing options
    • Enterprise grade learning management system (LMS)
    • Enterprise dashboards for individuals and teams
    • 24x7 learner assistance and support

Hadoop & Spark Course Curriculum

Eligibility

Big Data Hadoop training is best suited for IT, data management, and analytics professionals looking to gain expertise in Big Data, including: Software Developers and Architects, Analytics Professionals, Senior IT Professionals, Testing and Mainframe Professionals, Data Management Professionals, Business Intelligence Professionals, Project Managers, Aspiring Data Scientists, and Graduates looking to begin a career in Big Data Analytics.

Pre-requisites

Professionals entering the Big Data Hadoop certification program should have a basic understanding of Core Java and SQL.
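
For reference only (this snippet is illustrative, not course material), the kind of baseline SQL fluency assumed looks like the query below, shown here via Python's built-in sqlite3 module; the table and column names are invented:

```python
import sqlite3

# Toy baseline-SQL example: create a table, insert rows,
# and run a grouped aggregate (all names here are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 100.0), ("west", 50.0), ("east", 25.0)])

rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY REGION ORDER BY region"
).fetchall()
print(rows)  # [('east', 125.0), ('west', 50.0)]
conn.close()
```

Comfort with SELECT, GROUP BY, and JOIN at roughly this level is what the Hive and Spark SQL lessons assume.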

Course Content

  • Big Data Hadoop and Spark Developer

    Preview
    • Lesson 1 Course Introduction

      08:51Preview
      • 1.1 Course Introduction
        05:52
      • 1.2 Accessing Practice Lab
        02:59
    • Lesson 2 Introduction to Big Data and Hadoop

      43:59Preview
      • 1.1 Introduction to Big Data and Hadoop
        00:31
      • 1.2 Introduction to Big Data
        01:02
      • 1.3 Big Data Analytics
        04:24
      • 1.4 What is Big Data
        02:54
      • 1.5 Four Vs Of Big Data
        02:13
      • 1.6 Case Study Royal Bank of Scotland
        01:31
      • 1.7 Challenges of Traditional System
        03:38
      • 1.8 Distributed Systems
        01:55
      • 1.9 Introduction to Hadoop
        05:28
      • 1.10 Components of Hadoop Ecosystem Part One
        02:17
      • 1.11 Components of Hadoop Ecosystem Part Two
        02:53
      • 1.12 Components of Hadoop Ecosystem Part Three
        03:48
      • 1.13 Commercial Hadoop Distributions
        04:19
      • 1.14 Demo: Walkthrough of Simplilearn Cloudlab
        06:51
      • 1.15 Key Takeaways
        00:15
      • Knowledge Check
    • Lesson 3 Hadoop Architecture, Distributed Storage (HDFS) and YARN

      57:50Preview
      • 2.1 Hadoop Architecture Distributed Storage (HDFS) and YARN
        00:50
      • 2.2 What Is HDFS
        00:54
      • 2.3 Need for HDFS
        01:52
      • 2.4 Regular File System vs HDFS
        01:27
      • 2.5 Characteristics of HDFS
        03:24
      • 2.6 HDFS Architecture and Components
        02:30
      • 2.7 High Availability Cluster Implementations
        04:47
      • 2.8 HDFS Component File System Namespace
        02:40
      • 2.9 Data Block Split
        02:32
      • 2.10 Data Replication Topology
        01:16
      • 2.11 HDFS Command Line
        02:14
      • 2.12 Demo: Common HDFS Commands
        04:39
      • HDFS Command Line
      • 2.13 YARN Introduction
        01:32
      • 2.14 YARN Use Case
        02:21
      • 2.15 YARN and Its Architecture
        02:09
      • 2.16 Resource Manager
        02:14
      • 2.17 How Resource Manager Operates
        02:28
      • 2.18 Application Master
        03:29
      • 2.19 How YARN Runs an Application
        04:39
      • 2.20 Tools for YARN Developers
        01:38
      • 2.21 Demo: Walkthrough of Cluster Part One
        03:06
      • 2.22 Demo: Walkthrough of Cluster Part Two
        04:35
      • 2.23 Key Takeaways
        00:34
      • Knowledge Check
      • Hadoop Architecture, Distributed Storage (HDFS) and YARN
    • Lesson 4 Data Ingestion into Big Data Systems and ETL

      01:05:21Preview
      • 3.1 Data Ingestion into Big Data Systems and ETL
        00:42
      • 3.2 Data Ingestion Overview Part One
        01:51
      • 3.3 Data Ingestion Overview Part Two
        01:41
      • 3.4 Apache Sqoop
        02:04
      • 3.5 Sqoop and Its Uses
        03:02
      • 3.6 Sqoop Processing
        02:11
      • 3.7 Sqoop Import Process
        02:24
      • 3.8 Sqoop Connectors
        04:22
      • 3.9 Demo: Importing and Exporting Data from MySQL to HDFS
        05:07
      • Apache Sqoop
      • 3.9 Apache Flume
        02:42
      • 3.10 Flume Model
        01:56
      • 3.11 Scalability in Flume
        01:33
      • 3.12 Components in Flume’s Architecture
        02:40
      • 3.13 Configuring Flume Components
        01:58
      • 3.15 Demo: Ingest Twitter Data
        04:43
      • 3.14 Apache Kafka
        01:54
      • 3.15 Aggregating User Activity Using Kafka
        01:34
      • 3.16 Kafka Data Model
        02:56
      • 3.17 Partitions
        02:04
      • 3.18 Apache Kafka Architecture
        03:02
      • 3.21 Demo: Setup Kafka Cluster
        03:52
      • 3.19 Producer Side API Example
        02:30
      • 3.20 Consumer Side API
        00:43
      • 3.21 Consumer Side API Example
        02:36
      • 3.22 Kafka Connect
        01:14
      • 3.26 Demo: Creating Sample Kafka Data Pipeline using Producer and Consumer
        03:35
      • 3.23 Key Takeaways
        00:25
      • Knowledge Check
      • Data Ingestion into Big Data Systems and ETL
    • Lesson 5 Distributed Processing - MapReduce Framework and Pig

      01:01:09Preview
      • 4.1 Distributed Processing MapReduce Framework and Pig
        00:44
      • 4.2 Distributed Processing in MapReduce
        03:01
      • 4.3 Word Count Example
        02:09
      • 4.4 Map Execution Phases
        01:48
      • 4.5 Map Execution Distributed Two Node Environment
        02:10
      • 4.6 MapReduce Jobs
        01:55
      • 4.7 Hadoop MapReduce Job Work Interaction
        02:24
      • 4.8 Setting Up the Environment for MapReduce Development
        02:57
      • 4.9 Set of Classes
        02:09
      • 4.10 Creating a New Project
        02:25
      • 4.11 Advanced MapReduce
        01:30
      • 4.12 Data Types in Hadoop
        02:22
      • 4.13 OutputFormats in MapReduce
        02:25
      • 4.14 Using Distributed Cache
        01:51
      • 4.15 Joins in MapReduce
        03:07
      • 4.16 Replicated Join
        02:37
      • 4.17 Introduction to Pig
        02:03
      • 4.18 Components of Pig
        02:08
      • 4.19 Pig Data Model
        02:23
      • 4.20 Pig Interactive Modes
        03:18
      • 4.21 Pig Operations
        01:19
      • 4.22 Various Relations Performed by Developers
        03:06
      • 4.23 Demo: Analyzing Web Log Data Using MapReduce
        05:43
      • 4.24 Demo: Analyzing Sales Data and Solving KPIs using PIG
        02:46
      • Apache Pig
      • 4.25 Demo: Wordcount
        02:21
      • 4.23 Key takeaways
        00:28
      • Knowledge Check
      • Distributed Processing - MapReduce Framework and Pig
    • Lesson 6 Apache Hive

      59:47Preview
      • 5.1 Apache Hive
        00:37
      • 5.2 Hive SQL over Hadoop MapReduce
        01:38
      • 5.3 Hive Architecture
        02:41
      • 5.4 Interfaces to Run Hive Queries
        01:47
      • 5.5 Running Beeline from Command Line
        01:51
      • 5.6 Hive Metastore
        02:58
      • 5.7 Hive DDL and DML
        02:00
      • 5.8 Creating New Table
        03:15
      • 5.9 Data Types
        01:37
      • 5.10 Validation of Data
        02:41
      • 5.11 File Format Types
        02:40
      • 5.12 Data Serialization
        02:35
      • 5.13 Hive Table and Avro Schema
        02:38
      • 5.14 Hive Optimization Partitioning Bucketing and Sampling
        01:28
      • 5.15 Non Partitioned Table
        01:58
      • 5.16 Data Insertion
        02:22
      • 5.17 Dynamic Partitioning in Hive
        02:43
      • 5.18 Bucketing
        01:44
      • 5.19 What Do Buckets Do
        02:04
      • 5.20 Hive Analytics UDF and UDAF
        03:11
      • 5.21 Other Functions of Hive
        03:17
      • 5.22 Demo: Real-Time Analysis and Data Filteration
        03:18
      • 5.23 Demo: Real-World Problem
        04:30
      • 5.24 Demo: Data Representation and Import using Hive
        03:52
      • 5.25 Key Takeaways
        00:22
      • Knowledge Check
      • Apache Hive
    • Lesson 7 NoSQL Databases - HBase

      21:41Preview
      • 6.1 NoSQL Databases HBase
        00:33
      • 6.2 NoSQL Introduction
        04:42
      • Demo: YARN Tuning
        03:28
      • 6.3 HBase Overview
        02:53
      • 6.4 HBase Architecture
        04:43
      • 6.5 Data Model
        03:11
      • 6.6 Connecting to HBase
        01:56
      • HBase Shell
      • 6.7 Key Takeaways
        00:15
      • Knowledge Check
      • NoSQL Databases - HBase
    • Lesson 8 Basics of Functional Programming and Scala

      48:00Preview
      • 7.1 Basics of Functional Programming and Scala
        00:39
      • 7.2 Introduction to Scala
        02:59
      • 7.3 Demo: Scala Installation
        02:54
      • 7.3 Functional Programming
        03:08
      • 7.4 Programming with Scala
        04:01
      • Demo: Basic Literals and Arithmetic Operators
        02:57
      • Demo: Logical Operators
        01:21
      • 7.5 Type Inference Classes Objects and Functions in Scala
        04:45
      • Demo: Type Inference Functions Anonymous Function and Class
        05:04
      • 7.6 Collections
        01:33
      • 7.7 Types of Collections
        05:37
      • Demo: Five Types of Collections
        03:42
      • Demo: Operations on List
        03:16
      • 7.8 Scala REPL
        02:27
      • Demo: Features of Scala REPL
        03:17
      • 7.9 Key Takeaways
        00:20
      • Knowledge Check
      • Basics of Functional Programming and Scala
    • Lesson 9 Apache Spark Next Generation Big Data Framework

      36:54Preview
      • 8.1 Apache Spark Next Generation Big Data Framework
        00:43
      • 8.2 History of Spark
        01:58
      • 8.3 Limitations of MapReduce in Hadoop
        02:48
      • 8.4 Introduction to Apache Spark
        01:11
      • 8.5 Components of Spark
        03:10
      • 8.6 Application of In-Memory Processing
        02:54
      • 8.7 Hadoop Ecosystem vs Spark
        01:30
      • 8.8 Advantages of Spark
        03:22
      • 8.9 Spark Architecture
        03:42
      • 8.10 Spark Cluster in Real World
        02:52
      • 8.11 Demo: Running a Scala Program in Spark Shell
        03:45
      • 8.12 Demo: Setting Up Execution Environment in IDE
        04:18
      • 8.13 Demo: Spark Web UI
        04:14
      • 8.11 Key Takeaways
        00:27
      • Knowledge Check
      • Apache Spark Next Generation Big Data Framework
    • Lesson 10 Spark Core Processing RDD

      01:16:31Preview
      • 9.1 Processing RDD
        00:37
      • 9.1 Introduction to Spark RDD
        02:35
      • 9.2 RDD in Spark
        02:18
      • 9.3 Creating Spark RDD
        05:48
      • 9.4 Pair RDD
        01:53
      • 9.5 RDD Operations
        03:20
      • 9.6 Demo: Spark Transformation Detailed Exploration Using Scala Examples
        03:13
      • 9.7 Demo: Spark Action Detailed Exploration Using Scala
        03:32
      • 9.8 Caching and Persistence
        02:41
      • 9.9 Storage Levels
        03:31
      • 9.10 Lineage and DAG
        02:11
      • 9.11 Need for DAG
        02:51
      • 9.12 Debugging in Spark
        01:11
      • 9.13 Partitioning in Spark
        04:05
      • 9.14 Scheduling in Spark
        03:28
      • 9.15 Shuffling in Spark
        02:41
      • 9.16 Sort Shuffle
        03:18
      • 9.17 Aggregating Data with Pair RDD
        01:33
      • 9.18 Demo: Spark Application with Data Written Back to HDFS and Spark UI
        09:08
      • 9.19 Demo: Changing Spark Application Parameters
        06:27
      • 9.20 Demo: Handling Different File Formats
        02:51
      • 9.21 Demo: Spark RDD with Real-World Application
        04:03
      • 9.22 Demo: Optimizing Spark Jobs
        02:56
      • 9.23 Key Takeaways
        00:20
      • Knowledge Check
      • Spark Core Processing RDD
    • Lesson 11 Spark SQL - Processing DataFrames

      29:08Preview
      • 10.1 Spark SQL Processing DataFrames
        00:32
      • 10.2 Spark SQL Introduction
        02:13
      • 10.3 Spark SQL Architecture
        01:25
      • 10.4 DataFrames
        05:21
      • 10.5 Demo: Handling Various Data Formats
        03:21
      • 10.6 Demo: Implement Various DataFrame Operations
        03:20
      • 10.7 Demo: UDF and UDAF
        02:50
      • 10.8 Interoperating with RDDs
        04:45
      • 10.9 Demo: Process DataFrame Using SQL Query
        02:30
      • 10.10 RDD vs DataFrame vs Dataset
        02:34
      • Processing DataFrames
      • 10.11 Key Takeaways
        00:17
      • Knowledge Check
      • Spark SQL - Processing DataFrames
    • Lesson 12 Spark MLlib - Modeling Big Data with Spark

      34:04Preview
      • 11.1 Spark MLlib Modeling Big Data with Spark
        00:38
      • 11.2 Role of Data Scientist and Data Analyst in Big Data
        02:12
      • 11.3 Analytics in Spark
        03:37
      • 11.4 Machine Learning
        03:27
      • 11.5 Supervised Learning
        02:19
      • 11.6 Demo: Classification of Linear SVM
        03:47
      • 11.7 Demo: Linear Regression with Real World Case Studies
        03:41
      • 11.8 Unsupervised Learning
        01:16
      • 11.9 Demo: Unsupervised Clustering K-Means
        02:45
      • 11.10 Reinforcement Learning
        02:02
      • 11.11 Semi-Supervised Learning
        01:17
      • 11.12 Overview of MLlib
        02:59
      • 11.13 MLlib Pipelines
        03:42
      • 11.14 Key Takeaways
        00:22
      • Knowledge Check
      • Spark MLlib - Modeling Big Data with Spark
    • Lesson 13 Stream Processing Frameworks and Spark Streaming

      01:13:16Preview
      • 12.1 Stream Processing Frameworks and Spark Streaming
        00:34
      • 12.1 Streaming Overview
        01:41
      • 12.2 Real-Time Processing of Big Data
        02:45
      • 12.3 Data Processing Architectures
        04:12
      • 12.4 Demo: Real-Time Data Processing
        02:28
      • 12.5 Spark Streaming
        04:21
      • 12.6 Demo: Writing Spark Streaming Application
        03:15
      • 12.7 Introduction to DStreams
        01:52
      • 12.8 Transformations on DStreams
        03:44
      • 12.9 Design Patterns for Using ForeachRDD
        03:25
      • 12.10 State Operations
        00:46
      • 12.11 Windowing Operations
        03:16
      • 12.12 Join Operations stream-dataset Join
        02:13
      • 12.13 Demo: Windowing of Real-Time Data Processing
        02:32
      • 12.14 Streaming Sources
        01:56
      • 12.15 Demo: Processing Twitter Streaming Data
        03:56
      • 12.16 Structured Spark Streaming
        03:54
      • 12.17 Use Case Banking Transactions
        02:29
      • 12.18 Structured Streaming Architecture Model and Its Components
        04:01
      • 12.19 Output Sinks
        00:49
      • 12.20 Structured Streaming APIs
        03:36
      • 12.21 Constructing Columns in Structured Streaming
        03:07
      • 12.22 Windowed Operations on Event-Time
        03:36
      • 12.23 Use Cases
        01:24
      • 12.24 Demo: Streaming Pipeline
        07:07
      • Spark Streaming
      • 12.25 Key Takeaways
        00:17
      • Knowledge Check
      • Stream Processing Frameworks and Spark Streaming
    • Lesson 14 Spark GraphX

      28:43Preview
      • 13.1 Spark GraphX
        00:35
      • 13.2 Introduction to Graph
        02:38
      • 13.3 GraphX in Spark
        02:41
      • 13.4 Graph Operators
        03:29
      • 13.5 Join Operators
        03:18
      • 13.6 Graph Parallel System
        01:33
      • 13.7 Algorithms in Spark
        03:26
      • 13.8 Pregel API
        02:31
      • 13.9 Use Case of GraphX
        01:02
      • 13.10 Demo: GraphX Vertex Predicate
        02:23
      • 13.11 Demo: Page Rank Algorithm
        02:33
      • 13.12 Key Takeaways
        00:17
      • Knowledge Check
      • Spark GraphX
      • 13.14 Project Assistance
        02:17
    • Practice Projects

      • Car Insurance Analysis
      • Transactional Data Analysis
  • Free Course
  • Core Java

    Preview
    • Lesson 01 - Java Introduction

      01:18:27Preview
      • 1.1 Introduction to Java
        25:37
      • 1.2 Features of Java 8
        11:41
      • 1.3 Object Oriented Programming (OOP)
        23:00
      • 1.4 Fundamentals of Java
        18:09
      • Quiz
    • Lesson 02 - Working with Java Variables

      36:00
      • 2.1 Declaring and Initializing Variables
        11:47
      • 2.2 Primitive Data Types
        06:50
      • 2.3 Read and Write Java Object Fields
        10:27
      • 2.4 Object Lifecycle
        06:56
      • Quiz
    • Lesson 03 - Java Operators and Decision Constructs

      15:01
      • 3.1 Java Operators and Decision Constructs
        15:01
      • Quiz
    • Lesson 04 - Using Loop Constructs in Java

      17:42
      • 4.1 Using Loop Constructs in Java
        17:42
      • Quiz
    • Lesson 05 - Creating and Using Array

      36:16
      • 5.1 Creating and Using One-dimensional Array
        26:53
      • 5.2 Creating and Using Multi-dimensional Array
        09:23
      • Quiz
    • Lesson 06 - Methods and Encapsulation

      35:55Preview
      • 6.1 Java Method
        04:36
      • 6.2 Static and Final Keyword
        15:16
      • 6.3 Constructors and Access Modifiers in Java
        07:04
      • 6.4 Encapsulation
        08:59
      • Quiz
    • Lesson 07 - Inheritance

      40:32Preview
      • 7.1 Polymorphism Casting and Super
        23:46
      • 7.2 Abstract Class and Interfaces
        16:46
      • Quiz
    • Lesson 08 - Exception Handling

      35:58Preview
      • 8.1 Types of Exceptions and Try-catch Statement
        18:48
      • 8.2 Throws Statement and Finally Block
        11:27
      • 8.3 Exception Classes
        05:43
      • Quiz
    • Lesson 09 - Work with Selected classes from the Java API

      01:01:06
      • 9.1 String
        28:16
      • 9.2 Working with StringBuffer
        05:44
      • 9.3 Create and Manipulate Calendar Data
        13:03
      • 9.4 Declare and Use of Arraylist
        14:03
      • Quiz
    • Lesson 10 - Additional Topics

      45:03
      • 10.1 Inner classes Inner Interfaces and Thread
        16:51
      • 10.2 Collection Framework
        05:05
      • 10.3 Comparable Comparator and Iterator
        10:19
      • 10.4 File Handling and Serialization
        12:48
      • Quiz
    • Lesson 11 - JDBC

      47:54Preview
      • 11.1 JDBC and its Architecture
        08:50
      • 11.2 Drivers in JDBC
        03:09
      • 11.3 JDBC API and Examples
        24:44
      • 11.4 Transaction Management in JDBC
        11:11
      • Quiz
    • Lesson 12 - Miscellaneous and Unit Testing

      19:24
      • 12.1 Unit Testing
        19:24
      • Quiz
    • Lesson 13 - Introduction to Java 8

      18:53
      • 13.1 Introduction to Java 8
        18:53
      • Quiz
    • Lesson 14 - Lambda Expression

      14:39
      • 14.1 Lambda Expression
        14:39
      • Quiz
  • Free Course
  • Linux Training

    Preview
    • Lesson 1 - Installing Linux

      35:26Preview
      • 1.1 The Course Overview
        06:31
      • 1.2 Introducing Concepts of Virtualization
        05:47
      • 1.3 Installing CentOS 7 in Virtualbox
        09:16
      • 1.4 How to work with Virtualbox
        05:11
      • 1.5 Connect to Your VM Through SSH
        08:41
    • Lesson 2 - Getting To Know The Command Line

      01:34:32Preview
      • 2.1 Working with Commands
        08:54
      • 2.2 File Globbing
        07:28
      • 2.3 Quoting Commands
        05:06
      • 2.4 Getting Help in the Command Line
        10:09
      • 2.5 Working in the Shell Efficiently
        09:53
      • 2.6 Streams, Redirects, and Pipes
        10:56
      • 2.7 Regular Expressions and grep
        09:17
      • 2.8 The sed Command
        07:01
      • 2.9 The Awk Command
        09:54
      • 2.10 Navigating the Linux Filesystem
        15:54
    • Lesson 3 - It's All About The Files

      01:14:59Preview
      • 3.1 Working with Files
        06:06
      • 3.2 How to Work with File Links
        04:42
      • 3.3 Searching for Files
        10:00
      • 3.4 Working with Users and Groups
        14:23
      • 3.5 Working with File Permissions
        19:23
      • 3.6 Working and Viewing Text Files in Linux
        06:17
      • 3.7 The VIM Text Editor
        14:08
    • Lesson 4 - Working With Command Line

      57:01Preview
      • 4.1 Essential Linux Commands
        08:36
      • 4.2 Additional Linux Programs
        10:26
      • 4.3 Processes
        07:54
      • 4.4 Signals
        04:57
      • 4.5 How to Work with Bash Shell Variables
        07:52
      • 4.6 Introduction to Bash Shell Scripting
        05:11
      • 4.7 Introduction to Bash Shell Scripting 2
        08:09
      • 4.8 How to Automate Script Execution
        03:56
    • Lesson 5 - More Advanced Command Line And Concepts

      01:06:42
      • 5.1 Basic Networking Concepts
        11:21
      • 5.2 Basic Networking Concepts 2
        16:15
      • 5.3 Install New Software and Update the System
        07:43
      • 5.4 Introduction to Services
        05:58
      • 5.5 Basic System Troubleshooting and Firewalling
        10:10
      • 5.6 Introducing ACL
        03:04
      • 5.7 Setuid, Setgid, and Sticky Bit
        12:11
  • Free Course
  • Simplifying data pipelines with Apache Kafka

    Preview
    • Lesson 1 About This Course

      • Learning Objectives
    • Lesson 2- Introduction to Apache Kafka

      22:50Preview
      • Learning Objectives
      • Introduction to Apache Kafka - Part A
        05:12
      • Introduction to Apache Kafka - Part B
        05:55
      • Introduction to Apache Kafka - Part C
        07:13
      • Hands-on Lab 1 Documentation
      • Introduction to Apache Kafka - Lab Solution
        04:30
    • Lesson 3- Kafka Command Line

      22:33Preview
      • Learning Objectives
      • Kafka Command Line - Part A
        05:04
      • Kafka Command Line - Part B
        06:19
      • Hands-on Lab 2 Documentation
      • Kafka Command Line - Lab Solution
        11:10
    • Lesson 4- Kafka Producer Java API

      19:50Preview
      • Learning Objectives
      • Kafka Producer Java API - Part A
        06:18
      • Kafka Producer Java API - Part B
        06:16
      • Hands-on Lab 3 Documentation
      • Kafka Producer Java API - Lab Solution
        07:16
    • Lesson 5- Kafka Consumer Java API

      26:17
      • Learning Objectives
      • Kafka Consumer Java API - Part A
        06:59
      • Kafka Consumer Java API - Part B
        06:59
      • Hands-on Lab 4 Documentation
      • Kafka Consumer Java API - Lab Solution
        12:19
    • Lesson 6- Kafka Connect and Spark Streaming

      27:44
      • Learning Objectives
      • Kafka Connect and Spark Streaming - Part A
        07:31
      • Kafka Connect and Spark Streaming - Part B
        06:59
      • Hands-on Lab 5 Documentation
      • Hands-on Lab 5 Solutions
        13:14
      • Unlocking IBM Certificate
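
The word-count example that recurs throughout the MapReduce and Spark lessons can be sketched in plain Python as a model of the map, shuffle, and reduce phases. This is an illustration of the processing model only, not the distributed implementation:

```python
from collections import defaultdict

def map_phase(line):
    # Emit (word, 1) pairs, as a MapReduce mapper would.
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    # Group values by key, standing in for the framework's shuffle step.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Sum the counts for each word, as the reducer would.
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["big data big ideas", "data pipelines"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = reduce_phase(shuffle(pairs))
print(counts)  # {'big': 2, 'data': 2, 'ideas': 1, 'pipelines': 1}
```

In Hadoop the mappers and reducers run on separate cluster nodes and the shuffle happens over the network; the phases themselves are the same as in this sketch.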

Industry Project

  • Project 1

    Analyzing Historical Insurance claims

    Use Hadoop features to predict patterns and share actionable insights for a car insurance company.

  • Project 2

    Analyzing Intraday price changes

    Use Hive features for data engineering and analysis of New York stock exchange data.

  • Project 3

    Analyzing employee sentiment

    Perform sentiment analysis on employee review data gathered from Google, Netflix, and Facebook.

  • Project 4

    Analyzing Product performance

    Perform product and customer segmentation to increase the sales of Amazon.


Hadoop & Spark Course Advisor

  • Ronald van Loon

    Top 10 Big Data and Data Science Influencer, Director - Adversitement

    Named by Onalytica as one of the three most influential people in Big Data, Ronald also contributes to a number of leading Big Data and Data Science websites, including Datafloq, Data Science Central, and The Guardian, and regularly speaks at renowned events.


Big Data Hadoop Exam & Certification

Big Data Hadoop Training in Brisbane, Australia
  • How will you get Simplilearn's Big Data Course Completion Certificate?

    Online Classroom:

    • Attend one complete batch
    • Complete one project and one simulation test with a minimum score of 80%

    Online Self-Learning:

    • Complete 85% of the course
    • Complete one project and one simulation test with a minimum score of 80%

  • How can I become a Cloudera Certified Hadoop Developer?

    To become a Cloudera Certified Hadoop Developer, you must fulfill both of the following criteria:

    • Successfully complete the Cloudera Hadoop Developer certification training provided by Simplilearn, which will be evaluated by the lead trainer.
    • Pass the Spark and Hadoop Developer exam with a minimum score of 70%. The exam costs USD 295, is taken online, and consists of 8–12 performance-based (hands-on) tasks on a Cloudera Enterprise cluster that must be completed within 120 minutes.

  • What are the pre-requisites for this Hadoop Training Course in Brisbane?

    There are no prerequisites for this course. However, knowledge of Core Java and SQL is beneficial, though not mandatory. If you wish to brush up your Core Java skills, Simplilearn offers a complimentary self-paced course, "Java essentials for Hadoop," when you enroll for this course. For Spark, this course uses Python and Scala, and an e-book is provided to support your learning.

  • What are the prerequisites for learning Big Data Hadoop?

    There are no prerequisites for this course. However, knowledge of Core Java and SQL is beneficial, though not mandatory. If you wish to brush up your Core Java skills, Simplilearn offers a complimentary self-paced course, "Java essentials for Hadoop," when you enroll for this course. For Spark, this course uses Python and Scala, and an e-book is provided to support your learning.

  • How do I become a Big Data Hadoop Architect taking this Developer course?

    Those who are proficient in core Java and SQL technologies can take the Big Data Hadoop certification course offered by Simplilearn to become a Big Data Hadoop Architect.

  • Who provides the certification?

    Upon successful completion of the Big Data Hadoop certification training, you will be awarded the course completion certificate from Simplilearn.

  • Is this course accredited?

    No, this course is not officially accredited.

  • How do I pass Simplilearn's Big Data Hadoop Course Certification exam?

    • Online Classroom: attend one complete batch and complete one project and one simulation test with a minimum score of 80%
    • Online Self-learning: complete 85% of the course and complete one project and one simulation test with a minimum score of 80%

  • How long does it take to complete the Big Data Hadoop certification course?

    It will take about 45–50 hours to successfully complete the Big Data Hadoop certification course.

  • How many attempts do I have to pass the Big Data Hadoop certification exam offered through Cloudera?

    While Simplilearn provides guidance and support to help learners pass the exam on the first attempt, if you do fail, you have a maximum of three retakes to pass successfully.

  • How long does it take to receive the Big Data Hadoop Certification Training Course Certificate from Simplilearn?

    Upon completion of the Big Data Hadoop course, you will receive the Big Data Hadoop certificate immediately.

  • How long is the Big Data Hadoop course certificate from Simplilearn valid for?

    The Big Data Hadoop course certification from Simplilearn has lifelong validity.

  • If I fail Simplilearn's certification exam after completing the Big Data Hadoop certification course, how soon can I retake it?

    You can re-take it immediately.

  • If I pass Simplilearn's Big Data Hadoop certification course exam, when and how do I receive my certificate?

    Upon successful completion of the course, you will receive the certificate through our Learning Management System, which you can download or share via email or LinkedIn.

  • Do you offer a money back guarantee for the training course?

    Yes. We do offer a money back guarantee for many of our training programs. Refer to our Refund Policy and submit refund requests via our Help and Support portal.

  • Do you provide any practice tests as part of this course?

    Yes, we provide 1 practice test as part of our course to help you prepare for the actual certification exam. You can try this free Big Data and Hadoop Developer Practice Test to understand the type of tests that are part of the course curriculum.

Hadoop & Spark Course Reviews

  • Pearl Lee

    Service Manager at United Overseas Bank (Malaysia) Berhad, Melbourne

    Interactive training, good pace. Technical concepts were made easier for me to understand (I don't have much technical background). Trainer is very sincere in helping us learn and grasp the lessons, appreciate it a lot.

  • Olga Barrett

    Career Advisor @ CV Wizard of OZ, Perth

    The study material is appropriately well organized, easy to follow...I am really impressed with the online teaching methodology followed during the training.

  • Satheesh Shivaswamy

    Analyst, Sydney

    Very good introduction to Big data Hadoop. Clearly organized and even a non-technical person can go through the course in a very organized manner.

  • Indu Neelakandan

    Oracle Development DBA at Commonwealth Bank of Australia, Sydney

    The course was amazing. The trainer had a very good knowledge about Hadoop and Spark. He answered our questions patiently and his demos were also very helpful. These online classes made it easier to understand the concepts of big data. Thank you Simplilearn!

  • Solomon Larbi Opoku

    Senior Desktop Support Technician, Washington

    Content looks comprehensive and meets industry and market demand. The combination of theory and practical training is amazing.

  • Navin Ranjan

    Assistant Consultant, Gaithersburg

    Faculty is very good and explains all the things very clearly. Big data is totally new to me so I am not able to understand a few things but after listening to recordings I get most of the things.

  • Joan Schnyder

    Business, Systems Technical Analyst and Data Scientist, New York City

    The pace is perfect! Also, trainer is doing a great job of answering pertinent questions and not unrelated or advanced questions.

  • Ludovick Jacob

    Manager of Enterprise Database Engineering & Support at USAC, Washington

    I really like the content of the course and the way trainer relates it with real-life examples.

  • Puviarasan Sivanantham

    Data Engineer at Fanatics, Inc., Sunnyvale

    Dedication of the trainer towards answering each & every question of the trainees makes us feel great and the online session as real as a classroom session.

  • Richard Kershner

    Software Developer, Colorado Springs

    The trainer was knowledgeable and patient in explaining things. Many things were significantly easier to grasp with a live interactive instructor. I also like that he went out of his way to send additional information and solutions after the class via email.

  • Aaron Whigham

    Business Analyst at CNA Surety, Chicago

    Very knowledgeable trainer, appreciate the time slot as well… Loved everything so far. I am very excited…

  • Rudolf Schier

    Java Software Engineer at DAT Solutions, Portland

    Great approach for the core understanding of Hadoop. Concepts are repeated from different points of view, responding to audience. At the end of the class you understand it.

  • Kinshuk Srivastava

    Data Scientist at Walmart, Little Rock

    The course is very informative and interactive and that is the best part of this training.

  • Priyanka Garg

    Sr. Consultant, Detroit

    Very informative and active sessions. Trainer is easy going and very interactive.

  • Peter Dao

    Senior Technical Analyst at Sutter Health, Sacramento

    The content is well designed and the instructor was excellent.


Why Simplilearn

Simplilearn’s Blended Learning model brings the classroom learning experience online with its world-class LMS. It combines instructor-led training, self-paced learning, and personalized mentoring to provide an immersive learning experience.

  • Self-Paced Online Video

    A 360-degree learning approach that you can adapt to your learning style

  • Live Virtual Classroom

    Engage and learn more with these live and highly-interactive classes alongside your peers

  • 24/7 Teaching Assistance

    Keep engaged with integrated teaching assistance in your desktop and mobile learning

  • Online Practice Labs

    Projects provide you with sample work to show prospective employers

  • Applied Projects

    Real-world projects relevant to what you’re learning throughout the program

  • Learner Social Forums

    A support team focused on helping you succeed alongside a peer community


Hadoop & Spark Training FAQs

  • Why Learn Big Data Hadoop?

    The world is getting increasingly digital, and this means big data is here to stay. In fact, the importance of big data and data analytics is going to continue growing in the coming years. Choosing a career in the field of big data and analytics might just be the type of role that you have been trying to find to meet your career expectations. Professionals who are working in this field can expect an impressive salary, with the median salary for data scientists being $116,000. Even those who are at the entry level will find high salaries, with average earnings of $92,000. As more and more companies realize the need for specialists in big data and analytics, the number of these jobs will continue to grow. Close to 80% of data scientists say there is currently a shortage of professionals working in the field.

  • What are the objectives of this Course?

    The Big Data Hadoop Certification course is designed to give you an in-depth knowledge of the Big Data framework using Hadoop and Spark, including HDFS, YARN, and MapReduce. You will learn to use Pig, Hive, and Impala to process and analyze large datasets stored in the HDFS, and use Sqoop and Flume for data ingestion with our big data training.

    You will master real-time data processing using Spark, including functional programming in Spark, implementing Spark applications, understanding parallel processing in Spark, and using Spark RDD optimization techniques. With our big data course, you will also learn the various interactive algorithms in Spark and use Spark SQL for creating, transforming, and querying data forms.

    As a part of the Big Data course, you will be required to execute real-life, industry-based projects using CloudLab in the domains of banking, telecommunication, social media, insurance, and e-commerce.  This Big Data Hadoop training course will prepare you for the Cloudera CCA175 big data certification.

  • What skills will you learn in the Big Data Hadoop Certification?

    Big Data Hadoop training will enable you to master the concepts of the Hadoop framework and its deployment in a cluster environment. You will learn to:

    • Understand the different components of the Hadoop ecosystem, such as Hadoop 2.7, YARN, MapReduce, Pig, Hive, Impala, HBase, Sqoop, Flume, and Apache Spark
    • Understand Hadoop Distributed File System (HDFS) and YARN architecture, and learn how to work with them for storage and resource management
    • Understand MapReduce and its characteristics and assimilate advanced MapReduce concepts
    • Ingest data using Sqoop and Flume
    • Create database and tables in Hive and Impala, understand HBase, and use Hive and Impala for partitioning
    • Understand different types of file formats, Avro schemas, using Avro with Hive and Sqoop, and schema evolution
    • Understand Flume, its architecture, sources, sinks, channels, and Flume configurations
    • Understand and work with HBase, its architecture and data storage, and learn the difference between HBase and RDBMS
    • Gain a working knowledge of Pig and its components
    • Do functional programming in Spark, and implement and build Spark applications
    • Understand resilient distribution datasets (RDD) in detail
    • Gain an in-depth understanding of parallel processing in Spark and Spark RDD optimization techniques
    • Understand the common use cases of Spark and various interactive algorithms
    • Learn Spark SQL, creating, transforming, and querying data frames
    • Prepare for Cloudera CCA175 Big Data certification
       

  • Who should take this Big Data Hadoop Course in Brisbane?

    Big Data career opportunities are on the rise, and Hadoop is quickly becoming a must-know technology in Big Data architecture. Big Data training is best suited for IT, data management, and analytics professionals looking to gain expertise in Big Data, including:

    • Software Developers and Architects
    • Analytics Professionals
    • Senior IT professionals
    • Testing and Mainframe Professionals
    • Data Management Professionals
    • Business Intelligence Professionals
    • Project Managers
    • Aspiring Data Scientists
    • Graduates looking to build a career in Big Data Analytics

  • What projects will you complete during the Big Data Hadoop course?

    The Hadoop Training course includes five real-life, industry-based projects. Successful evaluation of one of the following two projects is a part of the certification eligibility criteria.

    Project 1
    Domain- Banking

    Description: A Portuguese banking institution ran a marketing campaign to convince potential customers to invest in a bank term deposit. Their marketing campaigns were conducted through phone calls, and sometimes the same customer was contacted more than once. Your job is to analyze the data collected from the marketing campaign.

    Project 2
    Domain- Telecommunication

    Description: A mobile phone service provider has launched a new Open Network campaign, inviting users to raise complaints about the towers in their locality if they face issues with their mobile network. The company has collected a dataset of the users who raised complaints. The fourth and fifth fields of the dataset contain the users' latitude and longitude, which is important information for the company. You must extract this latitude and longitude information from the dataset and create three clusters of users with a k-means algorithm.
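    In the course itself this clustering step would typically be done with Spark MLlib's KMeans; the underlying algorithm can be sketched in plain Python. The coordinates below are hypothetical sample data, not the project dataset:

    ```python
    import random

    def kmeans(points, k=3, iterations=20, seed=42):
        """Cluster (latitude, longitude) pairs into k groups with plain k-means."""
        random.seed(seed)
        centroids = random.sample(points, k)  # pick k initial centroids at random
        clusters = [[] for _ in range(k)]
        for _ in range(iterations):
            clusters = [[] for _ in range(k)]
            for p in points:
                # assign each point to its nearest centroid (squared Euclidean distance)
                nearest = min(range(k),
                              key=lambda i: (p[0] - centroids[i][0]) ** 2
                                          + (p[1] - centroids[i][1]) ** 2)
                clusters[nearest].append(p)
            for i, members in enumerate(clusters):
                if members:  # recompute each centroid as the mean of its members
                    centroids[i] = (sum(m[0] for m in members) / len(members),
                                    sum(m[1] for m in members) / len(members))
        return centroids, clusters

    # hypothetical user coordinates extracted from the complaint dataset
    points = [(12.97, 77.59), (12.98, 77.60), (28.61, 77.21),
              (28.62, 77.23), (19.07, 72.87), (19.08, 72.88)]
    centroids, clusters = kmeans(points, k=3)
    ```

    Spark MLlib parallelizes exactly these two steps (assignment and centroid update) across the cluster, which is what makes the approach viable on the full dataset.
    
    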

    For additional practice, we have three more projects to help you start your Hadoop and Spark journey.

    Project 3
    Domain- Social Media

    Description: As part of a recruiting exercise, a major social media company asked candidates to analyze a dataset from Stack Exchange. You will be using the dataset to arrive at certain key insights.

    Project 4
    Domain- Website providing movie-related information

    Description: IMDB is an online database of movie-related information. IMDB users rate movies on a scale of 1 to 5 (1 being the worst and 5 being the best) and provide reviews. The dataset also has additional information, such as the release year of each movie. You are tasked with analyzing the data collected.

    Project 5
    Domain- Insurance

    Description: A US-based insurance provider has decided to launch a new medical insurance program targeting various customers. To help the company understand the market better, you must perform a series of data analyses using Hadoop.

     

  • How does Big Data Hadoop training help your career?

    The field of big data and analytics is a dynamic one, adapting rapidly as technology evolves over time. Those professionals who take the initiative and excel in big data and analytics are well-positioned to keep pace with changes in the technology space and fill growing job opportunities. Some trends in big data include: 

    • Global Hadoop market to reach $84.6 billion by 2021 (Allied Market Research)
    • Shortage of 1.4-1.9 million Hadoop data analysts in the US alone by 2018 (McKinsey)
    • Hadoop administrators in the US receive salaries of up to $123,000 (Indeed)

  • How will you work on the projects?

    You will use Simplilearn’s CloudLab to complete projects.

  • What is the market trend for Big Data Hadoop in Brisbane City?

    According to TeamLease, a staffing solutions company, a data scientist with about 5 years of working experience has the potential to earn about 400,000 AUD per annum, while CAs with the same level of experience earn about 50K AUD and engineers about 25K AUD. If this salary trend is anything to go by, the demand for data professionals has never been higher.

    In 2017, a report by Analytics AU Magazine highlighted the rising trend of data-oriented jobs. According to this report, the number of big data jobs in Australia almost doubled in 2017, and over 50,000 positions are yet to be filled. Among Australian cities, Brisbane has the highest number of analytics jobs: in 2017, over 25% of all analytics jobs originated in Brisbane.

  • What are the top companies that offer Big Data Hadoop Jobs in Brisbane?

    According to LinkedIn, the top companies hiring skilled Big Data Hadoop professionals in Brisbane include:

    EMC, IBM, Microsoft, Accenture, Dassault, SAS, and Equifax

  • What is the average salary for a Big Data Hadoop Developer in Brisbane?

    According to PayScale, the average annual salary of a Big Data Developer in Brisbane is 55K AUD, and professionals with real-world experience can earn up to 200,000 AUD per annum. With the Cloudera CCA175 Big Data Hadoop Developer certification, professionals can earn about 20% more than their uncertified peers.

     

  • What are the System Requirements?

    The tools you’ll need to attend training are:
    • Windows: Windows XP SP3 or higher
    • Mac: OSX 10.6 or higher
    • Internet speed: Preferably 512 Kbps or higher
    • Headset, speakers, and microphone: You’ll need headphones or speakers to hear instructions clearly, as well as a microphone to talk to others. You can use a headset with a built-in microphone, or separate speakers and microphone.

  • What are the modes of training offered for this course?

    We offer this training in the following modes:


    • Live Virtual Classroom or Online Classroom: Attend the course remotely from your desktop via video conferencing to increase productivity and reduce the time spent away from work or home.
    • Online Self-Learning: In this mode, you will access the video training and go through the course at your own convenience.

     

  • Can I cancel my enrolment? Do I get a refund?

    Yes, you can cancel your enrolment if necessary. We will refund the course price after deducting an administration fee. To learn more, you can view our Refund Policy.

  • Are there any group discounts for classroom training programs?

    Yes, we have group discount options for our training programs. Contact us using the form on the right of any page on the Simplilearn website, or select the Live Chat link. Our customer service representatives can provide more details.

  • What payment options are available?

    Payments can be made using any of the following options. You will be emailed a receipt after the payment is made.
    • Visa Credit or Debit Card
    • MasterCard
    • American Express
    • Diners Club
    • PayPal

  • I’d like to learn more about this training program. Whom should I contact?

    Contact us using the form on the right of any page on the Simplilearn website, or select the Live Chat link. Our customer service representatives will be able to give you more details.

  • Who are our faculties and how are they selected?

    All of our highly qualified trainers are industry experts with at least 10-12 years of relevant teaching experience in Big Data Hadoop. Each of them has gone through a rigorous selection process which includes profile screening, technical evaluation, and a training demo before they are certified to train for us. We also ensure that only those trainers with a high alumni rating continue to train for us.

  • What is Global Teaching Assistance?

    Our teaching assistants are a dedicated team of subject matter experts here to help you get certified in your first attempt. They engage students proactively to ensure the course path is being followed and help you enrich your learning experience, from class onboarding to project mentoring and job assistance. Teaching Assistance is available during business hours for this Big Data Hadoop training course.

  • What is covered under the 24/7 Support promise?

    We offer 24/7 support through email, chat, and calls. We also have a dedicated team that provides on-demand assistance through our community forum. What’s more, you will have lifetime access to the community forum, even after completion of your course with us to discuss Big Data and Hadoop topics.

  • If I am not from a Programming Background but have a basic knowledge of Programming, can I still learn Hadoop?

    Yes, you can learn Hadoop without being from a software background. We provide complimentary courses in Java and Linux so that you can brush up on your programming skills. This will help you in learning Hadoop technologies better and faster.

  • Can I switch from Self-Paced Training To Online Instructor-Led Training?

    Yes. You can upgrade from self-paced training to instructor-led training by paying the difference in fees and joining the next batch of classes, which will be notified to you separately.

  • What if I miss a class?

    • Simplilearn's Flexi-pass lets you attend classes that fit your busy schedule, giving you the advantage of being trained by world-class faculty with decades of industry experience while combining the best of online classroom training and self-paced learning
    • With Flexi-pass, Simplilearn gives you access to as many as 15 sessions for 90 days

  • What are the other top Big Data Certification Courses Simplilearn is offering?

    Keeping up with the Big Data & Analytics boom, Simplilearn has tailored comprehensive Big Data certification programs that ensure complete development as a Big Data professional.

    Few of the courses offered around Big Data are:

    In addition to the above, Simplilearn has created the Big Data Hadoop Architect Masters Program, which follows a curated learning path.

    Simplilearn also offers the following Masters program with respect to Data Science and Business Intelligence:

  • What is online classroom training?

    Online classroom training for Big Data Hadoop course certification is conducted via online live streaming of each class. The classes are conducted by a Big Data Hadoop certified trainer with more than 15 years of work and training experience.

  • Is this live training, or will I watch pre-recorded videos?

    If you enroll for self-paced e-learning, you will have access to pre-recorded videos. If you enroll for the online classroom Flexi Pass, you will have access to live training conducted online as well as the pre-recorded videos.

  • Are the training and course material effective in preparing me for the Big Data Hadoop certification exam?

    Yes, Simplilearn’s training and course materials are designed to fully prepare you to pass the Big Data Hadoop certification exam.

  • What certification will I receive after completing the training?

    After successful completion of the Big Data Hadoop course training, you will be awarded the course completion certificate from Simplilearn.

  • What is Big data and Hadoop?

    Big data refers to collections of data sets so large and complex that they cannot be processed using traditional techniques. Hadoop is an open-source framework that allows organizations to store and process big data in a parallel and distributed environment.
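    The "parallel and distributed" processing Hadoop provides follows the MapReduce model: a map phase emits key-value pairs, and a reduce phase aggregates them by key. A minimal single-machine sketch of MapReduce-style word counting (Hadoop distributes these same steps across a cluster):

    ```python
    from collections import defaultdict

    def map_phase(lines):
        # map: emit a (word, 1) pair for every word in every input line
        for line in lines:
            for word in line.lower().split():
                yield (word, 1)

    def reduce_phase(pairs):
        # shuffle + reduce: group pairs by key and sum the counts,
        # as Hadoop does between its map and reduce stages
        counts = defaultdict(int)
        for word, one in pairs:
            counts[word] += one
        return dict(counts)

    lines = ["big data big insights", "hadoop processes big data"]
    result = reduce_phase(map_phase(lines))
    # result["big"] == 3, result["data"] == 2
    ```

    On a real cluster, many mappers and reducers run this logic concurrently over HDFS blocks, which is what lets Hadoop scale beyond a single machine.
    
    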
     

  • What is Spark?

    Spark is an open-source framework that provides several interconnected platforms, systems, and standards for big data projects. Many consider Spark a more advanced product than Hadoop.
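    Spark's core abstraction is the RDD, whose transformations (such as map and filter) are lazy: no work happens until an action (such as collect or count) is called. A toy pure-Python illustration of that idea follows; this is not the real Spark API, just a sketch of the evaluation model:

    ```python
    class MiniRDD:
        """A toy stand-in for a Spark RDD: transformations are lazy, actions trigger work."""

        def __init__(self, data):
            self._data = data  # an iterable; nothing is computed yet

        def map(self, f):
            return MiniRDD(f(x) for x in self._data)          # lazy transformation

        def filter(self, pred):
            return MiniRDD(x for x in self._data if pred(x))  # lazy transformation

        def collect(self):
            return list(self._data)                           # action: forces evaluation

    rdd = MiniRDD(range(10))
    evens_squared = rdd.filter(lambda x: x % 2 == 0).map(lambda x: x * x)
    # nothing has executed yet; collect() triggers the whole pipeline
    print(evens_squared.collect())  # [0, 4, 16, 36, 64]
    ```

    Real Spark adds partitioning, fault tolerance, and cluster scheduling on top of this lazy-pipeline idea, which is why chained transformations can be optimized before any data moves.
    
    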
     

  • How can beginners learn about big data and Hadoop?

    Hadoop is one of the leading technological frameworks being widely used to leverage big data in an organization. Taking your first step toward big data is really challenging. Therefore, we believe it’s important to learn the basics about the technology before you pursue your certification. Simplilearn provides free resource articles, tutorials, and YouTube videos to help you to understand the Hadoop ecosystem and cover your basics. Our extensive course on Big Data Hadoop and Spark Developer will get you started with big data.
     

Our Brisbane Correspondence / Mailing address

Simplilearn Americas, Inc, Level 7, 22, 23, 127 Creek Street, Brisbane, 4000, Australia, Call us at: 1-800-982-536

Find Big Data Hadoop Certification Training Course in other cities

Melbourne | Sydney
  • Disclaimer
  • PMP, PMI, PMBOK, CAPM, PgMP, PfMP, ACP, PBA, RMP, SP, and OPM3 are registered marks of the Project Management Institute, Inc.