Big-Data and Hadoop Certification Training in Washington, District Of Columbia

Key features

MONEY BACK GUARANTEE

How this works:

At Simplilearn, we greatly value the trust of our patrons. Our courses are designed to deliver an effective learning experience and have helped over half a million learners find their professional calling. But if you feel your course is not to your liking, we offer a 7-day money-back guarantee. Just send us a refund request within 7 days of purchase, and we will refund 100% of your payment, no questions asked!

For Self-Paced Learning:

Raise a refund request within 7 days of purchasing the course. The money-back guarantee is void if the participant has accessed more than 25% of the content.

For Instructor-Led Training:

Raise a refund request within 7 days of the commencement of the first batch you are eligible to attend. The money-back guarantee is void if the participant has accessed more than 25% of the content of an e-learning course or has attended Online Classrooms for more than 1 day.

  • 40 hours of instructor-led training
  • 24 hours of self-paced video
  • 5 real-life industry projects in banking, telecom, insurance, and e-commerce domains
  • Hands-on practice with CloudLab
  • Includes training on YARN, MapReduce, Pig, Hive, Impala, HBase, and Apache Spark
  • Aligned to Cloudera CCA175 certification exam

Course description

  • What’s the focus of this course?

    The Big Data Hadoop and Spark developer course has been designed to impart in-depth knowledge of Big Data processing using Hadoop and Spark. The course is packed with real-life projects and case studies to be executed in CloudLab.

    Mastering Hadoop and related tools: The course provides you with an in-depth understanding of the Hadoop framework, including HDFS, YARN, and MapReduce. You will learn to use Pig, Hive, and Impala to process and analyze large datasets stored in HDFS, and to use Sqoop and Flume for data ingestion.

    Mastering real-time data processing using Spark: You will learn to do functional programming in Spark, implement Spark applications, understand parallel processing in Spark, and use Spark RDD optimization techniques. You will also learn the various iterative algorithms in Spark and use Spark SQL for creating, transforming, and querying DataFrames.

    As a part of the course, you will be required to execute real-life, industry-based projects using CloudLab. The projects included are in the domains of Banking, Telecommunication, Social Media, Insurance, and E-commerce. This Big Data course also prepares you for the Cloudera CCA175 certification.

  • What are the course objectives?

    This course will enable you to:
    • Understand the different components of the Hadoop ecosystem, such as Hadoop 2.7, YARN, MapReduce, Pig, Hive, Impala, HBase, Sqoop, Flume, and Apache Spark
    • Understand Hadoop Distributed File System (HDFS) and YARN as well as their architecture, and learn how to work with them for storage and resource management
    • Understand MapReduce and its characteristics, and assimilate some advanced MapReduce concepts
    • Get an overview of Sqoop and Flume and describe how to ingest data using them
    • Create databases and tables in Hive and Impala, understand HBase, and use Hive and Impala for partitioning
    • Understand different types of file formats, Avro schemas, using Avro with Hive and Sqoop, and schema evolution
    • Understand Flume, the Flume architecture, sources, Flume sinks, channels, and Flume configurations
    • Understand HBase, its architecture and data storage, how to work with it, and how it differs from an RDBMS
    • Gain a working knowledge of Pig and its components
    • Do functional programming in Spark
    • Understand resilient distributed datasets (RDDs) in detail
    • Implement and build Spark applications
    • Gain an in-depth understanding of parallel processing in Spark and Spark RDD optimization techniques
    • Understand the common use cases of Spark and the various iterative algorithms
    • Learn Spark SQL, including creating, transforming, and querying DataFrames
    • Prepare for Cloudera Big Data CCA175 certification

  • Who should take this course?

    Big Data career opportunities are on the rise, and Hadoop is quickly becoming a must-know technology for the following professionals:
    • Software Developers and Architects
    • Analytics Professionals
    • Senior IT professionals
    • Testing and Mainframe professionals
    • Data Management Professionals
    • Business Intelligence Professionals
    • Project Managers
    • Aspiring Data Scientists
    • Graduates looking to build a career in Big Data Analytics
    Prerequisite:
    • As knowledge of Java is necessary for this course, we provide complimentary access to the “Java Essentials for Hadoop” course
    • For Spark, we use Python and Scala, and an e-book is provided to help you with both
    • Knowledge of an operating system like Linux is useful for the course

  • What is CloudLab?

    CloudLab is a cloud-based Hadoop and Spark lab environment that Simplilearn offers along with the course to ensure hassle-free execution of the hands-on projects that you need to complete in the Hadoop and Spark Developer course.

    With CloudLab, you do not need to install and maintain Hadoop or Spark on a virtual machine. Instead, you’ll be able to access a preconfigured environment on CloudLab via your browser. This closely resembles the environments companies use today to increase the scalability and availability of their Hadoop installations.

    You’ll have access to CloudLab from the Simplilearn LMS (Learning Management System) for the duration of the course. You can learn more about CloudLab by viewing our CloudLab video.

  • What projects are included in this course?

    The course includes 5 real-life, industry-based projects. CloudLab has been provided for a hassle-free execution of these projects. Successful evaluation of one of the following 2 projects is a part of the certification eligibility criteria.

    Project 1
    Domain- Banking
    Description- A Portuguese banking institution ran a marketing campaign to convince potential customers to invest in a bank term deposit. The marketing campaign was based on phone calls. Often, the same customer was contacted more than once by phone to assess whether they would want to subscribe to the bank term deposit. You have to analyze the data collected through the marketing campaign.

    Project 2
    Domain- Telecommunication
    Description- A mobile phone service provider has introduced a new Open Network campaign. The company has invited users to raise a complaint about the towers in their locality if they face issues with their mobile network. The company has collected a dataset of the users who raised complaints. The fourth and fifth fields of the dataset contain the latitude and longitude of the users, which is important information for the company. You have to extract this latitude and longitude information from the available dataset and create three clusters of users with the k-means algorithm.
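
    The clustering step described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the records are taken to be comma-separated, the sample rows are invented, the helper names (`parse_coordinates`, `kmeans`) are hypothetical, and plain Python stands in for whatever tooling you use in CloudLab. The real dataset layout may differ.

```python
import math


def parse_coordinates(lines):
    """Read (latitude, longitude) from the 4th and 5th comma-separated fields."""
    points = []
    for line in lines:
        fields = line.strip().split(",")
        if len(fields) < 5:
            continue  # skip malformed rows
        try:
            points.append((float(fields[3]), float(fields[4])))
        except ValueError:
            continue  # skip rows with non-numeric coordinates, e.g. a header
    return points


def kmeans(points, k=3, iterations=20):
    """Plain k-means (Lloyd's algorithm) using Euclidean distance on raw degrees.

    Treating degrees as flat coordinates is a simplification of this sketch.
    """
    # Deterministic farthest-point initialisation (a choice made for this sketch):
    # start from the first point, then keep adding the point that is farthest
    # from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )

    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, labels


# Invented sample rows; only the position of fields 4 and 5 follows the description.
sample_rows = [
    "user_id,date,operator,latitude,longitude",
    "1,2018-01-01,OpA,38.90,-77.03",
    "2,2018-01-01,OpB,40.71,-74.00",
    "3,2018-01-02,OpA,34.05,-118.24",
    "4,2018-01-02,OpC,38.95,-77.00",
    "5,2018-01-03,OpB,40.75,-73.98",
    "6,2018-01-03,OpA,34.10,-118.30",
]

points = parse_coordinates(sample_rows)
centroids, labels = kmeans(points, k=3)
for point, label in zip(points, labels):
    print(point, "-> cluster", label)
```

    On a full-size dataset, the same two steps (extract the two fields, then cluster) would normally be run with a distributed k-means implementation rather than this in-memory loop.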

    For further practice, we have three more projects to help you start your Hadoop and Spark journey.

    Project 3
    Domain- Social Media
    Description- As part of a recruiting exercise, a major social media company asked candidates to analyze a data set from Stack Exchange. You will use the data set to arrive at certain key insights.

    Project 4
    Domain- Website providing movie-related information
    Description- IMBD is an online database of movie-related information. IMBD users rate the movies and provide reviews. They rate the movies on a scale of 1 to 5, with 1 being the worst and 5 being the best. The data set also has additional information, such as the release year of the movie. You have to analyze the data collected.

    Project 5
    Domain- Insurance
    Description- A US-based insurance provider has decided to launch a new medical insurance program targeting various customers. To help a customer understand the current realities and the market better, you have to perform a series of data analyses using Hadoop.

Course preview

    • Lesson 00 - Course Introduction 04:10
      • 0.1 Introduction 04:10
    • Lesson 01 - Introduction to Big data and Hadoop Ecosystem 15:43
      • 1.1 Introduction 00:38
      • 1.2 Overview to Big Data and Hadoop 05:13
      • 1.3 Pop Quiz
      • 1.4 Hadoop Ecosystem 08:57
      • 1.5 Quiz
      • 1.6 Key Takeaways 00:55
    • Lesson 02 - HDFS and YARN 47:08
      • 2.1 Introduction 06:10
      • 2.2 HDFS Architecture and Components 08:59
      • 2.3 Pop Quiz
      • 2.4 Block Replication Architecture 09:53
      • 2.5 YARN Introduction 21:25
      • 2.6 Quiz
      • 2.7 Key Takeaways 00:41
      • 2.8 Hands-on Exercise
    • Lesson 03 - MapReduce and Sqoop 57:00
      • 3.1 Introduction 00:41
      • 3.2 Why MapReduce 11:57
      • 3.3 Small Data and Big Data 15:53
      • 3.4 Pop Quiz
      • 3.5 Data Types in Hadoop 04:23
      • 3.6 Joins in MapReduce 04:43
      • 3.7 What is Sqoop 18:21
      • 3.8 Quiz
      • 3.9 Key Takeaways 01:02
      • 3.10 Hands-on Exercise
    • Lesson 04 - Basics of Hive and Impala 19:00
      • 4.1 Introduction 04:07
      • 4.2 Pop Quiz
      • 4.3 Interacting with Hive and Impala 14:07
      • 4.4 Quiz
      • 4.5 Key Takeaways 00:46
    • Lesson 05 - Working with Hive and Impala 28:36
      • 5.1 Working with Hive and Impala 07:08
      • 5.2 Pop Quiz
      • 5.3 Data Types in Hive 07:47
      • 5.4 Validation of Data 07:47
      • 5.5 What is HCatalog and Its Uses 05:25
      • 5.6 Quiz
      • 5.7 Key Takeaways 00:29
      • 5.8 Hands-on Exercise
    • Lesson 06 - Types of Data Formats 14:35
      • 6.1 Introduction 00:44
      • 6.2 Types of File Format 02:35
      • 6.3 Pop Quiz
      • 6.4 Data Serialization 03:11
      • 6.5 Importing MySQL and Creating hivetb 04:32
      • 6.6 Parquet With Sqoop 02:37
      • 6.7 Quiz
      • 6.8 Key Takeaways 00:56
      • 6.9 Hands-on Exercise
    • Lesson 07 - Advanced Hive Concept and Data File Partitioning 17:00
      • 7.1 Introduction 07:41
      • 7.2 Pop Quiz
      • 7.3 Overview of the Hive Query Language 08:18
      • 7.4 Quiz
      • 7.5 Key Takeaways 01:01
      • 7.6 Hands-on Exercise
    • Lesson 08 - Apache Flume and HBase 28:06
      • 8.1 Introduction 12:29
      • 8.2 Pop Quiz
      • 8.3 Introduction to HBase 14:40
      • 8.4 Quiz
      • 8.5 Key Takeaways 00:57
      • 8.6 Hands-on Exercise
    • Lesson 09 - Pig 18:08
      • 9.1 Introduction 10:45
      • 9.2 Pop Quiz
      • 9.3 Getting Datasets for Pig Development 06:45
      • 9.4 Quiz
      • 9.5 Key Takeaways 00:38
      • 9.6 Hands-on Exercise
    • Lesson 10 - Basics of Apache Spark 39:54
      • 10.1 Introduction 16:04
      • 10.2 Spark - Architecture, Execution, and Related Concepts 07:10
      • 10.3 Pop Quiz
      • 10.4 RDD Operations 10:39
      • 10.5 Functional Programming in Spark 05:34
      • 10.6 Quiz
      • 10.7 Key Takeaways 00:27
      • 10.8 Hands-on Exercise
    • Lesson 11 - RDDs in Spark 16:09
      • 11.1 Introduction 00:46
      • 11.2 RDD Data Types and RDD Creation 10:14
      • 11.3 Pop Quiz
      • 11.4 Operations in RDDs 04:35
      • 11.5 Quiz
      • 11.6 Key Takeaways 00:34
      • 11.7 Hands-on Exercise
    • Lesson 12 - Implementation of Spark Applications 13:54
      • 12.1 Introduction 03:57
      • 12.2 Running Spark on YARN 01:27
      • 12.3 Pop Quiz
      • 12.4 Running a Spark Application 01:47
      • 12.5 Dynamic Resource Allocation 01:06
      • 12.6 Configuring Your Spark Application 04:24
      • 12.7 Quiz
      • 12.8 Key Takeaways 01:13
    • Lesson 13 - Spark Parallel Processing 08:40
      • 13.1 Introduction 05:41
      • 13.2 Pop Quiz
      • 13.3 Parallel Operations on Partitions 02:28
      • 13.4 Quiz
      • 13.5 Key Takeaways 00:31
      • 13.6 Hands-on Exercise
    • Lesson 14 - Spark RDD Optimization Techniques 14:23
      • 14.1 Introduction 04:40
      • 14.2 Pop Quiz
      • 14.3 RDD Persistence 08:59
      • 14.4 Quiz
      • 14.5 Key Takeaways 00:44
      • 14.6 Hands-on Exercise
    • Lesson 15 - Spark Algorithm 27:09
      • 15.1 Introduction 00:49
      • 15.2 Spark: An Iterative Algorithm 03:13
      • 15.3 Introduction To Graph Parallel System 02:34
      • 15.4 Pop Quiz
      • 15.5 Introduction To Machine Learning 10:27
      • 15.6 Introduction To Three C's 08:07
      • 15.7 Quiz
      • 15.8 Key Takeaways 01:59
    • What’s next? 05:28
      • The Next Step 05:28
    • Lesson 16 - Spark SQL 13:21
      • 16.1 Introduction 06:36
      • 16.2 Pop Quiz
      • 16.3 Interoperating with RDDs 06:08
      • 16.4 Quiz
      • 16.5 Key Takeaways 00:37
      • 16.6 Hands-on Exercise
    • Projects
      • Project For Submission
      • Projects with solutions
    • Simulation Test Paper Instructions 00:20
      • Instructions 00:20
    • Course Feedback
      • Course Feedback
    • Lesson 01 - Essentials of Java for Hadoop 31:10
      • 1.1 Essentials of Java for Hadoop 00:19
      • 1.2 Lesson Objectives 00:24
      • 1.3 Java Definition 00:27
      • 1.4 Java Virtual Machine (JVM) 00:34
      • 1.5 Working of Java 01:01
      • 1.6 Running a Basic Java Program 00:56
      • 1.7 Running a Basic Java Program (contd.) 01:15
      • 1.8 Running a Basic Java Program in NetBeans IDE 00:11
      • 1.9 BASIC JAVA SYNTAX 00:12
      • 1.10 Data Types in Java 00:26
      • 1.11 Variables in Java 01:31
      • 1.12 Naming Conventions of Variables 01:21
      • 1.13 Type Casting 01:05
      • 1.14 Operators 00:30
      • 1.15 Mathematical Operators 00:28
      • 1.16 Unary Operators 00:15
      • 1.17 Relational Operators 00:19
      • 1.18 Logical or Conditional Operators 00:19
      • 1.19 Bitwise Operators 01:21
      • 1.20 Static Versus Non Static Variables 00:54
      • 1.21 Static Versus Non Static Variables (contd.) 00:17
      • 1.22 Statements and Blocks of Code 01:21
      • 1.23 Flow Control 00:47
      • 1.24 If Statement 00:40
      • 1.25 Variants of if Statement 01:07
      • 1.26 Nested If Statement 00:40
      • 1.27 Switch Statement 00:36
      • 1.28 Switch Statement (contd.) 00:34
      • 1.29 Loop Statements 01:19
      • 1.30 Loop Statements (contd.) 00:49
      • 1.31 Break and Continue Statements 00:44
      • 1.32 Basic Java Constructs 01:09
      • 1.33 Arrays 01:16
      • 1.34 Arrays (contd.) 01:07
      • 1.35 JAVA CLASSES AND METHODS 00:09
      • 1.36 Classes 00:46
      • 1.37 Objects 01:21
      • 1.38 Methods 01:01
      • 1.39 Access Modifiers 00:49
      • 1.40 Summary 00:41
      • 1.41 Thank You 00:09
    • Lesson 02 - Java Constructors 21:31
      • 2.1 Java Constructors 00:22
      • 2.2 Objectives 00:42
      • 2.3 Features of Java 01:08
      • 2.4 Classes Objects and Constructors 01:19
      • 2.5 Constructors 00:34
      • 2.6 Constructor Overloading 01:08
      • 2.7 Constructor Overloading (contd.) 00:28
      • 2.8 PACKAGES 00:09
      • 2.9 Definition of Packages 01:12
      • 2.10 Advantages of Packages 00:29
      • 2.11 Naming Conventions of Packages 00:28
      • 2.12 INHERITANCE 00:09
      • 2.13 Definition of Inheritance 01:07
      • 2.14 Multilevel Inheritance 01:15
      • 2.15 Hierarchical Inheritance 00:23
      • 2.16 Method Overriding 00:55
      • 2.17 Method Overriding (contd.) 00:35
      • 2.18 Method Overriding (contd.) 00:15
      • 2.19 ABSTRACT CLASSES 00:10
      • 2.20 Definition of Abstract Classes 00:41
      • 2.21 Usage of Abstract Classes 00:36
      • 2.22 INTERFACES 00:08
      • 2.23 Features of Interfaces 01:03
      • 2.24 Syntax for Creating Interfaces 00:24
      • 2.25 Implementing an Interface 00:23
      • 2.26 Implementing an Interface (contd.) 00:13
      • 2.27 INPUT AND OUTPUT 00:14
      • 2.28 Features of Input and Output 00:49
      • 2.29 System.in.read() Method 00:20
      • 2.30 Reading Input from the Console 00:31
      • 2.31 Stream Objects 00:21
      • 2.32 String Tokenizer Class 00:43
      • 2.33 Scanner Class 00:32
      • 2.34 Writing Output to the Console 00:28
      • 2.35 Summary 01:03
      • 2.36 Thank You 00:14
    • Lesson 03 - Essential Classes and Exceptions in Java 28:37
      • 3.1 Essential Classes and Exceptions in Java 00:18
      • 3.2 Objectives 00:31
      • 3.3 The Enums in Java 01:00
      • 3.4 Program Using Enum 00:44
      • 3.5 ArrayList 00:41
      • 3.6 ArrayList Constructors 00:38
      • 3.7 Methods of ArrayList 01:02
      • 3.8 ArrayList Insertion 00:47
      • 3.9 ArrayList Insertion (contd.) 00:38
      • 3.10 Iterator 00:39
      • 3.11 Iterator (contd.) 00:33
      • 3.12 ListIterator 00:46
      • 3.13 ListIterator (contd.) 01:00
      • 3.14 Displaying Items Using ListIterator 00:32
      • 3.15 For-Each Loop 00:35
      • 3.16 For-Each Loop (contd.) 00:23
      • 3.17 Enumeration 00:30
      • 3.18 Enumeration (contd.) 00:25
      • 3.19 HASHMAPS 00:15
      • 3.20 Features of Hashmaps 00:56
      • 3.21 Hashmap Constructors 01:36
      • 3.22 Hashmap Methods 00:58
      • 3.23 Hashmap Insertion 00:44
      • 3.24 HASHTABLE CLASS 00:21
      • 3.25 Hashtable Class and Constructors 01:25
      • 3.26 Hashtable Methods 00:41
      • 3.27 Hashtable Methods 00:48
      • 3.28 Hashtable Insertion and Display 00:29
      • 3.29 Hashtable Insertion and Display (contd.) 00:22
      • 3.30 EXCEPTIONS 00:22
      • 3.31 Exception Handling 01:06
      • 3.32 Exception Classes 00:26
      • 3.33 User-Defined Exceptions 01:04
      • 3.34 Types of Exceptions 00:44
      • 3.35 Exception Handling Mechanisms 00:54
      • 3.36 Try-Catch Block 00:15
      • 3.37 Multiple Catch Blocks 00:40
      • 3.38 Throw Statement 00:33
      • 3.39 Throw Statement (contd.) 00:25
      • 3.40 User-Defined Exceptions 00:11
      • 3.41 Advantages of Using Exceptions 00:25
      • 3.42 Error Handling and finally block 00:30
      • 3.43 Summary 00:41
      • 3.44 Thank You 00:04

Exam & certification

  • What do I need to do to unlock my Simplilearn certificate?

    Online Classroom:
    • Complete 1 project and 1 simulation test with a minimum score of 80%.
    Online Self-Learning:
    • Complete 1 project and 1 simulation test with a minimum score of 80%.

Reviews

Simplilearn is an excellent online platform for online trainings with flexible hours of training and well-planned course content with great depth and case studies. The most interesting part which differentiates Simplilearn from other online vendors is the quality of the customer service - 24 / 7. I would strongly recommend Simplilearn to all who are looking for a change in career. Enroll for the courses and get the experience to become a professional.

I really like the content of the course and the way trainer relates it with real-life examples.

Dedication of the trainer towards answering each & every question of the trainees makes us feel great and the online session as real as a classroom session.

The trainer was knowledgeable and patient in explaining things. Many things were significantly easier to grasp with a live interactive instructor. I also like that he went out of his way to send additional information and solutions after the class via email.

Very knowledgeable trainer, appreciate the time slot as well… Loved everything so far. I am very excited…

Great approach for the core understanding of Hadoop. Concepts are repeated from different points of view, responding to audience. At the end of the class you understand it.

The course is very informative and interactive and that is the best part of this training.

Very informative and active sessions. Trainer is easy going and very interactive.

The content is well designed and the instructor was excellent.

The trainer really went the extra mile to help me work along. Thanks

Excellent learning experience. The training was superb! Thanks Simplilearn for arranging such wonderful sessions.

This course has provided me both theoretical and practical knowledge.

The training was good in terms of explanation and clearing the concepts theoretically. The fundamentals were covered.

The Big Data course content was elaborate and the training was great.

The entire Big Data and Hadoop course content was completed and covered in-depth in 4 days. The training was good.

Course advisor

Sina Jamshidi Big Data Lead at Bell Labs

Sina has over 10 years of experience in the technology field as a Big Data Architect at Bell Labs and a Platinum-level trainer. Sina is very passionate about building a Big Data education ecosystem and has contributed to a number of public and journal publications.

FAQs

  • What are the System Requirements?

    To do the projects, just log in to CloudLab from your LMS.

  • Who are the trainers?

    The trainings are delivered by highly qualified and certified instructors with relevant industry experience.

  • What are the modes of training offered for this course?

    We offer this training in the following modes:

    1. Live Virtual Classroom or Online Classroom: With online classroom training, you have the option to attend the course remotely from your desktop via video conferencing. This format reduces productivity challenges and decreases the time you spend away from work or home.
    2. Online Self-Learning: In this mode, you will receive the lecture videos and you can go through the course as per your convenience.

  • Can I cancel my enrolment? Do I get a refund?

    Yes, you can cancel your enrolment if necessary. We will refund the course price after deducting an administration fee. To learn more, you can view our Refund Policy.

  • Are there any group discounts for classroom training programs?

    Yes, we have group discount options for our training programs. Contact us using the form on the right of any page on the Simplilearn website, or select the Live Chat link. Our customer service representatives will be able to give you more details.

  • What payment options are available?

    Payments can be made using any of the following options. You will be emailed a receipt after the payment is made.
    • Visa Credit or Debit card
    • MasterCard
    • American Express
    • Diners Club
    • PayPal

  • I’d like to learn more about this training program. Who should I contact?

    Contact us using the form on the right of any page on the Simplilearn website, or select the Live Chat link. Our customer service representatives will be able to give you more details.

  • Who are our faculty members and how are they selected?

    All our trainers are working professionals and industry experts with at least 10-12 years of relevant teaching experience.

    Each of them has gone through a rigorous selection process, which includes profile screening, technical evaluation, and a training demo, before they are certified to train for us.

    We also ensure that only those trainers with a high alumni rating continue to train for us.

  • What is Global Teaching Assistance?

    Our teaching assistants are here to help you get certified on your first attempt.

    They are a dedicated team of subject matter experts to help you at every step and enrich your learning experience from class onboarding to project mentoring and job assistance.

    They engage with the students proactively to ensure the course path is followed.

    Teaching Assistance is available during business hours.

  • What is covered under the 24/7 Support promise?

    We offer 24/7 support through email, chat, and calls.

    We also have a dedicated team that provides on-demand assistance through our community forum. What’s more, you will have lifetime access to the community forum, even after completion of your course with us.

Contact Us

+1-844-532-7688

(Toll Free)

Washington

Washington, D.C. is the national capital of the United States and a popular tourist destination for visitors from around the globe. The city has many historical landmarks, such as the Lincoln Memorial and the White House, along with museums showcasing historical artifacts from different parts of the world and other famous political structures. Apart from its historical and political connections, the city also has a strong tradition of performing arts, sports, and other cultural activities. Washington, D.C.'s economy is robust, with a continuous increase in exports of manufactured goods and growing foreign investment in financial, service, and IT firms. This creates a stable environment for professionals working in these sectors. Experienced professionals can improve their career growth by enrolling in job-related courses like PMP, ITIL, Scrum, Agile, Six Sigma, Cloud Computing, and CISSP.

  • Disclaimer
  • PMP, PMI, PMBOK, CAPM, PgMP, PfMP, ACP, PBA, RMP, SP, and OPM3 are registered marks of the Project Management Institute, Inc.