Course Overview

Training Options

Self-Paced Learning

$399

  • Lifetime access to high-quality self-paced e-learning content curated by industry experts
  • 24x7 learner assistance and support

Corporate Training

Customized to your team's needs

  • Blended learning delivery model (self-paced eLearning and/or instructor-led options)
  • Flexible pricing options
  • Enterprise-grade Learning Management System (LMS)
  • Enterprise dashboards for individuals and teams
  • 24x7 learner assistance and support

Course Curriculum

Course Content

  • CS-Apache Spark and Scala Certification Training

    • Lesson 00 - Course Overview

      04:08
      • 0.01 Introduction
        00:11
      • 0.02 Course Objectives
        00:28
      • 0.03 Course Overview
        00:36
      • 0.04 Target Audience
        00:31
      • 0.05 Course Prerequisites
        00:21
      • 0.06 Value to the Professionals
        00:48
      • 0.07 Value to the Professionals (contd.)
        00:20
      • 0.08 Value to the Professionals (contd.)
        00:21
      • 0.09 Lessons Covered
        00:24
      • 0.10 Conclusion
        00:08
    • Lesson 01 - Introduction to Spark

      28:36
      • 1.01 Introduction
        00:14
      • 1.02 Objectives
        00:26
      • 1.03 Need of New Generation Distributed Systems
        01:15
      • 1.04 Limitations of MapReduce in Hadoop
        01:05
      • 1.05 Limitations of MapReduce in Hadoop (contd.)
        01:06
      • 1.06 Batch vs. Real-Time Processing
        01:09
      • 1.07 Application of Stream Processing
        01:05
      • 1.08 Application of In-Memory Processing
        01:47
      • 1.09 Introduction to Apache Spark
        00:44
      • 1.10 Components of a Spark Project
        02:21
      • 1.11 History of Spark
        00:50
      • 1.12 Language Flexibility in Spark
        00:54
      • 1.13 Spark Execution Architecture
        01:13
      • 1.14 Automatic Parallelization of Complex Flows
        00:58
      • 1.15 Automatic Parallelization of Complex Flows-Important Points
        01:13
      • 1.16 APIs That Match User Goals
        01:05
      • 1.17 Apache Spark-A Unified Platform of Big Data Apps
        01:37
      • 1.18 More Benefits of Apache Spark
        01:05
      • 1.19 Running Spark in Different Modes
        00:40
      • 1.20 Installing Spark as a Standalone Cluster-Configurations
        00:07
      • 1.21 Demo-Install Apache Spark
        00:07
      • 1.22 Demo-Install Apache Spark
        02:40
      • 1.23 Overview of Spark on a Cluster
        00:47
      • 1.24 Tasks of Spark on a Cluster
        00:36
      • 1.25 Companies Using Spark-Use Cases
        00:46
      • 1.26 Hadoop Ecosystem vs. Apache Spark
        00:31
      • 1.27 Hadoop Ecosystem vs. Apache Spark (contd.)
        00:42
      • 1.28 Summary
        00:39
      • 1.29 Summary (contd.)
        00:41
      • 1.30 Conclusion
        00:13
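
To make the lesson's topics concrete, here is a minimal Scala sketch of the kind of program the installation and cluster-overview demos build toward: creating a SparkContext against a local master and running a trivial distributed computation. The application name, master URL, and numbers are illustrative, not taken from the course.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SparkIntro {
  def main(args: Array[String]): Unit = {
    // Run locally with all available cores; in the standalone-cluster demos the
    // master could instead be a URL such as spark://host:7077
    val conf = new SparkConf().setAppName("SparkIntro").setMaster("local[*]")
    val sc   = new SparkContext(conf)

    // Distribute a small collection across the cluster and sum it
    val total = sc.parallelize(1 to 100).sum()
    println(s"Sum of 1..100 = $total")

    sc.stop()
  }
}
```
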
    • Lesson 02 - Introduction to Programming in Scala

      37:14
      • 2.01 Introduction
        00:10
      • 2.02 Objectives
        00:16
      • 2.03 Introduction to Scala
        01:32
      • 2.04 Basic Data Types
        00:24
      • 2.05 Basic Literals
        00:34
      • 2.06 Basic Literals (contd.)
        00:24
      • 2.07 Basic Literals (contd.)
        00:21
      • 2.08 Introduction to Operators
        00:31
      • 2.09 Use Basic Literals and the Arithmetic Operator
        00:07
      • 2.10 Demo Use Basic Literals and the Arithmetic Operator
        03:17
      • 2.11 Use the Logical Operator
        00:07
      • 2.12 Demo Use the Logical Operator
        01:40
      • 2.13 Introduction to Type Inference
        00:33
      • 2.14 Type Inference for Recursive Methods
        00:09
      • 2.15 Type Inference for Polymorphic Methods and Generic Classes
        00:30
      • 2.16 Unreliability on Type Inference Mechanism
        00:22
      • 2.17 Mutable Collection vs. Immutable Collection
        01:13
      • 2.18 Functions
        00:21
      • 2.19 Anonymous Functions
        00:21
      • 2.20 Objects
        01:07
      • 2.21 Classes
        00:36
      • 2.22 Use Type Inference, Functions, Anonymous Function, and Class
        00:09
      • 2.23 Demo Use Type Inference, Functions, Anonymous Function and Class
        07:39
      • 2.24 Traits as Interfaces
        00:57
      • 2.25 Traits-Example
        00:08
      • 2.26 Collections
        00:41
      • 2.27 Types of Collections
        00:25
      • 2.28 Types of Collections (contd.)
        00:26
      • 2.29 Lists
        00:28
      • 2.30 Perform Operations on Lists
        00:07
      • 2.31 Demo Use Data Structures
        04:09
      • 2.32 Maps
        00:45
      • 2.33 Pattern Matching
        00:33
      • 2.34 Implicits
        00:36
      • 2.35 Implicits (contd.)
        00:17
      • 2.36 Streams
        00:21
      • 2.37 Use Data Structures
        00:07
      • 2.38 Demo Perform Operations on Lists
        03:24
      • 2.39 Summary
        00:37
      • 2.40 Summary (contd.)
        00:36
      • 2.41 Conclusion
        00:14
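
As a rough illustration of the Scala features listed above (type inference, functions, anonymous functions, classes, traits, collections, and pattern matching), here is a small self-contained sketch; all names and values are illustrative.

```scala
// A trait used as an interface with a default implementation
trait Greeter {
  def greet(name: String): String = s"Hello, $name"
}

// A simple class mixing in the trait
class Course(val title: String, val hours: Int) extends Greeter

object ScalaBasics {
  def main(args: Array[String]): Unit = {
    val course = new Course("Spark and Scala", 40) // type inferred as Course
    println(course.greet("learner"))
    println(s"${course.title} (${course.hours} hours)")

    // Immutable List with an anonymous function passed to map
    val durations = List(10, 20, 30).map(d => d * 2)
    println(durations.sum)

    // Immutable Map and pattern matching on an Option
    val levels = Map("intro" -> 1, "advanced" -> 2)
    levels.get("intro") match {
      case Some(level) => println(s"intro level = $level")
      case None        => println("not found")
    }
  }
}
```
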
    • Lesson 03 - Using RDD for Creating Applications in Spark

      50:28
      • 3.01 Introduction
        00:10
      • 3.02 Objectives
        00:22
      • 3.03 RDDs API
        01:39
      • 3.04 Creating RDDs
        00:36
      • 3.05 Creating RDDs-Referencing an External Dataset
        00:18
      • 3.06 Referencing an External Dataset-Text Files
        00:51
      • 3.07 Referencing an External Dataset-Text Files (contd.)
        00:49
      • 3.08 Referencing an External Dataset-Sequence Files
        00:32
      • 3.09 Referencing an External Dataset-Other Hadoop Input Formats
        00:46
      • 3.10 Creating RDDs-Important Points
        01:08
      • 3.11 RDD Operations
        00:37
      • 3.12 RDD Operations-Transformations
        00:47
      • 3.13 Features of RDD Persistence
        00:57
      • 3.14 Storage Levels of RDD Persistence
        00:19
      • 3.15 Invoking the Spark Shell
        00:22
      • 3.16 Importing Spark Classes
        00:14
      • 3.17 Creating the SparkContext
        00:25
      • 3.18 Loading a File in Shell
        00:10
      • 3.19 Performing Some Basic Operations on Files in Spark Shell RDDs
        00:20
      • 3.20 Packaging a Spark Project with SBT
        00:50
      • 3.21 Running a Spark Project with SBT
        00:31
      • 3.22 Demo-Build a Scala Project
        00:06
      • 3.23 Build a Scala Project
        06:50
      • 3.24 Demo-Build a Spark Java Project
        00:07
      • 3.25 Build a Spark Java Project
        04:31
      • 3.26 Shared Variables-Broadcast
        01:20
      • 3.27 Shared Variables-Accumulators
        00:51
      • 3.28 Writing a Scala Application
        00:20
      • 3.29 Demo-Run a Scala Application
        00:07
      • 3.30 Run a Scala Application
        01:43
      • 3.31 Demo-Write a Scala Application Reading the Hadoop Data
        00:07
      • 3.32 Write a Scala Application Reading the Hadoop Data
        01:22
      • 3.33 Demo-Run a Scala Application Reading the Hadoop Data
        00:07
      • 3.34 Run a Scala Application Reading the Hadoop Data
        02:21
      • 3.35 DoubleRDD Methods
        00:08
      • 3.36 PairRDD Methods-Join
        00:46
      • 3.37 PairRDD Methods-Others
        00:06
      • 3.38 Java PairRDD Methods
        00:09
      • 3.39 Java PairRDD Methods (contd.)
        00:06
      • 3.40 General RDD Methods
        00:05
      • 3.41 General RDD Methods (contd.)
        00:05
      • 3.42 Java RDD Methods
        00:07
      • 3.43 Java RDD Methods (contd.)
        00:06
      • 3.44 Common Java RDD Methods
        00:09
      • 3.45 Spark Java Function Classes
        00:12
      • 3.46 Method for Combining JavaPairRDD Functions
        00:41
      • 3.47 Transformations in RDD
        00:33
      • 3.48 Other Methods
        00:07
      • 3.49 Actions in RDD
        00:08
      • 3.50 Key-Value Pair RDD in Scala
        00:31
      • 3.51 Key-Value Pair RDD in Java
        00:43
      • 3.52 Using MapReduce and Pair RDD Operations
        00:24
      • 3.53 Reading Text File from HDFS
        00:16
      • 3.54 Reading Sequence File from HDFS
        00:21
      • 3.55 Writing Text Data to HDFS
        00:18
      • 3.56 Writing Sequence File to HDFS
        00:12
      • 3.57 Using GroupBy
        00:07
      • 3.58 Using GroupBy (contd.)
        00:05
      • 3.59 Demo-Run a Scala Application Performing GroupBy Operation
        00:07
      • 3.60 Run a Scala Application Performing GroupBy Operation
        03:12
      • 3.61 Demo-Run a Scala Application Using the Scala Shell
        00:06
      • 3.62 Run a Scala Application Using the Scala Shell
        04:02
      • 3.63 Demo-Write and Run a Java Application
        00:06
      • 3.64 Write and Run a Java Application
        01:48
      • 3.65 Summary
        00:53
      • 3.66 Summary (contd.)
        00:59
      • 3.67 Conclusion
        00:15
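
A minimal sketch in the spirit of this lesson's RDD demos: creating an RDD from an external text file, chaining transformations into a pair RDD, persisting it, and running actions. The HDFS paths are hypothetical placeholders.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RddWordCount {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("RddWordCount").setMaster("local[*]"))

    // Create an RDD by referencing an external dataset (path is hypothetical)
    val lines = sc.textFile("hdfs:///data/input.txt")

    // Transformations build a pair RDD; nothing executes until an action runs
    val counts = lines
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    // Persist the RDD since two actions below reuse it
    counts.cache()
    println(s"Distinct words: ${counts.count()}")
    counts.saveAsTextFile("hdfs:///data/word-counts")

    sc.stop()
  }
}
```
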
    • Lesson 04 - Running SQL Queries Using Spark SQL

      39:25
      • 4.01 Introduction
        00:10
      • 4.02 Objectives
        00:17
      • 4.03 Importance of Spark SQL
        01:01
      • 4.04 Benefits of Spark SQL
        00:47
      • 4.05 DataFrames
        00:50
      • 4.06 SQLContext
        00:50
      • 4.07 SQLContext (contd.)
        01:12
      • 4.08 Creating a DataFrame
        00:10
      • 4.09 Using DataFrame Operations
        00:21
      • 4.10 Using DataFrame Operations (contd.)
        00:05
      • 4.11 Demo-Run Spark SQL with a DataFrame
        00:06
      • 4.12 Run Spark SQL with a DataFrame
        08:52
      • 4.13 Using the Reflection-Based Approach
        00:38
      • 4.14 Using the Reflection-Based Approach (contd.)
        00:08
      • 4.15 Using the Programmatic Approach
        00:44
      • 4.16 Using the Programmatic Approach (contd.)
        00:06
      • 4.17 Demo-Run Spark SQL Programmatically
        00:08
      • 4.18 Run Spark SQL Programmatically
        09:20
      • 4.19 Save Modes
        00:31
      • 4.20 Saving to Persistent Tables
        00:45
      • 4.21 Parquet Files
        00:18
      • 4.22 Partition Discovery
        00:37
      • 4.23 Schema Merging
        00:28
      • 4.24 JSON Data
        00:34
      • 4.25 Hive Table
        00:45
      • 4.26 DML Operation-Hive Queries
        00:27
      • 4.27 Demo-Run Hive Queries Using Spark SQL
        00:06
      • 4.28 Run Hive Queries Using Spark SQL
        04:58
      • 4.29 JDBC to Other Databases
        00:49
      • 4.30 Supported Hive Features
        00:38
      • 4.31 Supported Hive Features (contd.)
        00:22
      • 4.32 Supported Hive Data Types
        00:13
      • 4.33 Case Classes
        00:14
      • 4.34 Case Classes (contd.)
        00:07
      • 4.35 Summary
        00:48
      • 4.36 Summary (contd.)
        00:48
      • 4.37 Conclusion
        00:12
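
For orientation, a short sketch of the SQLContext-era DataFrame workflow this lesson outlines: loading JSON into a DataFrame, applying DataFrame operations, and running SQL over a registered temporary table. The file path and table name are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object SparkSqlExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("SparkSqlExample").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)

    // Build a DataFrame from a JSON file (path is hypothetical)
    val people = sqlContext.read.json("hdfs:///data/people.json")
    people.printSchema()

    // DataFrame operations...
    people.filter(people("age") > 21).select("name", "age").show()

    // ...or plain SQL over a registered temporary table
    people.registerTempTable("people")
    sqlContext.sql("SELECT name FROM people WHERE age > 21").show()

    sc.stop()
  }
}
```
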
    • Lesson 05 - Spark Streaming

      34:49
      • 5.01 Introduction
        00:10
      • 5.02 Objectives
        00:14
      • 5.03 Introduction to Spark Streaming
        00:49
      • 5.04 Working of Spark Streaming
        00:19
      • 5.05 Streaming Word Count
        01:34
      • 5.06 Micro Batch
        00:19
      • 5.07 DStreams
        00:34
      • 5.08 DStreams (contd.)
        00:38
      • 5.09 Input DStreams and Receivers
        01:19
      • 5.10 Input DStreams and Receivers (contd.)
        00:54
      • 5.11 Basic Sources
        01:14
      • 5.12 Advanced Sources
        00:49
      • 5.13 Transformations on DStreams
        00:15
      • 5.14 Transformations on DStreams (contd.)
        00:06
      • 5.15 Output Operations on DStreams
        00:29
      • 5.16 Design Patterns for Using ForeachRDD
        01:14
      • 5.17 DataFrame and SQL Operations
        00:25
      • 5.18 DataFrame and SQL Operations (contd.)
        00:20
      • 5.19 Checkpointing
        01:25
      • 5.20 Enabling Checkpointing
        00:39
      • 5.21 Socket Stream
        00:59
      • 5.22 File Stream
        00:11
      • 5.23 Stateful Operations
        00:28
      • 5.24 Window Operations
        01:22
      • 5.25 Types of Window Operations
        00:11
      • 5.26 Types of Window Operations (contd.)
        00:05
      • 5.27 Join Operations-Stream-Dataset Joins
        00:20
      • 5.28 Join Operations-Stream-Stream Joins
        00:33
      • 5.29 Monitoring Spark Streaming Application
        01:18
      • 5.30 Performance Tuning-High Level
        00:20
      • 5.31 Demo-Capture and Process the Netcat Data
        00:07
      • 5.32 Capture and Process the Netcat Data
        05:00
      • 5.33 Demo-Capture and Process the Flume Data
        00:07
      • 5.34 Capture and Process the Flume Data
        05:08
      • 5.35 Demo-Capture the Twitter Data
        00:06
      • 5.36 Capture the Twitter Data
        02:33
      • 5.37 Summary
        01:00
      • 5.38 Summary (contd.)
        01:04
      • 5.39 Conclusion
        00:11
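
A compact sketch of a streaming word count along the lines of the netcat demo: a StreamingContext with a small batch interval reads a socket stream (fed by `nc -lk 9999`) and applies a windowed reduce. Host, port, durations, and the checkpoint path are illustrative choices, not values prescribed by the course.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingWordCount {
  def main(args: Array[String]): Unit = {
    // At least two threads: one for the receiver, one for processing
    val conf = new SparkConf().setAppName("StreamingWordCount").setMaster("local[2]")
    val ssc  = new StreamingContext(conf, Seconds(5))

    // Checkpoint directory, as the lesson's checkpointing topic describes (illustrative path)
    ssc.checkpoint("checkpoint-dir")

    // Socket stream fed by `nc -lk 9999` on the same machine
    val lines = ssc.socketTextStream("localhost", 9999)

    // Word counts over a 30-second window, sliding every 10 seconds
    val counts = lines
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKeyAndWindow(_ + _, Seconds(30), Seconds(10))

    counts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```
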
    • Lesson 06 - Spark ML Programming

      39:49
      • 6.01 Introduction
        00:10
      • 6.02 Objectives
        00:19
      • 6.03 Introduction to Machine Learning
        01:35
      • 6.04 Applications of Machine Learning
        00:21
      • 6.05 Machine Learning in Spark
        00:33
      • 6.06 DataFrames
        00:32
      • 6.07 Transformers and Estimators
        00:59
      • 6.08 Pipeline
        00:48
      • 6.09 Working of a Pipeline
        01:41
      • 6.10 Working of a Pipeline (contd.)
        00:44
      • 6.11 DAG Pipelines
        00:33
      • 6.12 Runtime Checking
        00:20
      • 6.13 Parameter Passing
        00:59
      • 6.14 General Machine Learning Pipeline-Example
        00:05
      • 6.15 Model Selection via Cross-Validation
        01:15
      • 6.16 Supported Types, Algorithms, and Utilities
        00:30
      • 6.17 Data Types
        01:25
      • 6.18 Feature Extraction and Basic Statistics
        00:42
      • 6.19 Clustering
        00:37
      • 6.20 K-Means
        00:55
      • 6.21 K-Means (contd.)
        00:05
      • 6.22 Demo-Perform Clustering Using K-Means
        00:07
      • 6.23 Perform Clustering Using K-Means
        04:41
      • 6.24 Gaussian Mixture
        00:57
      • 6.25 Power Iteration Clustering (PIC)
        01:16
      • 6.26 Latent Dirichlet Allocation (LDA)
        00:34
      • 6.27 Latent Dirichlet Allocation (LDA) (contd.)
        01:45
      • 6.28 Collaborative Filtering
        01:13
      • 6.29 Classification
        00:16
      • 6.30 Classification (contd.)
        00:06
      • 6.31 Regression
        00:41
      • 6.32 Example of Regression
        00:56
      • 6.33 Demo-Perform Classification Using Linear Regression
        00:08
      • 6.34 Perform Classification Using Linear Regression
        02:00
      • 6.35 Demo-Run Linear Regression
        00:06
      • 6.36 Run Linear Regression
        02:14
      • 6.37 Demo-Perform Recommendation Using Collaborative Filtering
        00:05
      • 6.38 Perform Recommendation Using Collaborative Filtering
        02:23
      • 6.39 Demo-Run Recommendation System
        00:06
      • 6.40 Run Recommendation System
        02:45
      • 6.41 Summary
        01:14
      • 6.42 Summary (contd.)
        00:57
      • 6.43 Conclusion
        00:11
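
As a hedged illustration of the clustering material, here is a small MLlib K-Means sketch: training on a tiny in-memory set of dense vectors and reporting the cluster centers and cost. The data, k, and iteration count are arbitrary illustrative values.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

object KMeansSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("KMeansSketch").setMaster("local[*]"))

    // Tiny in-memory dataset of 2-dimensional points (purely illustrative)
    val points = sc.parallelize(Seq(
      Vectors.dense(0.0, 0.0), Vectors.dense(0.1, 0.1),
      Vectors.dense(9.0, 9.0), Vectors.dense(9.1, 9.2)
    )).cache()

    // Train a model with k = 2 clusters and 20 iterations
    val model = KMeans.train(points, 2, 20)
    model.clusterCenters.foreach(center => println(s"Cluster center: $center"))

    // Within Set Sum of Squared Errors, a common measure for choosing k
    println(s"WSSSE = ${model.computeCost(points)}")

    sc.stop()
  }
}
```
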
    • Lesson 07 - Spark GraphX Programming

      28:04
      • 7.01 Introduction
        00:12
      • 7.02 Objectives
        00:17
      • 7.03 Introduction to Graph-Parallel System
        01:13
      • 7.04 Limitations of Graph-Parallel System
        00:49
      • 7.05 Introduction to GraphX
        01:21
      • 7.06 Introduction to GraphX (contd.)
        00:06
      • 7.07 Importing GraphX
        00:10
      • 7.08 The Property Graph
        01:25
      • 7.09 The Property Graph (contd.)
        00:06
      • 7.10 Creating a Graph
        00:13
      • 7.11 Demo-Create a Graph Using GraphX
        00:07
      • 7.12 Create a Graph Using GraphX
        10:08
      • 7.13 Triplet View
        00:30
      • 7.14 Graph Operators
        00:50
      • 7.15 List of Operators
        00:23
      • 7.16 List of Operators (contd.)
        00:05
      • 7.17 Property Operators
        00:18
      • 7.18 Structural Operators
        01:02
      • 7.19 Subgraphs
        00:21
      • 7.20 Join Operators
        01:09
      • 7.21 Demo-Perform Graph Operations Using GraphX
        00:07
      • 7.22 Perform Graph Operations Using GraphX
        05:46
      • 7.23 Demo-Perform Subgraph Operations
        00:07
      • 7.27 Demo-Perform MapReduce Operations
        00:08
      • 7.29 Counting Degree of Vertex
        00:32
      • 7.30 Collecting Neighbors
        00:28
      • 7.36 Conclusion
        00:11
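
To ground the GraphX topics above, a brief sketch that builds a small property graph from vertex and edge RDDs, walks its triplet view, and counts vertex degrees. All vertex names, roles, and relationship labels are made up for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}

object GraphxSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("GraphxSketch").setMaster("local[*]"))

    // Vertices carry a (name, role) property; edges carry a relationship label
    val vertices = sc.parallelize(Seq[(VertexId, (String, String))](
      (1L, ("alice", "student")),
      (2L, ("bob",   "professor")),
      (3L, ("carol", "student"))
    ))
    val edges = sc.parallelize(Seq(
      Edge(1L, 2L, "advisor"),
      Edge(3L, 2L, "advisor"),
      Edge(1L, 3L, "collaborator")
    ))

    val graph = Graph(vertices, edges)

    // Triplet view: source and destination properties alongside the edge attribute
    graph.triplets.collect().foreach { t =>
      println(s"${t.srcAttr._1} -[${t.attr}]-> ${t.dstAttr._1}")
    }

    // Degree of each vertex
    graph.degrees.collect().foreach { case (id, deg) =>
      println(s"vertex $id has degree $deg")
    }

    sc.stop()
  }
}
```
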

Why Online Bootcamp

  • Develop skills for real career growth: a cutting-edge curriculum designed with guidance from industry and academia to develop job-ready skills
  • Learn from experts active in their field, not out-of-touch trainers: leading practitioners who bring current best practices and case studies to sessions that fit into your work schedule
  • Learn by working on real-world problems: capstone projects involving real-world data sets, with virtual labs for hands-on learning
  • Structured guidance ensuring learning never stops: 24x7 learning support from mentors and a community of like-minded peers to resolve any conceptual doubts