Course Overview

Training Options

Self-Paced Learning

$399

  • Lifetime access to high-quality self-paced e-learning content curated by industry experts
  • 24x7 learner assistance and support

Corporate Training

Customized to your team's needs

  • Blended learning delivery model (self-paced eLearning and/or instructor-led options)
  • Flexible pricing options
  • Enterprise-grade Learning Management System (LMS)
  • Enterprise dashboards for individuals and teams
  • 24x7 learner assistance and support

Course Curriculum

Course Content

  • CS-Apache Spark and Scala Certification Training

    • Lesson 00 - Course Overview

      04:08
      • 0.01 Introduction
        00:11
      • 0.02 Course Objectives
        00:28
      • 0.03 Course Overview
        00:36
      • 0.04 Target Audience
        00:31
      • 0.05 Course Prerequisites
        00:21
      • 0.06 Value to the Professionals
        00:48
      • 0.07 Value to the Professionals (contd.)
        00:20
      • 0.08 Value to the Professionals (contd.)
        00:21
      • 0.09 Lessons Covered
        00:24
      • 0.10 Conclusion
        00:08
    • Lesson 01 - Introduction to Spark

      28:36
      • 1.01 Introduction
        00:14
      • 1.02 Objectives
        00:26
      • 1.03 Need of New Generation Distributed Systems
        01:15
      • 1.04 Limitations of MapReduce in Hadoop
        01:05
      • 1.05 Limitations of MapReduce in Hadoop (contd.)
        01:06
      • 1.06 Batch vs. Real-Time Processing
        01:09
      • 1.07 Application of Stream Processing
        01:05
      • 1.08 Application of In-Memory Processing
        01:47
      • 1.09 Introduction to Apache Spark
        00:44
      • 1.10 Components of a Spark Project
        02:21
      • 1.11 History of Spark
        00:50
      • 1.12 Language Flexibility in Spark
        00:54
      • 1.13 Spark Execution Architecture
        01:13
      • 1.14 Automatic Parallelization of Complex Flows
        00:58
      • 1.15 Automatic Parallelization of Complex Flows-Important Points
        01:13
      • 1.16 APIs That Match User Goals
        01:05
      • 1.17 Apache Spark-A Unified Platform of Big Data Apps
        01:37
      • 1.18 More Benefits of Apache Spark
        01:05
      • 1.19 Running Spark in Different Modes
        00:40
      • 1.20 Installing Spark as a Standalone Cluster-Configurations
        00:07
      • 1.21 Demo-Install Apache Spark
        00:07
      • 1.22 Demo-Install Apache Spark
        02:40
      • 1.23 Overview of Spark on a Cluster
        00:47
      • 1.24 Tasks of Spark on a Cluster
        00:36
      • 1.25 Companies Using Spark-Use Cases
        00:46
      • 1.26 Hadoop Ecosystem vs. Apache Spark
        00:31
      • 1.27 Hadoop Ecosystem vs. Apache Spark (contd.)
        00:42
      • 1.28 Summary
        00:39
      • 1.29 Summary (contd.)
        00:41
      • 1.30 Conclusion
        00:13
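
To make the lesson's topics concrete, here is a minimal Scala sketch of the kind of program the installation and cluster-overview demos build toward: creating a SparkContext against a local master and running a trivial distributed computation. The application name, master URL, and numbers are illustrative, not taken from the course.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SparkIntro {
  def main(args: Array[String]): Unit = {
    // Run locally with all available cores; in the standalone-cluster demos the
    // master could instead be a URL such as spark://host:7077
    val conf = new SparkConf().setAppName("SparkIntro").setMaster("local[*]")
    val sc   = new SparkContext(conf)

    // Distribute a small collection across the cluster and sum it
    val total = sc.parallelize(1 to 100).sum()
    println(s"Sum of 1..100 = $total")

    sc.stop()
  }
}
```
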
    • Lesson 02 - Introduction to Programming in Scala

      37:14
      • 2.01 Introduction
        00:10
      • 2.02 Objectives
        00:16
      • 2.03 Introduction to Scala
        01:32
      • 2.04 Basic Data Types
        00:24
      • 2.05 Basic Literals
        00:34
      • 2.06 Basic Literals (contd.)
        00:24
      • 2.07 Basic Literals (contd.)
        00:21
      • 2.08 Introduction to Operators
        00:31
      • 2.09 Use Basic Literals and the Arithmetic Operator
        00:07
      • 2.10 Demo Use Basic Literals and the Arithmetic Operator
        03:17
      • 2.11 Use the Logical Operator
        00:07
      • 2.12 Demo Use the Logical Operator
        01:40
      • 2.13 Introduction to Type Inference
        00:33
      • 2.14 Type Inference for Recursive Methods
        00:09
      • 2.15 Type Inference for Polymorphic Methods and Generic Classes
        00:30
      • 2.16 Unreliability on Type Inference Mechanism
        00:22
      • 2.17 Mutable Collection vs. Immutable Collection
        01:13
      • 2.18 Functions
        00:21
      • 2.19 Anonymous Functions
        00:21
      • 2.20 Objects
        01:07
      • 2.21 Classes
        00:36
      • 2.22 Use Type Inference, Functions, Anonymous Function, and Class
        00:09
      • 2.23 Demo Use Type Inference, Functions, Anonymous Function and Class
        07:39
      • 2.24 Traits as Interfaces
        00:57
      • 2.25 Traits-Example
        00:08
      • 2.26 Collections
        00:41
      • 2.27 Types of Collections
        00:25
      • 2.28 Types of Collections (contd.)
        00:26
      • 2.29 Lists
        00:28
      • 2.30 Perform Operations on Lists
        00:07
      • 2.31 Demo Use Data Structures
        04:09
      • 2.32 Maps
        00:45
      • 2.33 Pattern Matching
        00:33
      • 2.34 Implicits
        00:36
      • 2.35 Implicits (contd.)
        00:17
      • 2.36 Streams
        00:21
      • 2.37 Use Data Structures
        00:07
      • 2.38 Demo Perform Operations on Lists
        03:24
      • 2.39 Summary
        00:37
      • 2.40 Summary (contd.)
        00:36
      • 2.41 Conclusion
        00:14
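
As a rough illustration of the Scala features listed above (type inference, functions, anonymous functions, classes, traits, collections, and pattern matching), here is a small self-contained sketch; all names and values are illustrative.

```scala
// A trait used as an interface with a default implementation
trait Greeter {
  def greet(name: String): String = s"Hello, $name"
}

// A simple class mixing in the trait
class Course(val title: String, val hours: Int) extends Greeter

object ScalaBasics {
  def main(args: Array[String]): Unit = {
    val course = new Course("Spark and Scala", 40) // type inferred as Course
    println(course.greet("learner"))
    println(s"${course.title} (${course.hours} hours)")

    // Immutable List with an anonymous function passed to map
    val durations = List(10, 20, 30).map(d => d * 2)
    println(durations.sum)

    // Immutable Map and pattern matching on an Option
    val levels = Map("intro" -> 1, "advanced" -> 2)
    levels.get("intro") match {
      case Some(level) => println(s"intro level = $level")
      case None        => println("not found")
    }
  }
}
```
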
    • Lesson 03 - Using RDD for Creating Applications in Spark

      50:28
      • 3.01 Introduction
        00:10
      • 3.02 Objectives
        00:22
      • 3.03 RDDs API
        01:39
      • 3.04 Creating RDDs
        00:36
      • 3.05 Creating RDDs-Referencing an External Dataset
        00:18
      • 3.06 Referencing an External Dataset-Text Files
        00:51
      • 3.07 Referencing an External Dataset-Text Files (contd.)
        00:49
      • 3.08 Referencing an External Dataset-Sequence Files
        00:32
      • 3.09 Referencing an External Dataset-Other Hadoop Input Formats
        00:46
      • 3.10 Creating RDDs-Important Points
        01:08
      • 3.11 RDD Operations
        00:37
      • 3.12 RDD Operations-Transformations
        00:47
      • 3.13 Features of RDD Persistence
        00:57
      • 3.14 Storage Levels of RDD Persistence
        00:19
      • 3.15 Invoking the Spark Shell
        00:22
      • 3.16 Importing Spark Classes
        00:14
      • 3.17 Creating the SparkContext
        00:25
      • 3.18 Loading a File in Shell
        00:10
      • 3.19 Performing Some Basic Operations on Files in Spark Shell RDDs
        00:20
      • 3.20 Packaging a Spark Project with SBT
        00:50
      • 3.21 Running a Spark Project with SBT
        00:31
      • 3.22 Demo-Build a Scala Project
        00:06
      • 3.23 Build a Scala Project
        06:50
      • 3.24 Demo-Build a Spark Java Project
        00:07
      • 3.25 Build a Spark Java Project
        04:31
      • 3.26 Shared Variables-Broadcast
        01:20
      • 3.27 Shared Variables-Accumulators
        00:51
      • 3.28 Writing a Scala Application
        00:20
      • 3.29 Demo-Run a Scala Application
        00:07
      • 3.30 Run a Scala Application
        01:43
      • 3.31 Demo-Write a Scala Application Reading the Hadoop Data
        00:07
      • 3.32 Write a Scala Application Reading the Hadoop Data
        01:22
      • 3.33 Demo-Run a Scala Application Reading the Hadoop Data
        00:07
      • 3.34 Run a Scala Application Reading the Hadoop Data
        02:21
      • 3.35 DoubleRDD Methods
        00:08
      • 3.36 PairRDD Methods-Join
        00:46
      • 3.37 PairRDD Methods-Others
        00:06
      • 3.38 Java PairRDD Methods
        00:09
      • 3.39 Java PairRDD Methods (contd.)
        00:06
      • 3.40 General RDD Methods
        00:05
      • 3.41 General RDD Methods (contd.)
        00:05
      • 3.42 Java RDD Methods
        00:07
      • 3.43 Java RDD Methods (contd.)
        00:06
      • 3.44 Common Java RDD Methods
        00:09
      • 3.45 Spark Java Function Classes
        00:12
      • 3.46 Method for Combining JavaPairRDD Functions
        00:41
      • 3.47 Transformations in RDD
        00:33
      • 3.48 Other Methods
        00:07
      • 3.49 Actions in RDD
        00:08
      • 3.50 Key-Value Pair RDD in Scala
        00:31
      • 3.51 Key-Value Pair RDD in Java
        00:43
      • 3.52 Using MapReduce and Pair RDD Operations
        00:24
      • 3.53 Reading Text File from HDFS
        00:16
      • 3.54 Reading Sequence File from HDFS
        00:21
      • 3.55 Writing Text Data to HDFS
        00:18
      • 3.56 Writing Sequence File to HDFS
        00:12
      • 3.57 Using GroupBy
        00:07
      • 3.58 Using GroupBy (contd.)
        00:05
      • 3.59 Demo-Run a Scala Application Performing GroupBy Operation
        00:07
      • 3.60 Run a Scala Application Performing GroupBy Operation
        03:12
      • 3.61 Demo-Run a Scala Application Using the Scala Shell
        00:06
      • 3.62 Run a Scala Application Using the Scala Shell
        04:02
      • 3.63 Demo-Write and Run a Java Application
        00:06
      • 3.64 Write and Run a Java Application
        01:48
      • 3.65 Summary
        00:53
      • 3.66 Summary (contd.)
        00:59
      • 3.67 Conclusion
        00:15
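
A minimal sketch in the spirit of this lesson's RDD demos: creating an RDD from an external text file, chaining transformations into a pair RDD, persisting it, and running actions. The HDFS paths are hypothetical placeholders.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RddWordCount {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("RddWordCount").setMaster("local[*]"))

    // Create an RDD by referencing an external dataset (path is hypothetical)
    val lines = sc.textFile("hdfs:///data/input.txt")

    // Transformations build a pair RDD; nothing executes until an action runs
    val counts = lines
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    // Persist the RDD since two actions below reuse it
    counts.cache()
    println(s"Distinct words: ${counts.count()}")
    counts.saveAsTextFile("hdfs:///data/word-counts")

    sc.stop()
  }
}
```
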
    • Lesson 04 - Running SQL Queries Using Spark SQL

      39:25
      • 4.01 Introduction
        00:10
      • 4.02 Objectives
        00:17
      • 4.03 Importance of Spark SQL
        01:01
      • 4.04 Benefits of Spark SQL
        00:47
      • 4.05 DataFrames
        00:50
      • 4.06 SQLContext
        00:50
      • 4.07 SQLContext (contd.)
        01:12
      • 4.08 Creating a DataFrame
        00:10
      • 4.09 Using DataFrame Operations
        00:21
      • 4.10 Using DataFrame Operations (contd.)
        00:05
      • 4.11 Demo-Run Spark SQL with a DataFrame
        00:06
      • 4.12 Run Spark SQL with a DataFrame
        08:52
      • 4.13 Using the Reflection-Based Approach
        00:38
      • 4.14 Using the Reflection-Based Approach (contd.)
        00:08
      • 4.15 Using the Programmatic Approach
        00:44
      • 4.16 Using the Programmatic Approach (contd.)
        00:06
      • 4.17 Demo-Run Spark SQL Programmatically
        00:08
      • 4.18 Run Spark SQL Programmatically
        09:20
      • 4.19 Save Modes
        00:31
      • 4.20 Saving to Persistent Tables
        00:45
      • 4.21 Parquet Files
        00:18
      • 4.22 Partition Discovery
        00:37
      • 4.23 Schema Merging
        00:28
      • 4.24 JSON Data
        00:34
      • 4.25 Hive Table
        00:45
      • 4.26 DML Operation-Hive Queries
        00:27
      • 4.27 Demo-Run Hive Queries Using Spark SQL
        00:06
      • 4.28 Run Hive Queries Using Spark SQL
        04:58
      • 4.29 JDBC to Other Databases
        00:49
      • 4.30 Supported Hive Features
        00:38
      • 4.31 Supported Hive Features (contd.)
        00:22
      • 4.32 Supported Hive Data Types
        00:13
      • 4.33 Case Classes
        00:14
      • 4.34 Case Classes (contd.)
        00:07
      • 4.35 Summary
        00:48
      • 4.36 Summary (contd.)
        00:48
      • 4.37 Conclusion
        00:12
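
For orientation, a short sketch of the SQLContext-era DataFrame workflow this lesson outlines: loading JSON into a DataFrame, applying DataFrame operations, and running SQL over a registered temporary table. The file path and table name are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object SparkSqlExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("SparkSqlExample").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)

    // Build a DataFrame from a JSON file (path is hypothetical)
    val people = sqlContext.read.json("hdfs:///data/people.json")
    people.printSchema()

    // DataFrame operations...
    people.filter(people("age") > 21).select("name", "age").show()

    // ...or plain SQL over a registered temporary table
    people.registerTempTable("people")
    sqlContext.sql("SELECT name FROM people WHERE age > 21").show()

    sc.stop()
  }
}
```
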
    • Lesson 05 - Spark Streaming

      34:49
      • 5.01 Introduction
        00:10
      • 5.02 Objectives
        00:14
      • 5.03 Introduction to Spark Streaming
        00:49
      • 5.04 Working of Spark Streaming
        00:19
      • 5.05 Streaming Word Count
        01:34
      • 5.06 Micro Batch
        00:19
      • 5.07 DStreams
        00:34
      • 5.08 DStreams (contd.)
        00:38
      • 5.09 Input DStreams and Receivers
        01:19
      • 5.10 Input DStreams and Receivers (contd.)
        00:54
      • 5.11 Basic Sources
        01:14
      • 5.12 Advanced Sources
        00:49
      • 5.13 Transformations on DStreams
        00:15
      • 5.14 Transformations on DStreams (contd.)
        00:06
      • 5.15 Output Operations on DStreams
        00:29
      • 5.16 Design Patterns for Using ForeachRDD
        01:14
      • 5.17 DataFrame and SQL Operations
        00:25
      • 5.18 DataFrame and SQL Operations (contd.)
        00:20
      • 5.19 Checkpointing
        01:25
      • 5.20 Enabling Checkpointing
        00:39
      • 5.21 Socket Stream
        00:59
      • 5.22 File Stream
        00:11
      • 5.23 Stateful Operations
        00:28
      • 5.24 Window Operations
        01:22
      • 5.25 Types of Window Operations
        00:11
      • 5.26 Types of Window Operations (contd.)
        00:05
      • 5.27 Join Operations-Stream-Dataset Joins
        00:20
      • 5.28 Join Operations-Stream-Stream Joins
        00:33
      • 5.29 Monitoring Spark Streaming Application
        01:18
      • 5.30 Performance Tuning-High Level
        00:20
      • 5.31 Demo-Capture and Process the Netcat Data
        00:07
      • 5.32 Capture and Process the Netcat Data
        05:00
      • 5.33 Demo-Capture and Process the Flume Data
        00:07
      • 5.34 Capture and Process the Flume Data
        05:08
      • 5.35 Demo-Capture the Twitter Data
        00:06
      • 5.36 Capture the Twitter Data
        02:33
      • 5.37 Summary
        01:00
      • 5.38 Summary (contd.)
        01:04
      • 5.39 Conclusion
        00:11
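
A compact sketch of a streaming word count along the lines of the netcat demo: a StreamingContext with a small batch interval reads a socket stream (fed by `nc -lk 9999`) and applies a windowed reduce. Host, port, durations, and the checkpoint path are illustrative choices, not values prescribed by the course.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingWordCount {
  def main(args: Array[String]): Unit = {
    // At least two threads: one for the receiver, one for processing
    val conf = new SparkConf().setAppName("StreamingWordCount").setMaster("local[2]")
    val ssc  = new StreamingContext(conf, Seconds(5))

    // Checkpoint directory, as the lesson's checkpointing topic describes (illustrative path)
    ssc.checkpoint("checkpoint-dir")

    // Socket stream fed by `nc -lk 9999` on the same machine
    val lines = ssc.socketTextStream("localhost", 9999)

    // Word counts over a 30-second window, sliding every 10 seconds
    val counts = lines
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKeyAndWindow(_ + _, Seconds(30), Seconds(10))

    counts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```
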
    • Lesson 06 - Spark ML Programming

      39:49
      • 6.01 Introduction
        00:10
      • 6.02 Objectives
        00:19
      • 6.03 Introduction to Machine Learning
        01:35
      • 6.04 Applications of Machine Learning
        00:21
      • 6.05 Machine Learning in Spark
        00:33
      • 6.06 DataFrames
        00:32
      • 6.07 Transformers and Estimators
        00:59
      • 6.08 Pipeline
        00:48
      • 6.09 Working of a Pipeline
        01:41
      • 6.10 Working of a Pipeline (contd.)
        00:44
      • 6.11 DAG Pipelines
        00:33
      • 6.12 Runtime Checking
        00:20
      • 6.13 Parameter Passing
        00:59
      • 6.14 General Machine Learning Pipeline-Example
        00:05
      • 6.15 Model Selection via Cross-Validation
        01:15
      • 6.16 Supported Types, Algorithms, and Utilities
        00:30
      • 6.17 Data Types
        01:25
      • 6.18 Feature Extraction and Basic Statistics
        00:42
      • 6.19 Clustering
        00:37
      • 6.20 K-Means
        00:55
      • 6.21 K-Means (contd.)
        00:05
      • 6.22 Demo-Perform Clustering Using K-Means
        00:07
      • 6.23 Perform Clustering Using K-Means
        04:41
      • 6.24 Gaussian Mixture
        00:57
      • 6.25 Power Iteration Clustering (PIC)
        01:16
      • 6.26 Latent Dirichlet Allocation (LDA)
        00:34
      • 6.27 Latent Dirichlet Allocation (LDA) (contd.)
        01:45
      • 6.28 Collaborative Filtering
        01:13
      • 6.29 Classification
        00:16
      • 6.30 Classification (contd.)
        00:06
      • 6.31 Regression
        00:41
      • 6.32 Example of Regression
        00:56
      • 6.33 Demo-Perform Classification Using Linear Regression
        00:08
      • 6.34 Perform Classification Using Linear Regression
        02:00
      • 6.35 Demo-Run Linear Regression
        00:06
      • 6.36 Run Linear Regression
        02:14
      • 6.37 Demo-Perform Recommendation Using Collaborative Filtering
        00:05
      • 6.38 Perform Recommendation Using Collaborative Filtering
        02:23
      • 6.39 Demo-Run Recommendation System
        00:06
      • 6.40 Run Recommendation System
        02:45
      • 6.41 Summary
        01:14
      • 6.42 Summary (contd.)
        00:57
      • 6.43 Conclusion
        00:11
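
As a hedged illustration of the clustering material, here is a small MLlib K-Means sketch: training on a tiny in-memory set of dense vectors and reporting the cluster centers and cost. The data, k, and iteration count are arbitrary illustrative values.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

object KMeansSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("KMeansSketch").setMaster("local[*]"))

    // Tiny in-memory dataset of 2-dimensional points (purely illustrative)
    val points = sc.parallelize(Seq(
      Vectors.dense(0.0, 0.0), Vectors.dense(0.1, 0.1),
      Vectors.dense(9.0, 9.0), Vectors.dense(9.1, 9.2)
    )).cache()

    // Train a model with k = 2 clusters and 20 iterations
    val model = KMeans.train(points, 2, 20)
    model.clusterCenters.foreach(center => println(s"Cluster center: $center"))

    // Within Set Sum of Squared Errors, a common measure for choosing k
    println(s"WSSSE = ${model.computeCost(points)}")

    sc.stop()
  }
}
```
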
    • Lesson 07 - Spark GraphX Programming

      28:04
      • 7.01 Introduction
        00:12
      • 7.02 Objectives
        00:17
      • 7.03 Introduction to Graph-Parallel System
        01:13
      • 7.04 Limitations of Graph-Parallel System
        00:49
      • 7.05 Introduction to GraphX
        01:21
      • 7.06 Introduction to GraphX (contd.)
        00:06
      • 7.07 Importing GraphX
        00:10
      • 7.08 The Property Graph
        01:25
      • 7.09 The Property Graph (contd.)
        00:06
      • 7.10 Creating a Graph
        00:13
      • 7.11 Demo-Create a Graph Using GraphX
        00:07
      • 7.12 Create a Graph Using GraphX
        10:08
      • 7.13 Triplet View
        00:30
      • 7.14 Graph Operators
        00:50
      • 7.15 List of Operators
        00:23
      • 7.16 List of Operators (contd.)
        00:05
      • 7.17 Property Operators
        00:18
      • 7.18 Structural Operators
        01:02
      • 7.19 Subgraphs
        00:21
      • 7.20 Join Operators
        01:09
      • 7.21 Demo-Perform Graph Operations Using GraphX
        00:07
      • 7.22 Perform Graph Operations Using GraphX
        05:46
      • 7.23 Demo-Perform Subgraph Operations
        00:07
      • 7.27 Demo-Perform MapReduce Operations
        00:08
      • 7.29 Counting Degree of Vertex
        00:32
      • 7.30 Collecting Neighbors
        00:28
      • 7.36 Conclusion
        00:11
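
To ground the GraphX topics above, a brief sketch that builds a small property graph from vertex and edge RDDs, walks its triplet view, and counts vertex degrees. All vertex names, roles, and relationship labels are made up for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}

object GraphxSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("GraphxSketch").setMaster("local[*]"))

    // Vertices carry a (name, role) property; edges carry a relationship label
    val vertices = sc.parallelize(Seq[(VertexId, (String, String))](
      (1L, ("alice", "student")),
      (2L, ("bob",   "professor")),
      (3L, ("carol", "student"))
    ))
    val edges = sc.parallelize(Seq(
      Edge(1L, 2L, "advisor"),
      Edge(3L, 2L, "advisor"),
      Edge(1L, 3L, "collaborator")
    ))

    val graph = Graph(vertices, edges)

    // Triplet view: source and destination properties alongside the edge attribute
    graph.triplets.collect().foreach { t =>
      println(s"${t.srcAttr._1} -[${t.attr}]-> ${t.dstAttr._1}")
    }

    // Degree of each vertex
    graph.degrees.collect().foreach { case (id, deg) =>
      println(s"vertex $id has degree $deg")
    }

    sc.stop()
  }
}
```
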

Why Online Bootcamp

  • Develop skills for real career growth: a cutting-edge curriculum designed with guidance from industry and academia to develop job-ready skills
  • Learn from experts active in their field, not out-of-touch trainers: leading practitioners who bring current best practices and case studies to sessions that fit into your work schedule
  • Learn by working on real-world problems: capstone projects involving real-world data sets, with virtual labs for hands-on learning
  • Structured guidance ensuring learning never stops: 24x7 learning support from mentors and a community of like-minded peers to resolve any conceptual doubts