Many data scientists develop, train, and deploy machine learning models in hosted environments, but traditionally they lacked an easy way to scale resources up or down on demand. AWS SageMaker solves this issue by enabling developers to build and train models and get them to production faster, with minimal effort and lower cost.
In this article on AWS SageMaker, you will get an in-depth understanding of what SageMaker is, its benefits, how machine learning works with it, how to train and validate a model, and how to create a notebook instance.
And before jumping into SageMaker, here’s a primer on “What is AWS”.
Amazon Web Services (AWS) is an on-demand cloud platform offered by Amazon that provides services over the internet. AWS services can be used to build, monitor, and deploy any type of application in the cloud. Here's where AWS SageMaker comes into play.
Amazon SageMaker is a cloud-based machine-learning platform that helps users create, design, train, tune, and deploy machine-learning models in a production-ready hosted environment. SageMaker comes with a pool of advantages, covered in the next section.
Some of the advantages of SageMaker are listed below:
Now, let’s have a look at the concept of Machine Learning With AWS SageMaker and understand how to build, test, tune, and deploy a model.
The following diagram shows how machine learning works with AWS SageMaker.
Note: S3 is used for storing and recovering data over the internet.
Note: ECR helps a user to save, monitor, and deploy Docker containers.
Note: If you want predictions for a limited amount of data at a time, use Amazon SageMaker hosting services; if you want predictions for an entire dataset, use Amazon SageMaker batch transform.
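As an illustration of the batch-transform path, here is a minimal sketch of the request parameters that boto3's create_transform_job call expects, expressed as a plain Python dictionary. The job name, model name, bucket URIs, and instance type are all hypothetical placeholders.

```python
# Hedged sketch: build the request dictionary you would pass to
# boto3.client("sagemaker").create_transform_job(**request).
# All names (model, buckets) are hypothetical placeholders.
def build_transform_job_request(model_name, input_s3, output_s3):
    return {
        "TransformJobName": model_name + "-batch",
        "ModelName": model_name,
        "TransformInput": {
            "DataSource": {"S3DataSource": {"S3DataType": "S3Prefix",
                                            "S3Uri": input_s3}},
            "ContentType": "text/csv",
        },
        "TransformOutput": {"S3OutputPath": output_s3},
        "TransformResources": {"InstanceType": "ml.m5.large",
                               "InstanceCount": 1},
    }

request = build_transform_job_request("xgboost-dm",
                                      "s3://my-bucket/input/",
                                      "s3://my-bucket/output/")
print(request["TransformJobName"])  # xgboost-dm-batch
```

Because batch transform reads its input from S3 and writes scored results back to S3, it fits the "entire dataset" case, while a hosted endpoint fits the low-latency, record-at-a-time case.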
Model training in SageMaker is done on machine learning compute instances.
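To make that concrete, here is a hedged sketch of the core parameters of such a training job, expressed as the dictionary that boto3's create_training_job call expects. The job name, role ARN, and bucket URIs are hypothetical placeholders; the training image is the us-east-1 XGBoost container referenced in the code later in this article.

```python
# Hedged sketch: parameters for a SageMaker training job, to be passed as
# boto3.client("sagemaker").create_training_job(**training_request).
# Role ARN and S3 paths are placeholders.
training_request = {
    "TrainingJobName": "demo-xgboost-training",
    "AlgorithmSpecification": {
        "TrainingImage": "811284229777.dkr.ecr.us-east-1.amazonaws.com/xgboost:latest",
        "TrainingInputMode": "File",
    },
    "RoleArn": "arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder
    "InputDataConfig": [{
        "ChannelName": "train",
        "DataSource": {"S3DataSource": {"S3DataType": "S3Prefix",
                                        "S3Uri": "s3://my-bucket/train/"}},
        "ContentType": "text/csv",
    }],
    "OutputDataConfig": {"S3OutputPath": "s3://my-bucket/output/"},
    # ResourceConfig is where the ML compute instances are requested:
    "ResourceConfig": {"InstanceType": "ml.m5.large",
                       "InstanceCount": 1,
                       "VolumeSizeInGB": 10},
    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
}
print(training_request["ResourceConfig"]["InstanceType"])
```

The ResourceConfig section is the key point here: training runs on dedicated ML compute instances that SageMaker provisions for the job and tears down when it finishes.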
Note: Container images are ready-to-run, packaged applications.
You can evaluate your model using offline or historical data:
Use historical data to send requests to the model through a Jupyter notebook in Amazon SageMaker for evaluation.
Here, multiple models are deployed to an Amazon SageMaker endpoint, and a portion of live traffic is directed to each model for validation.
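One way to set this up is with an endpoint configuration that defines several production variants and splits live traffic between them by weight. Below is a minimal sketch of such a configuration as the dictionary boto3's create_endpoint_config call expects; the variant names, model names, and weights are hypothetical.

```python
# Hedged sketch: an endpoint configuration that splits live traffic between
# two model variants for validation, to be passed as
# boto3.client("sagemaker").create_endpoint_config(**endpoint_config).
endpoint_config = {
    "EndpointConfigName": "ab-test-config",
    "ProductionVariants": [
        {"VariantName": "model-a", "ModelName": "model-a",
         "InstanceType": "ml.m5.large", "InitialInstanceCount": 1,
         "InitialVariantWeight": 0.9},   # 90% of live traffic
        {"VariantName": "model-b", "ModelName": "model-b",
         "InstanceType": "ml.m5.large", "InitialInstanceCount": 1,
         "InitialVariantWeight": 0.1},   # 10% of live traffic
    ],
}

# Traffic is divided in proportion to the variant weights.
total = sum(v["InitialVariantWeight"]
            for v in endpoint_config["ProductionVariants"])
print(total)
```

Shifting more traffic to the candidate model is then just a matter of updating the variant weights, which makes this a common pattern for validating a new model against live requests.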
Here, a part of the data is set aside, called the “holdout set”. The model is then trained on the remaining input data and evaluated on the holdout set to check how well it generalizes beyond what it learned.
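The holdout idea can be sketched in a few lines of plain Python, using a toy dataset of 10 samples and a hypothetical 20% holdout fraction:

```python
import random

# A minimal sketch of a holdout split on a toy dataset of 10 samples.
data = list(range(10))
random.seed(42)        # fixed seed so the shuffle is reproducible
random.shuffle(data)

holdout_size = int(len(data) * 0.2)   # set aside 20% as the holdout set
holdout_set = data[:holdout_size]     # used only for evaluation
training_set = data[holdout_size:]    # used to train the model

print(len(training_set), len(holdout_set))  # 8 2
```

The model never sees the holdout samples during training, so its score on them estimates how it will behave on new data.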
Here, the input data is split into k parts. One part is used as the validation data for testing the model, and the remaining k − 1 parts are used as training data. The process is repeated k times so that each part serves once as the validation set, and the results across rounds are combined to evaluate the model.
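The k-fold idea can be sketched in plain Python, assuming a toy dataset of 10 samples and k = 5:

```python
# A minimal sketch of k-fold cross-validation: split the data into k folds,
# then use each fold once for validation while the other k-1 folds train.
data = list(range(10))
k = 5
fold_size = len(data) // k
folds = [data[i * fold_size:(i + 1) * fold_size] for i in range(k)]

for i, validation in enumerate(folds):
    training = [x for j, fold in enumerate(folds) if j != i for x in fold]
    print(f"round {i}: train {len(training)} samples, "
          f"validate {len(validation)} samples")
```

Every sample is validated exactly once and trained on k − 1 times, which gives a more stable estimate of model quality than a single holdout split, at the cost of training k models.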
Let’s consider the example of ProQuest.
ProQuest is a global information-content and technology company that provides valuable content such as eBooks, newspapers, etc. to the users.
ProQuest used AWS SageMaker to create a content recommendation system. With the help of SageMaker, ProQuest was able to deliver a better video viewing experience and provide more relevant search results to its users.
Let us create a SageMaker notebook instance:
Within a few minutes, SageMaker creates a Machine Learning Notebook instance and attaches a storage volume.
Note: This notebook instance has a preconfigured Jupyter notebook server and predefined libraries.
# Import libraries
import boto3, re, sys, math, json, os, sagemaker, urllib.request
from sagemaker import get_execution_role
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from IPython.display import Image
from IPython.display import display
from time import gmtime, strftime
from sagemaker.predictor import csv_serializer

# Define IAM role
role = get_execution_role()
prefix = 'sagemaker/DEMO-xgboost-dm'
containers = {'us-west-2': '433757028032.dkr.ecr.us-west-2.amazonaws.com/xgboost:latest',
              'us-east-1': '811284229777.dkr.ecr.us-east-1.amazonaws.com/xgboost:latest',
              'us-east-2': '825641698319.dkr.ecr.us-east-2.amazonaws.com/xgboost:latest',
              'eu-west-1': '685385470294.dkr.ecr.eu-west-1.amazonaws.com/xgboost:latest'}  # each region has its own XGBoost container
my_region = boto3.session.Session().region_name  # set the region of the instance
print("Success - the MySageMakerInstance is in the " + my_region + " region. You will use the " + containers[my_region] + " container for your SageMaker endpoint.")

bucket_name = 'dummydemo'  # <--- CHANGE THIS VARIABLE TO A UNIQUE NAME FOR YOUR BUCKET
s3 = boto3.resource('s3')
try:
    if my_region == 'us-east-1':
        s3.create_bucket(Bucket=bucket_name)
    else:
        s3.create_bucket(Bucket=bucket_name, CreateBucketConfiguration={'LocationConstraint': my_region})
    print('S3 bucket created successfully')
except Exception as e:
    print('S3 error: ', e)
With this, we reach the end of this article about the AWS SageMaker.
Are you all clear about AWS SageMaker and its benefits, how machine learning works with SageMaker, the different ways to train a model, how to validate a model with SageMaker, and which companies use it?
Whether you’re an experienced AWS Architect or aspiring to break into this exciting industry, enrolling in our Cloud Architect Master’s Program will help professionals at all levels of experience master AWS Cloud Architect techniques and strategies.
Do you have any questions? Please feel free to leave them in the comments section of this article; our experts will get back to you as soon as possible.
Sana Afreen is a Senior Research Analyst at Simplilearn and works on several of the latest technologies. She holds a degree in B.Tech Computer Science and has also achieved certification in Advanced SEO. Sana likes to explore new places for their cultures, traditions, and cuisines.