All You Need to Know About Amazon S3
Amazon S3 is a cloud storage service offered by Amazon Web Services (AWS). In this article and video, we will learn what Amazon S3 is and the benefits of using it.
Introduction to Amazon S3
One of the biggest challenges during my IT career has been managing the storage requirements for the applications I supported. Storage space was always at a premium, and maintaining a sufficient number of backups was a logistical nightmare. This lesson shows you how Amazon S3 changes all of that. It offers practically unlimited, cost-effective storage that is available on demand, with unprecedented levels of durability and availability. Not only that, Amazon S3 also provides an easy way to host static websites and distribute your content around the world to your end-users with low latency.
What is Amazon Simple Storage Service?
Amazon Simple Storage Service (or S3 for short) provides developers and IT teams with secure, durable and highly scalable cloud storage. Basically, it is file storage in the cloud.
Benefits of Amazon S3
High Durability: Amazon S3 is extremely durable, designed for 11 9s (99.999999999%) of durability. Your data is stored redundantly across multiple facilities and on multiple devices within each facility. We touched on 11 9s durability in an earlier lesson; in practical terms, it means that if you store 10 million objects, you can on average expect to lose a single object once every 10,000 years.
High Availability: It is also highly available. Amazon S3 is designed for 99.99% availability. You can also choose the AWS region in which to store your data, so you can optimize latency, minimize your storage costs and address regulatory compliance. So if you have sovereign data that needs to stay in a specific country, you can choose an AWS region that satisfies that need.
Cost Efficient: It is also very cost efficient. You can store huge amounts of data at a very low cost, you only pay for what you use, and you are charged per gigabyte-month of usage. A variety of storage classes is available, meaning you can categorize your data and pay only for the level of service you need.
Secure: It is also very secure. Amazon S3 supports SSL/TLS for data in transit and encryption of your data at rest once it has been uploaded. You can also control access to your data using IAM and specify object permissions using S3 bucket policies.
Scalable: Amazon S3 is also highly scalable. It allows you to store as much or as little data as you want. The storage is elastic, so you can scale up and down as required and only pay for what you are using. You can also configure notifications to be sent when objects are uploaded to Amazon S3, via SQS, SNS or even Lambda. This way you can trigger workflows for your files.
High Performance: Amazon S3 is also highly performant. You can use multipart uploads to maximize network throughput and resilience. Amazon S3 Transfer Acceleration uses edge locations to reduce upload and download times. Amazon S3 is also fully integrated with many AWS products such as CloudFront, CloudWatch, RDS, EBS and Lambda.
Easy to Use: It is also really easy to use and has multiple connectivity options. You can use the web console, the AWS Command Line Interface (CLI), the mobile app, the REST APIs or the AWS SDKs.
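The event notifications mentioned under Scalable are configured with a small document attached to the bucket. Below is a minimal sketch shaped the way boto3's put_bucket_notification_configuration call expects it; the Lambda function ARN is a hypothetical placeholder, and actually applying this configuration also requires permission for S3 to invoke the function.

```python
# Sketch of an S3 event-notification configuration. The ARN below is a
# hypothetical placeholder, not a real function.

def image_upload_notification(lambda_arn):
    """Fire a Lambda function whenever a .jpg object is created."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {
                        "FilterRules": [{"Name": "suffix", "Value": ".jpg"}]
                    }
                },
            }
        ]
    }

config = image_upload_notification(
    "arn:aws:lambda:us-east-1:123456789012:function:make-thumbnail")
```

With boto3, this dictionary would be passed as the NotificationConfiguration argument of put_bucket_notification_configuration; similar configurations exist for SQS queues and SNS topics.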
Functions of Amazon S3
Backup and Archiving: Amazon S3 is ideal for this purpose. You can store a practically unlimited amount of data whenever you need to. Traditional IT infrastructure offers a finite storage capacity, so you would have to manage which backups and archives to retain, but with S3 that problem disappears. You can retain as many backups as you want for a low cost.
Accessibility: Amazon S3 is object-based storage, accessible via a web interface. You can store and retrieve data from anywhere on the web, so your users all around the world can access your files easily.
Content storage and distribution: Amazon S3 is great for this. You can move your entire content storage infrastructure into the cloud to minimize your cost, and you can distribute your content directly from S3 to end-users, or use S3 as the origin for Amazon CloudFront so content is delivered from edge locations, providing fast access to your information.
Big Data: It is also great for big data. Amazon S3 is designed to be used as a big data object store for things like photos, videos, financial data, etc. Using AWS products and services, you can also perform big data analytics.
Host Static Websites: You can also use S3 to host static websites. It allows you to host your entire static website for a low cost and it makes for a highly available hosting solution.
Disaster Recovery: Another key feature of Amazon S3 is disaster recovery, as it offers a robust solution here. All data stored in S3 is automatically replicated across multiple Availability Zones, and you also have the option to copy it to other regions using cross-region replication. You can add further recovery options by storing multiple versions of an object for point-in-time recovery.
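The versioning and cross-region replication options mentioned above boil down to two small configuration documents. Here is a minimal sketch, shaped for boto3's put_bucket_versioning and put_bucket_replication calls; the bucket name and IAM role ARN are hypothetical placeholders, and replication requires versioning to be enabled on both the source and destination buckets.

```python
# For put_bucket_versioning: keep multiple versions of every object,
# enabling point-in-time recovery.
versioning_config = {"Status": "Enabled"}

# For put_bucket_replication: copy objects to a bucket in another
# region. Role ARN and destination bucket are hypothetical examples.
replication_config = {
    "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
    "Rules": [
        {
            "Status": "Enabled",
            "Prefix": "",  # empty prefix = replicate every object
            "Destination": {"Bucket": "arn:aws:s3:::my-dr-bucket"},
        }
    ],
}
```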
All Amazon S3 data is stored in something called buckets.
What is a Bucket?
A bucket is basically a top-level container, a bit like a folder, that you can write objects to and read and delete objects from. You can store as many objects as you want in a bucket, but individual objects are limited to 5 terabytes in size, and the largest object that can be uploaded in a single PUT operation is 5 gigabytes. So, if you are uploading very large files, you need to break them into smaller chunks using multipart upload.
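To see how the 5-gigabyte single PUT limit plays out, here is a small sketch that plans how an object would be split into parts for a multipart upload. The 100 MB part size is just an illustrative choice, not an S3 requirement.

```python
import math

GB = 1024 ** 3

def plan_multipart_upload(object_size, part_size=100 * 1024 * 1024):
    """Return (number_of_parts, size_of_last_part) for a multipart upload."""
    if object_size <= 0:
        raise ValueError("object must be non-empty")
    parts = math.ceil(object_size / part_size)
    last = object_size - (parts - 1) * part_size
    return parts, last

# A 5.5 GB file cannot go up in a single PUT (over the 5 GB limit),
# but splits cleanly into 100 MB parts:
parts, last = plan_multipart_upload(int(5.5 * GB))
print(parts, last)  # → 57 parts, last part of 32 MB (33554432 bytes)
```

Each part is then uploaded independently, which is also what gives multipart uploads their throughput and resilience benefits: a failed part can be retried without restarting the whole transfer.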
Benefits of Buckets
Buckets allow you to apply granular security, so you can control who has create, delete and retrieve permissions for each of your buckets and the files inside them. You can also control who has access to bucket logs, which are used to store information about access to your files and objects, and you can also choose the region where a bucket is stored.
How do You Create a Bucket?
Well, the easiest way to create a bucket is using the web console or the command line interface. The bucket name you choose must be unique across all bucket names in Amazon S3. Amazon recommends that one way to help ensure uniqueness is to prefix your bucket names with the name of your organization.
Bucket Naming Rules
Bucket names must be at least 3 characters and no more than 63 characters long.
Bucket names must be a series of one or more labels, with adjacent labels separated by a single period.
Bucket names can contain lower case letters, numbers and hyphens.
Each label must start and end with a lower case letter or a number.
Here are some examples of valid bucket names: myawsbucket, my.aws.bucket and myawsbucket.1.
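The naming rules above are easy to check mechanically. Here is a small validator that encodes just the rules listed in this lesson; the real AWS rule set has a few extra restrictions (for example, names formatted like IP addresses are rejected), so treat this as a study aid rather than the official check.

```python
import re

# One label: lowercase letters, digits and hyphens, starting and ending
# with a letter or digit. A full name is one or more labels joined by
# single periods, 3 to 63 characters overall.
LABEL = r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"
BUCKET_RE = re.compile(rf"^{LABEL}(?:\.{LABEL})*$")

def is_valid_bucket_name(name):
    return 3 <= len(name) <= 63 and BUCKET_RE.fullmatch(name) is not None

# The valid examples from the text pass, obvious violations fail:
assert is_valid_bucket_name("myawsbucket")
assert is_valid_bucket_name("my.aws.bucket")
assert is_valid_bucket_name("myawsbucket.1")
assert not is_valid_bucket_name("MyBucket")   # upper-case letters
assert not is_valid_bucket_name("-bucket")    # starts with a hyphen
assert not is_valid_bucket_name("ab")         # shorter than 3 characters
```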
By default, you can create up to 100 buckets in each of your AWS accounts. If you need to increase this limit, you have to contact AWS Support. Bucket name ownership is not transferable; however, if a bucket is empty, you can delete it and the name will eventually become available for reuse. There is no limit to the number of objects that can be stored in a bucket, there is no difference in performance whether you use many buckets or just a couple, and you cannot create a bucket within another bucket.
Types of Storage Classes
Amazon S3 comes in a range of storage classes, for different data categories.
Amazon S3 Standard
Amazon S3 Standard - Infrequent Access
Amazon S3 Reduced Redundancy Storage
Amazon Glacier
Amazon S3 Standard: Standard is designed for high availability and durability, and is used to store frequently accessed data. It is designed for 11 9s durability and 99.99% availability, and it is perfect for low-latency, high-throughput environments. Amazon S3 Standard is typically used for things like dynamic websites, cloud applications, mobile applications or just regular file storage. It is great for letting your business or your end-users upload photos and videos, or access files and objects in real time.
Amazon S3 Standard - Infrequent Access: This class is designed for objects that are accessed less frequently but need to be retrieved rapidly when they are required. Standard - Infrequent Access offers the same durability, throughput and low latency as Standard, at a lower cost per gigabyte, but with a per-gigabyte retrieval fee. It is typically used for data that you do not need very often but, when you do need it, you need it very quickly. Examples would be last month's reporting data or database backups taken earlier this month.
Amazon S3 Reduced Redundancy Storage: This storage option enables you to reduce your cost by storing non-critical, reproducible data at lower levels of redundancy than Amazon S3 Standard. It is designed for objects where you can tolerate lower durability and availability because they can easily be reproduced. Reduced Redundancy Storage provides a cost-effective solution for distributing or sharing data that is durably stored elsewhere, and it is great for storing thumbnails, transcoded media or other processed data. For example, if users upload photos to your application and you generate a thumbnail for each one, you can regenerate any thumbnail whenever you want, so this class is perfect for storing the thumbnails.
Amazon Glacier: This class is used to archive data that you rarely access and for which you can tolerate a retrieval time of several hours. It still offers 11 9s of durability and has the lowest cost of any Amazon storage class. It also has a secure Vault Lock feature, so you can really lock down your old archives and keep them for when you need them. Typical use cases are storing database backups from months or even years ago, or compliance data that you think you just might need several years down the line. Amazon Glacier would have been great in some of my previous jobs, where we had limited onsite storage capacity and could only retain a certain number of database backups. So, when someone came along and said, "Hey, we urgently need a database backup from 5 years ago", there was absolutely no way I could have provided it to them. Glacier would definitely have been the answer.
The table below gives a quick overview of the different storage classes and the differences between them.

Storage class                  Durability      Availability     Retrieval latency
Standard                       99.999999999%   99.99%           Milliseconds
Standard - Infrequent Access   99.999999999%   99.9%            Milliseconds
Reduced Redundancy Storage     99.99%          99.99%           Milliseconds
Glacier                        99.999999999%   Not applicable   4 to 5 hours
Things to Remember
Now, things that may come up in the exam are the durabilities and the availabilities, so it is worth remembering the percentages for each of the classes. Note the latency times as well: Standard, Standard - IA and RRS respond in milliseconds, whereas Glacier retrievals take 4 to 5 hours.
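As a memory aid for the classes above, the decision can be boiled down to a toy helper function. This only encodes the rough guidance from this lesson, not real-world pricing or lifecycle considerations, so treat it as a study sketch.

```python
def suggest_storage_class(accessed_often, easily_reproducible,
                          can_wait_hours_for_retrieval):
    """Map the access pattern of some data to a storage class,
    following the rough guidance in this lesson."""
    if can_wait_hours_for_retrieval:
        return "Glacier"                 # archives, compliance data
    if easily_reproducible:
        return "Reduced Redundancy"      # thumbnails, transcoded media
    if accessed_often:
        return "Standard"                # dynamic sites, active files
    return "Standard - Infrequent Access"  # last month's reports, backups

assert suggest_storage_class(True, False, False) == "Standard"
assert suggest_storage_class(False, False, True) == "Glacier"
```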
You can easily host static websites in S3 buckets. To do this, you enable static website hosting on your bucket and then upload your website files into it. The site is then accessible at a URL of the form http://<bucket-name>.s3-website-<region>.amazonaws.com.
This convention is worth remembering because it may well come up in the exam. For example, our simplilearn bucket sitting in the us-east-1 region would have a website URL of http://simplilearn.s3-website-us-east-1.amazonaws.com.
You can also point your own domains at your S3 buckets.
You can also provide URL access to the individual objects in your bucket. If you have enabled website hosting, the URL http://examplebucket.s3-website-us-east-1.amazonaws.com/photo.jpg would return the photo.jpg object; it is effectively the same web address we had earlier with /photo.jpg appended. But you can also provide URL access without enabling static website hosting, as long as you set up the permissions appropriately. For example, https://s3.amazonaws.com/simplilearn/health_check.html requests the health_check.html object stored in our simplilearn bucket: https://s3.amazonaws.com, followed by the name of the bucket, then the name of the file.
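The two URL conventions above can be captured in a couple of small helper functions. This is just a sketch of the patterns described in this lesson: website endpoints only resolve once static website hosting is enabled, and the path-style object URL only works if the object's permissions allow the caller to read it.

```python
def website_endpoint(bucket, region):
    """Build the static-website URL for a bucket in a given region."""
    return f"http://{bucket}.s3-website-{region}.amazonaws.com"

def object_url(bucket, key):
    """Build the path-style URL for an individual object."""
    return f"https://s3.amazonaws.com/{bucket}/{key}"

print(website_endpoint("simplilearn", "us-east-1"))
# → http://simplilearn.s3-website-us-east-1.amazonaws.com
print(object_url("simplilearn", "health_check.html"))
# → https://s3.amazonaws.com/simplilearn/health_check.html
```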
To view the Amazon S3 bucket demonstration, where we create a new bucket, please see the YouTube video starting at 13:20.
Hope you liked this detailed article and video on Amazon S3!
Now that you know all about Amazon S3, take the AWS DevOps Architect master course and become a certified AWS professional.