Introduction to Amazon S3: How Object Storage in the Cloud Works


Amazon Simple Storage Service (S3) is a popular cloud storage service that is part of Amazon Web Services (AWS). Amazon S3 cloud storage provides high reliability, flexibility, scalability, and accessibility. The number of objects and the amount of data stored in Amazon S3 are unlimited. S3 cloud storage is attractive for business because you pay only for what you use. However, the terminology and methodology may lead to misunderstandings and difficulties for new Amazon S3 users. Where is S3 data stored? How does Amazon S3 storage work? This blog post explains the main concepts and working principles of Amazon S3 cloud storage.

Whatever platform or service you use for data storage, it is recommended that you perform backups regularly. Don't forget to perform AWS EC2 backup. Backup to Amazon S3 can be done for all types of data, including virtual machines, EC2 instances, databases, and individual files. Download NAKIVO Backup & Replication and try backing up your data to Amazon S3 cloud storage.

About Amazon S3 Storage

Amazon S3 was the first cloud service from AWS and was launched in 2006. Since then, the popularity of this storage service has been growing. Now Amazon provides a wide list of other cloud services, but Amazon S3 cloud storage is the most widely used one. In addition to Amazon S3 storage, AWS offers Amazon EBS volumes for EC2 and Amazon Drive. But the three services have different uses and purposes.

EBS (Elastic Block Store) volumes for EC2 (Elastic Compute Cloud) instances are virtual disks for virtual machines residing in the Amazon cloud. As you can tell from the EBS name, this is block storage in the cloud, the analog of hard disk drives in physical computers. An operating system can be installed on an EBS volume attached to an EC2 instance.

Amazon Drive (formerly known as Amazon Cloud Drive) is the analog of Google Drive and Microsoft OneDrive. Amazon Drive has a smaller range of features than Amazon S3. Amazon Drive is positioned as a cloud storage service to back up photos and other user data.

Amazon S3 cloud storage is an object-based storage service. You cannot install an operating system when you use Amazon S3 storage because data cannot be accessed at the block level, as is required by an operating system. If you need to mount Amazon S3 storage as a network drive in your operating system, use a file system in userspace (FUSE). Read the blog post about mounting S3 cloud storage on different operating systems. Google Cloud Storage is the analog of Amazon S3 cloud storage.

Amazon S3 Main Concepts

If you are going to use Amazon S3 for the first time, some concepts may seem unusual and unfamiliar to you. The methodology of storing data in the S3 cloud is different from storing data on traditional hard disk drives, solid state drives, or disk arrays. Below is an overview of the main concepts and technologies used to store and manage data in Amazon S3 cloud storage.

How does S3 store files?

As explained above, data in Amazon S3 is stored as objects. This approach provides highly scalable storage in the cloud. Objects can be located on different physical disk drives distributed across a datacenter. Special hardware, software, and distributed file systems are used in Amazon datacenters to provide high scalability. Redundancy and versioning are features implemented by using the object storage approach. When a file is stored in Amazon S3 as an object, it is stored in multiple places (such as on different disks, in different datacenters, or in different availability zones) simultaneously by default. The Amazon S3 service regularly checks data consistency by verifying checksums. If data corruption is detected, the object is recovered by using the redundant data. Objects are stored in Amazon S3 buckets. By default, objects in Amazon S3 storage can be accessed and managed via the web interface.
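The integrity check described above can be illustrated with a short sketch. For objects uploaded in a single part, S3 reports an ETag that is the MD5 hash of the object's content, so comparing a locally computed hash against the stored one detects corruption. The data below is made up, and the check is simulated locally rather than against the real service:

```python
import hashlib

def compute_etag(data: bytes) -> str:
    # For single-part uploads, the S3 ETag is the hex MD5 digest of the content
    return hashlib.md5(data).hexdigest()

original = b"Hello, Amazon S3!"
stored_etag = compute_etag(original)

# Later, verify that a retrieved copy has not been corrupted
retrieved = b"Hello, Amazon S3!"
print(compute_etag(retrieved) == stored_etag)   # True: data is consistent

corrupted = b"Hello, Amazon S3?"
print(compute_etag(corrupted) == stored_etag)   # False: corruption detected
```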

What is S3 object storage?

Object storage is a type of storage where data is stored as objects rather than blocks. This concept is useful for data backup, archiving, and scalability in high-load environments.

Objects are the fundamental entities of data storage in Amazon S3 buckets. There are three main components of an object: the content of the object (the data stored in the object, such as a file or directory), the unique object identifier (ID), and metadata. Metadata is stored as key-value pairs and contains information such as name, size, date, security attributes, content type, and URL. Each object has an access control list (ACL) to configure who is permitted to access the object. Amazon S3 object storage allows you to avoid network bottlenecks during rush hour, when traffic to your objects stored in S3 cloud storage increases significantly. Amazon provides flexible network bandwidth but charges for network access to the stored objects. Object storage is good when a high number of clients must access the data (high read frequency). Searching through metadata is faster in the object storage model. Read also about Amazon S3 encryption, which can help you protect data stored in Amazon S3 cloud storage and enhance security.
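The three components of an object can be sketched as a simple data structure. This is only an illustrative model, not how S3 represents objects internally; the field names are chosen for readability:

```python
from dataclasses import dataclass, field

@dataclass
class S3Object:
    object_id: str                                 # unique object identifier (the key)
    content: bytes                                 # the stored data itself
    metadata: dict = field(default_factory=dict)   # key-value pairs (name, size, type, ...)
    acl: dict = field(default_factory=dict)        # grantee -> permission

obj = S3Object(
    object_id="TextFiles/test1.txt",
    content=b"example data",
    metadata={"Content-Type": "text/plain", "size": "12"},
    acl={"owner": "FULL_CONTROL"},
)
print(obj.metadata["Content-Type"])  # text/plain
```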

Buckets

A bucket is the fundamental logical container where data is stored in Amazon S3 storage. You can store an infinite amount of data and an unlimited number of objects in a bucket. Each S3 object is stored in a bucket. There is a 5 TB limit on the size of one object stored in a bucket. Buckets organize the namespace at the highest level and are used for access control.

Keys

An object has a unique key after it has been uploaded to a bucket. This key is a string that imitates a hierarchy of directories. Knowing the key allows you to access the object in the bucket. A bucket, key, and version ID identify an object uniquely. For example, if the bucket name is blog-bucket01, the region where the datacenters storing your data are located is s3-eu-west-1, and the object name is test1.txt (a text file), the URL of the file stored as an object in the bucket is:

https://blog-bucket01.s3-eu-west-1.amazonaws.com/test1.txt

Permissions must be configured by editing object attributes if you want to share objects with other users. Similarly, you can create a TextFiles folder and store the text file in that folder:

https://blog-bucket01.s3-eu-west-1.amazonaws.com/TextFiles/test1.txt

There are two types of URL that can be used:

  • bucketname.s3.amazonaws.com/objectname
  • s3.amazonaws.com/bucketname/objectname
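The two URL styles above can be produced with a small helper. The bucket, region, and key names come from the earlier example; note that the exact endpoint format has varied over time, so treat this as a sketch rather than a definitive reference:

```python
def virtual_hosted_url(bucket: str, region: str, key: str) -> str:
    # Style 1: bucketname.<regional endpoint>.amazonaws.com/objectname
    return f"https://{bucket}.s3-{region}.amazonaws.com/{key}"

def path_style_url(bucket: str, region: str, key: str) -> str:
    # Style 2: <regional endpoint>.amazonaws.com/bucketname/objectname
    return f"https://s3-{region}.amazonaws.com/{bucket}/{key}"

print(virtual_hosted_url("blog-bucket01", "eu-west-1", "TextFiles/test1.txt"))
# https://blog-bucket01.s3-eu-west-1.amazonaws.com/TextFiles/test1.txt
print(path_style_url("blog-bucket01", "eu-west-1", "TextFiles/test1.txt"))
# https://s3-eu-west-1.amazonaws.com/blog-bucket01/TextFiles/test1.txt
```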

AWS Regions

Amazon has datacenters in different regions across the globe, including the USA, Ireland, South Africa, India, Japan, China, Korea, Canada, Germany, Italy, and Great Britain. You can select the region you want when creating a bucket. It is recommended that you select a region that is closest to you or to your customers to ensure lower network latency or to minimize costs (because the price for storing data differs depending on the region). Data stored in a certain AWS Region never leaves the datacenters of that region until you migrate the data manually. AWS Regions are isolated from each other to provide fault tolerance and stability.

Each region contains Availability Zones, which are isolated locations within an AWS Region. At least three Availability Zones are available in each region to prevent failures caused by disasters such as fires, typhoons, hurricanes, floods, and so on.

The Data Consistency Model

A read-after-write consistency check is performed for objects stored in Amazon S3 storage. Amazon S3 replicates data across servers and datacenters within the selected region to achieve high availability. After a successful PUT request, the changed data must be replicated across the servers. This process can take some time. A user can get the old data or the updated data in this case, but not corrupted data. This is also true for deleted objects and buckets. Object locking is not performed when new objects are sent to S3 buckets. The latest PUT request wins if multiple PUT requests are performed simultaneously. You can create your own application with a locking mechanism that works with objects stored in Amazon S3 storage.
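The "latest PUT wins" behavior can be simulated with a minimal in-memory model. This sketch orders concurrent writes by timestamp, which is a simplification of how S3 resolves simultaneous PUT requests to the same key:

```python
class LastWriterWinsStore:
    """Toy model: concurrent PUTs to the same key resolve to the latest one."""

    def __init__(self):
        self._data = {}  # key -> (timestamp, value)

    def put(self, key: str, value: bytes, timestamp: float) -> None:
        current = self._data.get(key)
        # Only keep the write with the latest timestamp (last writer wins)
        if current is None or timestamp >= current[0]:
            self._data[key] = (timestamp, value)

    def get(self, key: str) -> bytes:
        return self._data[key][1]

store = LastWriterWinsStore()
# Two "simultaneous" PUTs to the same key: the later one wins,
# and no reader ever sees a mix of the two values
store.put("report.txt", b"draft", timestamp=1.0)
store.put("report.txt", b"final", timestamp=2.0)
print(store.get("report.txt"))  # b'final'
```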

Amazon S3 Features

The object-based storage concept allows Amazon to provide useful features and high flexibility for storing and managing data in Amazon S3 storage. Let's review these features.

Versioning

Object versioning allows you to store multiple versions of an object in one bucket. This feature can protect objects stored in Amazon S3 storage against unintended editing, overwrites, or deletions. After changing or deleting an object, you can restore one of the previous versions of that object. Versioning is implemented thanks to the object storage approach. You can use versioning for archival purposes. Versioning is disabled by default.

A version ID is assigned to each S3 object even if versioning is not enabled (in this case, the version ID value is set to null). If versioning is enabled, a new version ID value is assigned to a new version of the object after writing changes. Versioning is enabled at the bucket level. The version ID value of the first version of the object remains the same. When you delete an object from an S3 bucket (with versioning enabled), a delete marker is applied to the latest version of the object.
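The versioning behavior described above — version IDs, restorable previous versions, and delete markers — can be sketched with a small in-memory model. Version IDs here are sequential integers for readability; real S3 version IDs are opaque strings:

```python
import itertools

class VersionedBucket:
    """Toy model of an S3 bucket with versioning enabled."""

    DELETE_MARKER = object()

    def __init__(self):
        self._versions = {}  # key -> list of (version_id, value or DELETE_MARKER)
        self._ids = itertools.count(1)

    def put(self, key, value):
        vid = next(self._ids)
        self._versions.setdefault(key, []).append((vid, value))
        return vid

    def delete(self, key):
        # A delete places a marker on top; older versions remain restorable
        vid = next(self._ids)
        self._versions.setdefault(key, []).append((vid, self.DELETE_MARKER))

    def get(self, key, version_id=None):
        versions = self._versions[key]
        value = versions[-1][1] if version_id is None else dict(versions)[version_id]
        return None if value is self.DELETE_MARKER else value

bucket = VersionedBucket()
v1 = bucket.put("notes.txt", b"first draft")
bucket.put("notes.txt", b"second draft")
bucket.delete("notes.txt")
print(bucket.get("notes.txt"))      # None: the latest version is a delete marker
print(bucket.get("notes.txt", v1))  # b'first draft': an older version is restorable
```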

Versioning in Amazon S3 storage

Storage classes

Amazon S3 storage classes define the purpose of the storage selected for keeping data. A storage class can be set at the object level. However, you can set a default storage class at the bucket level for objects that will be created in the bucket.

S3 Standard is the default storage class. This class is hot data storage and is good for frequently used data. Use the Standard storage class to host websites, distribute content, develop cloud applications, and so on. High storage costs, low restore costs, and fast access to the data are the characteristics of this storage class.

S3 Standard-IA (Infrequent Access) can be used to store data that is accessed less frequently than in S3 Standard. S3 Standard-IA is optimized for a longer storage duration. There is a charge for retrieving data stored in the S3 Standard-IA storage class. Additionally, in both S3 Standard and S3 Standard-IA you have to pay for data requests (PUT, COPY, POST, LIST, GET, SELECT).

S3 One Zone-IA is designed for infrequently accessed data. Data is stored in only one availability zone (data is stored in three availability zones for S3 Standard), and as a result, a lower redundancy and resiliency level is provided. The declared level of availability is 99.5%, which is lower than that of the other two storage classes. S3 One Zone-IA has lower storage costs and higher restore costs, and you have to pay for data retrieval on a per-GB basis. You can consider this storage class a cost-effective way to store backup copies or copies of data made with Amazon S3 cross-region replication.

S3 Glacier doesn't offer instant access to stored data, unlike the other storage classes. S3 Glacier can be used to store data for long-term archival at a low cost. There is no guarantee of uninterrupted performance. You need to wait from a few minutes to a few hours to retrieve the data. You can transfer old data from a higher-class storage (for example, from S3 Standard) to S3 Glacier by using S3 lifecycle policies and reduce storage costs.
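A lifecycle rule like the one just described — moving old objects from S3 Standard to S3 Glacier — is expressed as a lifecycle configuration. The sketch below follows the JSON shape accepted by the AWS CLI's `put-bucket-lifecycle-configuration` command; the rule ID, prefix, and day counts are hypothetical values chosen for illustration:

```json
{
  "Rules": [
    {
      "ID": "archive-old-logs",
      "Status": "Enabled",
      "Filter": { "Prefix": "logs/" },
      "Transitions": [
        { "Days": 90, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": 365 }
    }
  ]
}
```

With this rule, objects under the logs/ prefix would transition to S3 Glacier 90 days after creation and be deleted after a year.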

S3 Glacier Deep Archive is similar to S3 Glacier, but the time needed to retrieve the data is about 12 to 48 hours. The price is lower than the price for S3 Glacier. The S3 Glacier Deep Archive storage class can be used to store backups and archival data of companies that follow regulatory requirements for data archival (financial, healthcare). It is a good alternative to tape cartridges.

S3 Intelligent-Tiering is a special storage class that uses other storage classes. S3 Intelligent-Tiering is intended to automatically select a better storage class for data when you don't know how frequently you will need to access it. Amazon S3 monitors data access patterns when you use S3 Intelligent-Tiering and then stores the objects in one of the two selected storage classes (one for frequently accessed data and another for rarely accessed data). This approach gives you optimal cost-effectiveness without compromising performance. For instance, if you access an object stored in the storage class for infrequently accessed data, the object is moved automatically to the storage class for frequently accessed data. Conversely, if an object has not been accessed for a long time, the object is moved to the storage class for infrequently used data. Objects can be located in the same bucket, and the storage class is changed at the S3 object level.

Storage classes in Amazon S3 cloud storage

Access control lists

An access control list (ACL) is a feature used to manage and control access to objects and buckets. Access control lists are resource-based policies that are attached to each bucket and object to define the users and groups that have permissions to access the bucket or object. By default, the resource owner has full access to a bucket or object after creating the resource. Bucket access permissions define who can access objects in the bucket. Object access permissions define the users who are permitted to access objects and the access type. You can set read-only permissions for one user and read-write permissions for another user, for example.

The complete list of users who can have permissions (a user who has permissions is called a grantee):

Owner – the user who creates a bucket/object.

Authenticated Users – any users who have an AWS account.

All Users – any users, including anonymous users (users who don't have an AWS account).

User by E-mail/ID – a specified user who has an AWS account. The email address or AWS ID of the user must be specified to grant access to this user.

Available types of permissions:

Full Control – this permission type provides the Read, Write, Read ACP, and Write ACP permissions.

Read – allows listing the bucket content when applied at the bucket level. Allows reading the object data and metadata when applied at the object level.

Write – can be applied only at the bucket level and allows creating, deleting, and overwriting any object in the bucket.

Read Permissions (READ_ACP) – a user can read the permissions of the specified object or bucket.

Write Permissions (WRITE_ACP) – a user can overwrite the permissions of the specified object or bucket. Enabling this permission type for a user is equal to granting Full Control, because the user can set any permissions for their account. This permission is available to the bucket owner by default.

S3 storage AWS - permissions and access control lists

Bucket policies

Bucket policies are resource-based AWS Identity and Access Management (IAM) policies that are used to create conditional rules for granting access permissions to AWS accounts and users when accessing buckets and objects in buckets. You can use bucket policies to define security rules for more than one object in a bucket.

The bucket policy is defined as a JSON file. The bucket policy configuration text must meet JSON format requirements to be valid. A bucket policy can be attached only at the bucket level and is inherited by all objects in the bucket. You can grant access to users who are connecting from specified IP addresses, users of specified AWS accounts, and so on.

Below you can see an example of a policy that grants full access to all users of one account and read-only access to every user of another account.

{
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "SGWS": "95381782731015222837"
      },
      "Action": "s3:*",
      "Resource": [
        "urn:sgws:s3:::blog-bucket01",
        "urn:sgws:s3:::blog-bucket01/*"
      ]
    },
    {
      "Effect": "Allow",
      "Principal": {
        "SGWS": "30284200178239526177"
      },
      "Action": "s3:GetObject",
      "Resource": "urn:sgws:s3:::blog-bucket01/shared/*"
    },
    {
      "Effect": "Allow",
      "Principal": {
        "SGWS": "30284200178239526177"
      },
      "Action": "s3:ListBucket",
      "Resource": "urn:sgws:s3:::blog-bucket01",
      "Condition": {
        "StringLike": {
          "s3:prefix": "shared/*"
        }
      }
    }
  ]
}
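Because a malformed bucket policy is rejected, it is worth validating the JSON locally before applying it. A quick check with Python's standard json module (the policy text below is abbreviated for brevity; any syntactically valid policy works the same way):

```python
import json

policy_text = """
{
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "urn:sgws:s3:::blog-bucket01/shared/*"
    }
  ]
}
"""

try:
    policy = json.loads(policy_text)
    print("Valid JSON with", len(policy["Statement"]), "statement(s)")
except json.JSONDecodeError as err:
    print("Invalid policy JSON:", err)
```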

Users can access Amazon S3 storage by using access keys (Access Key ID and Secret Access Key) without entering a username and password. This approach allows you to enhance security and is used to create applications that use APIs to access Amazon S3 cloud storage.

APIs for Amazon S3

Amazon provides application programming interfaces (APIs) to access S3 functionality and develop your own applications that work with Amazon S3 storage. There are REST and SOAP interfaces provided by Amazon. The REST interface uses standard HTTP requests to work with buckets and objects. Standard HTTP headers are used by the REST API. The SOAP interface is another available interface. Using SOAP over HTTP is deprecated, but you can still use SOAP over HTTPS.

The Paying Model

Amazon S3 provides the "pay only for what you use" model. No minimum fee is required – you don't need to pay for a predetermined amount of storage and network traffic. These are the usage categories you must pay for:

Storage. Pay for objects stored in Amazon S3. The amount of money you have to pay depends on the used storage space, the time objects are stored in Amazon S3 (during the month), and the storage class used by the stored objects.

Requests and data retrieval. You must pay for requests made to retrieve data stored in Amazon S3 cloud storage.

Data transfer. You must pay for all used bandwidth (inbound and outbound traffic) except for: incoming data from the internet; outgoing data transferred to Amazon EC2 instances located in the same AWS Region as the source S3 bucket; and data outgoing from an S3 bucket to CloudFront.

Management and replication. You must pay for using storage management features such as analytics and object tagging. Amazon charges for cross-region replication and same-region replication.

Use the AWS Pricing Calculator to estimate your payments.
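The pricing categories above can be combined into a simple monthly-cost sketch. The per-GB and per-request rates below are placeholders, not real AWS prices — actual rates vary by region and storage class, so check the AWS Pricing Calculator for current figures:

```python
def estimate_monthly_cost(storage_gb, put_requests, get_requests, egress_gb,
                          storage_rate=0.023,      # hypothetical USD per GB-month
                          put_rate=0.005 / 1000,   # hypothetical USD per PUT request
                          get_rate=0.0004 / 1000,  # hypothetical USD per GET request
                          egress_rate=0.09):       # hypothetical USD per GB transferred out
    """Rough cost model: storage + requests + outbound data transfer."""
    return (storage_gb * storage_rate
            + put_requests * put_rate
            + get_requests * get_rate
            + egress_gb * egress_rate)

cost = estimate_monthly_cost(storage_gb=100, put_requests=10_000,
                             get_requests=500_000, egress_gb=50)
print(f"Estimated monthly cost: ${cost:.2f}")
```

Note that inbound transfer is deliberately absent from the model, since incoming data from the internet is free, as described above.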


Determination

Amazon S3 cloud storage is a good solution for high-load platforms that require a high level of reliability, scalability, and accessibility. However, Amazon S3 storage is also affordable for individual users. This blog post has explained how Amazon S3 works, its features, how data is stored in S3, and how you can use this service. Amazon S3 is object-based storage that provides versioning, data redundancy, access management, and security options for data management.