This is the second post in my AWS Storage Series, and this time we will talk about AWS Glacier. AWS Glacier is a secure, durable, and extremely low-cost storage service for data archiving and long-term backup. The best part about Glacier is its cost: you can archive data for as little as $0.004 per GB per month. On top of that, it comes with all the benefits of a managed service. You don’t have to worry about hardware provisioning, capacity planning, and so on; AWS takes care of all of that.
You can store any type of data in Glacier, including photos, videos, and documents. Glacier stores your data as archives. For instance, you can compress all your data into a zip file and then upload that zip file as a single archive. An individual archive can be as large as 40 TB. Each archive is assigned a unique archive ID, and you can use this ID to retrieve the data at any point. A collection of archives is called a vault. Vaults are used to organize your data, and you can have up to 1,000 vaults per AWS account per region.
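To make the "zip everything, then upload one archive" idea concrete, here is a minimal sketch that bundles a few files into a single in-memory zip and checks it against the 40 TB limit mentioned above. The function names and sample file names are hypothetical, and treating 40 TB as 40 × 2^40 bytes is an assumption. The upload itself is left out; with boto3, the resulting bytes could be passed as the `body` of the Glacier client's `upload_archive` call, which returns the archive ID.

```python
import io
import zipfile

# Maximum size of one Glacier archive (40 TB, per the text above).
# Interpreting "TB" as 2**40 bytes is an assumption of this sketch.
MAX_ARCHIVE_BYTES = 40 * 1024**4


def bundle_files(files):
    """Pack a {name: bytes} mapping into one in-memory zip payload."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    payload = buffer.getvalue()
    if len(payload) > MAX_ARCHIVE_BYTES:
        raise ValueError("bundle exceeds the 40 TB archive limit")
    return payload


def list_bundle(payload):
    """Return the file names stored inside a zip payload."""
    with zipfile.ZipFile(io.BytesIO(payload)) as zf:
        return zf.namelist()


if __name__ == "__main__":
    # Hypothetical sample data standing in for real photos and documents.
    bundle = bundle_files({"photos/cat.jpg": b"fake-jpeg-bytes", "notes.txt": b"hello"})
    print(len(bundle), "bytes:", list_bundle(bundle))
```

Bundling many small files into one archive also keeps the number of upload requests down, since each archive is uploaded and later retrieved as a unit.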
You can secure the data stored in AWS Glacier using Identity and Access Management (IAM) policies, and you can grant access to a specific group of users with vault-level access policies. In addition to restricting user access, all the data that you upload to Glacier is encrypted on the server side using the 256-bit Advanced Encryption Standard (AES-256) block cipher. AWS handles key management and key protection for you. Another advantage of Glacier is that archives are immutable: you can create an archive, but you cannot update it afterwards. This immutability, together with the AWS CloudTrail integration, lets you monitor if and when a particular archive was accessed, and by whom.
In terms of durability, Glacier follows the same guidelines as S3. All the data stored in Glacier is replicated to multiple devices in multiple facilities before you get a successful write confirmation. By doing this, AWS provides a durability of 99.999999999%, which is the same as Amazon S3 Standard. As mentioned in my previous blog post, Glacier integrates with S3 lifecycle rules. You can have lifecycle policies that move your data from S3-IA to Glacier or even directly from S3-Standard to Glacier. Data imported into Glacier using S3 lifecycle policies can only be seen using the S3 management console or S3 APIs and not through the Glacier management console.
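As a rough illustration of such a lifecycle policy, the sketch below builds a rule that moves objects to S3-IA first and to Glacier later. The helper name, rule ID, prefix, and day counts are all hypothetical; the dictionary follows the shape that boto3's S3 `put_bucket_lifecycle_configuration` accepts as its `LifecycleConfiguration` argument, and no AWS call is made here.

```python
import json


def build_lifecycle_config(prefix, ia_after_days, glacier_after_days):
    """Build an S3 lifecycle configuration that tiers objects down to Glacier."""
    if glacier_after_days <= ia_after_days:
        raise ValueError("the Glacier transition must come after the S3-IA one")
    return {
        "Rules": [
            {
                "ID": "archive-to-glacier",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


if __name__ == "__main__":
    # Example values only: S3-IA after 30 days, Glacier after 90 days.
    print(json.dumps(build_lifecycle_config("logs/", 30, 90), indent=2))
```

To go straight from S3-Standard to Glacier, you would drop the `STANDARD_IA` entry and keep only the `GLACIER` transition.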
Since Glacier is a data archival service, you have to pay AWS to retrieve your data. When you submit a retrieval request, a temporary copy of the data is placed in S3-RRS storage and made available for a limited amount of time. There are three options for retrieving data from AWS Glacier.
- Standard Retrievals: This allows you to access any archive within 3-5 hours. You are charged a flat fee of $0.01 per GB and $0.05 per 1000 requests.
- Bulk Retrievals: This is the lowest cost retrieval option. You can access your data within 5-12 hours. Bulk retrievals are charged a flat fee of $0.0025 per GB and $0.025 per 1000 requests.
- Expedited Retrievals: This is the quickest form of data retrieval from Glacier. You can access your data within 1-5 minutes. You are charged a flat fee of $0.03 per GB and $0.01 per request.
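To compare the three options side by side, here is a small sketch that applies the prices listed above to a given retrieval. The function and table names are my own, and the prices are simply the ones quoted in this post, so treat them as assumptions that may differ by region or change over time.

```python
# Retrieval prices quoted in the list above (USD); assumed, not authoritative.
RETRIEVAL_PRICING = {
    "standard": {"per_gb": 0.01, "per_request": 0.05 / 1000},
    "bulk": {"per_gb": 0.0025, "per_request": 0.025 / 1000},
    "expedited": {"per_gb": 0.03, "per_request": 0.01},
}


def retrieval_cost(tier, gigabytes, requests):
    """Estimate the cost in USD of retrieving data with the given tier."""
    price = RETRIEVAL_PRICING[tier]
    return gigabytes * price["per_gb"] + requests * price["per_request"]


if __name__ == "__main__":
    # Example: retrieve 500 GB spread over 1,000 requests with each tier.
    for tier in RETRIEVAL_PRICING:
        print(f"{tier}: ${retrieval_cost(tier, 500, 1000):.3f}")
```

For that example, the estimate is roughly $5.05 for standard, about $1.28 for bulk, and $25.00 for expedited. Notice that expedited is dominated by the per-request fee when you retrieve many small archives.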
Even if you never retrieve your data, you still pay Amazon for uploading and storing it: $0.004 per GB per month for storage and $0.05 per 1,000 upload requests. Overall, that makes Glacier a highly cost-efficient data archival service. Keep in mind that Glacier also gives you the flexibility to retrieve either an entire archive or just a selected range of an archive.
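A quick back-of-the-envelope calculation shows how these two charges add up. This is only a sketch using the prices quoted in this post; the function name and the example workload are hypothetical.

```python
# Storage and upload prices quoted above (USD); assumed, not authoritative.
STORAGE_PER_GB_MONTH = 0.004
UPLOAD_PER_1000_REQUESTS = 0.05


def archive_cost(gigabytes, months, upload_requests):
    """Estimate the cost in USD of uploading data once and storing it."""
    storage = gigabytes * STORAGE_PER_GB_MONTH * months
    uploads = upload_requests / 1000 * UPLOAD_PER_1000_REQUESTS
    return storage + uploads


if __name__ == "__main__":
    # Example: 1,000 GB uploaded in 2,000 requests and kept for 12 months.
    print(f"${archive_cost(1000, 12, 2000):.2f}")
```

In this example the storage charge is about $48 for the year and the upload requests add about $0.10, so storage dominates unless you upload a very large number of tiny archives.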
Next, let’s talk about another feature that Glacier offers, called Vault Lock. Vault Lock allows you to enforce compliance requirements. For instance, hospitals are required to store all patient data for a period of 7 years. In addition to storing data for long periods of time, you can also enforce controls such as “Write Once Read Many” (WORM) in a vault lock policy. A vault lock policy is different from a vault access policy: vault lock policies stop users from changing or deleting archives in your vault, while vault access policies control who can access your vault. Using vault access policies, you can also enable cross-account access to your vaults. You can use both kinds of policy together to define who can access your vaults and which archives cannot be deleted.
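To give a feel for what a vault lock policy looks like, here is a sketch that builds one for the 7-year hospital example: it denies `glacier:DeleteArchive` for any archive younger than the retention period, using the `glacier:ArchiveAgeInDays` condition key. The helper name, statement ID, vault ARN, and account number are placeholders, and counting 7 years as 7 × 365 days is an assumption.

```python
import json


def build_vault_lock_policy(vault_arn, retention_days):
    """Build a vault lock policy that blocks deletion of young archives."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "deny-delete-before-retention",  # hypothetical name
                "Principal": "*",
                "Effect": "Deny",
                "Action": "glacier:DeleteArchive",
                "Resource": [vault_arn],
                "Condition": {
                    "NumericLessThan": {
                        "glacier:ArchiveAgeInDays": str(retention_days)
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ARN; 7 years approximated as 7 * 365 days.
    arn = "arn:aws:glacier:us-west-2:123456789012:vaults/examplevault"
    print(json.dumps(build_vault_lock_policy(arn, 7 * 365), indent=2))
```

A vault access policy uses the same JSON structure, but typically with `Allow` statements naming the principals (including other accounts) that may use the vault.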
If you want to learn more about AWS Glacier, you can use the following links:
And if you want to read my previous blog on AWS S3, use the following link: