Monday, November 12, 2012

Amazon Glacier – Digging Into Peak Hourly Retrieval Rate

As the Amazon Glacier FAQ clearly states, Glacier is designed for infrequent retrieval. In other words, excessive retrieval will make a clear dent in your monthly bill – and this comes in the form of something called the Peak Hourly Retrieval Rate. There are other components to the Glacier retrieval pricing, but they are straightforward and in line with other Amazon services.

Let us say you have 250 GB of data stored in Glacier and the month has 30 days. You get to retrieve 5% of your data per month for free, which in a 30-day month works out to about 0.17% of your data, or 0.42 GB, per day, or 17.8 MB per hour. Anything more than this will be added to your bill based on your peak hourly retrieval rate.
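The free allowance above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in this post: the function name is made up for illustration, the 5% figure is the one quoted above, and it assumes 1 GB = 1024 MB, which is what the 17.8 MB figure implies.

```python
# Free retrieval allowance, using the figures quoted in this post.
STORED_GB = 250.0
FREE_MONTHLY_FRACTION = 0.05  # 5% of stored data per month
DAYS_IN_MONTH = 30
MB_PER_GB = 1024              # assumption: binary megabytes


def free_allowance_gb_per_hour(stored_gb, days_in_month=DAYS_IN_MONTH):
    """Spread the monthly free allowance evenly over days, then hours."""
    daily_gb = stored_gb * FREE_MONTHLY_FRACTION / days_in_month
    return daily_gb / 24


hourly_gb = free_allowance_gb_per_hour(STORED_GB)
print(f"{hourly_gb * 24:.2f} GB per day")
print(f"{hourly_gb * MB_PER_GB:.1f} MB per hour")
```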

Let us start by investigating the rate you would get if you had a dumb or generic backup client. You want to restore your 250 GB of data, and the backup client will request all of it from Glacier at once. If I understand Glacier correctly, the minimum processing time would be 4 hours. So let us use this number – leaving you with an hourly retrieval rate of 62.5 GB. As previously mentioned, you would get about 17.8 MB per hour for free, which won’t matter much at this volume, so let us ignore it.

With a peak hourly retrieval rate of 62.5 GB, this hour will cost you 1 cent per GB, leaving you at 62.5 cents. The bad news is that this rate, or rather your maximum rate, will be the basis for all hours within the active month. In our scenario, this is 30 days or 720 hours, leaving us with a line item of 450 dollars. Obviously, this will get a lot more expensive if your dataset is larger.
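As a sanity check, here is the same calculation as a small Python sketch. It follows my reading of the billing model described above (peak hourly rate times price times hours in the month) and ignores the free allowance, as the text does. The function and constant names are my own, and the 1 cent price is the figure used in this post.

```python
PRICE_PER_GB = 0.01    # USD per GB of peak hourly rate (figure used in this post)
HOURS_IN_MONTH = 720   # 30 days * 24 hours


def peak_retrieval_fee(total_gb, retrieval_hours, hours_in_month=HOURS_IN_MONTH):
    """Fee when total_gb is retrieved evenly over retrieval_hours.

    The peak hourly rate is billed for every hour of the month.
    The free allowance is ignored here.
    """
    peak_gb_per_hour = total_gb / retrieval_hours
    return peak_gb_per_hour * PRICE_PER_GB * hours_in_month


fee = peak_retrieval_fee(250, 4)
print(f"Peak rate: {250 / 4} GB per hour, fee: {fee:.2f} dollars")
```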

One important note before we move on – retrieval rate is not the same as download rate. If you request 250 GB at once and it takes 4 hours to process, you will be billed at a peak rate of 62.5 GB per hour, even if it then takes you 3 weeks to download the data!

A smart backup client would be able to give you options more in line with your needs and, more importantly, within your download speed. There is little sense in pulling out 62.5 GB per hour if your connection can’t handle it – and I guess it can’t for most of us. If you set your backup client to spread the restore evenly over 24 hours, you would be looking at around 10.4 GB per hour, which should be possible on a fast broadband connection. Ignoring your free hourly usage, you would pay around 75 dollars for this. Or, if you have plenty of time, you could spread it over a week and retrieve 1.5 GB per hour, leaving you with a fairly manageable 11 dollars.
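To compare the options, the sketch below computes the hourly rate, the resulting fee and the download speed you would need to keep up, for a 4 hour, 24 hour and one week restore. As before, the free allowance is ignored and the names are made up for illustration. The bandwidth column assumes 1 GB = 1024 MB and a perfectly even download, so treat it as a rough guide only.

```python
PRICE_PER_GB = 0.01    # USD per GB of peak hourly rate (figure used in this post)
HOURS_IN_MONTH = 720   # 30-day month
MB_PER_GB = 1024       # assumption: binary megabytes


def spread_restore(total_gb, spread_hours):
    """Return (GB per hour, fee in USD, Mbit/s needed to keep up)."""
    rate_gb_per_hour = total_gb / spread_hours
    fee_usd = rate_gb_per_hour * PRICE_PER_GB * HOURS_IN_MONTH
    mbit_per_s = rate_gb_per_hour * MB_PER_GB * 8 / 3600
    return rate_gb_per_hour, fee_usd, mbit_per_s


plans = {hours: spread_restore(250, hours) for hours in (4, 24, 168)}
for hours, (rate, fee, mbit) in plans.items():
    print(f"{hours:>3} hours: {rate:5.1f} GB/h, {fee:6.2f} dollars, {mbit:6.1f} Mbit/s")
```

The fee scales inversely with the time you allow for the restore, so doubling the restore window halves this line item.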

Another thing to note is that if you are retrieving small amounts of data relative to your total storage, you would be able to keep it within your free allowance – or achieve the same with a complete restore if you just have enough time.
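How much time is "enough time"? A rough estimate, assuming the free allowance keeps being calculated from the full 250 GB for the whole restore and that you retrieve at exactly the free hourly rate (both simplifications of mine), looks like this:

```python
def hours_to_restore_for_free(restore_gb, stored_gb,
                              free_fraction=0.05, hours_in_month=720):
    """Hours needed to retrieve restore_gb without exceeding the free hourly rate.

    Assumes the allowance is based on stored_gb and does not change
    during the restore.
    """
    free_gb_per_hour = stored_gb * free_fraction / hours_in_month
    return restore_gb / free_gb_per_hour


hours = hours_to_restore_for_free(250, 250)
print(f"{hours:.0f} hours, or about {hours / 24:.0f} days")
```

At 5% per month this comes to about 20 months for a complete restore, so in practice the free allowance only helps for small, partial restores.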

I guess the moral of the story is that Glacier is great for storing a lot of data – but you should get a backup client that understands the Glacier billing model and gives you enough information to let you make the right decisions and avoid a huge charge on your credit card. If such a client is not available to you, you should at least try to restore manually in reasonably sized chunks – but this is hardly a comfortable restore scenario.

I would also like to stress that in addition to your peak hourly usage fee you would also be looking at a transfer fee and a requests fee. The transfer fee for 250 GB is 12 cents per gigabyte, with 1 GB free every month. This leaves you with about 30 dollars that you need to add to your bill. I would assume your requests charge would be less than 5 dollars at this volume, maybe less than 1 dollar. If you have a look at how much your initial transfer in cost you in requests, it should give you a workable number.

In short, if you get your restore spread out over 24 hours, you would pay around 110 dollars for 250 GB – which should be within reach in the event of total data loss.
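Putting the pieces together, the total can be sketched as below. The prices are the ones used in this post, the 5 dollar requests fee is my upper guess from above rather than a real figure, and the function name is hypothetical.

```python
def restore_bill(total_gb, spread_hours, requests_fee=5.0):
    """Rough total restore cost in USD, using the figures from this post."""
    # Peak hourly rate billed for all 720 hours of a 30-day month,
    # at 1 cent per GB (free allowance ignored).
    retrieval = total_gb / spread_hours * 0.01 * 720
    # Transfer out: 12 cents per GB, first GB of the month free.
    transfer = max(total_gb - 1, 0) * 0.12
    # requests_fee is a guessed upper bound, not a measured value.
    return retrieval + transfer + requests_fee


total = restore_bill(250, 24)
print(f"About {total:.0f} dollars")
```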
