We love Amazon Web Services. We've been using EC2 and Amazon's ever-increasing number of services since way back in 2008, and we depend on them to deliver TextIt and RapidPro hosting to our customers.
But one thing with AWS is that it isn't always clear which strategy gets you the best performance for your money. Should you buy reserved instances? Light or heavy? What size instance should you get? What kind of storage on that instance? Even once you figure out the lay of the land, things are often changing, so the right choice six months ago may not be the right choice anymore. That's what happened to us.
We use Amazon's Relational Database Service (RDS) to host PostgreSQL. The ease of setting up a multi-AZ instance that will fail over automatically, always be on the latest version and have bulletproof backups is well worth the added cost over managing those databases ourselves. When we first started scaling, we ponied up to reserve some large instances and, at Amazon's recommendation, also decided to buy provisioned IOPS to guarantee the performance of the disk on those machines.
Last week, we realized that was a mistake, both from a performance and a budget point of view.
See, provisioned IOPS don't come cheap. On a multi-AZ deployment, Amazon charges 20c per IOPS per month, with a minimum of 1,000 IOPS, so you are spending a minimum of $200/mo to guarantee your database can read and write at that rate. If you have, say, a 100 GB database (also the minimum), then you are spending $20/mo on storage and $200/mo to guarantee performant access to it. That's pretty crazy.
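To make that arithmetic concrete, here is a minimal sketch of the bill. The rates are just the multi-AZ figures quoted above (the 20c per GB storage rate is implied by $20/mo for 100 GB), and the function name `provisioned_monthly_cost` is ours for illustration, not part of any AWS API. Check current AWS pricing before relying on the numbers.

```python
# Multi-AZ provisioned IOPS rates as quoted in this post, kept in cents
# so the arithmetic stays exact. These are assumptions, not live prices.
PIOPS_CENTS_PER_IOPS = 20   # 20c per provisioned IOPS per month
PIOPS_CENTS_PER_GB = 20     # implied by $20/mo for 100 GB of storage
MIN_IOPS = 1000             # minimum IOPS you can provision
MIN_STORAGE_GB = 100        # minimum storage size


def provisioned_monthly_cost(storage_gb, iops):
    """Return the monthly bill in dollars for provisioned IOPS storage."""
    if storage_gb < MIN_STORAGE_GB or iops < MIN_IOPS:
        raise ValueError("below the minimums quoted in the post")
    cents = storage_gb * PIOPS_CENTS_PER_GB + iops * PIOPS_CENTS_PER_IOPS
    return cents / 100


if __name__ == "__main__":
    # The smallest possible setup: 100 GB with 1,000 IOPS.
    print(provisioned_monthly_cost(100, 1000))
```

Of that total, the IOPS guarantee is ten times the cost of the disk itself.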
Turns out that if your load tends to be peaky, you now have a much better option: general purpose SSD drives. Amazon covers the specifics in their nitty-gritty documentation, but the gist is that for every GB of storage you buy, you get 3 base IOPS for free, so for 100 GB of space you get 300 "provisioned" IOPS at no extra charge.
Now obviously that might not be good enough for your database load, but Amazon has a neat trick: they credit you whenever your DB is using fewer IOPS than your guaranteed rate and let you spend those credits to burst above it. This, combined with the much more reasonable price of 23c per GB per month for multi-AZ, means you can splurge on your total disk size and get both some great savings and better performance than you would using provisioned IOPS. (Oh, and more disk!)
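The credit mechanism is easier to reason about with numbers. The sketch below assumes the figures from Amazon's documentation at the time of writing: a burst ceiling of 3,000 IOPS and a bucket of 5.4 million I/O credits. Those two constants are not from this post, and the function names are ours, so double check them against the current docs.

```python
GP2_BASE_IOPS_PER_GB = 3      # base rate described above
GP2_BURST_IOPS = 3000         # assumed burst ceiling (from Amazon's docs)
GP2_MAX_CREDITS = 5_400_000   # assumed size of the I/O credit bucket


def baseline_iops(storage_gb):
    """Base IOPS you get just for buying the storage."""
    return storage_gb * GP2_BASE_IOPS_PER_GB


def burst_minutes(storage_gb, credits=GP2_MAX_CREDITS):
    """Minutes of full-speed burst a given credit balance can sustain."""
    base = baseline_iops(storage_gb)
    if base >= GP2_BURST_IOPS:
        return float("inf")  # baseline already meets the burst ceiling
    # While bursting, credits drain at (burst rate - baseline) per second.
    return credits / (GP2_BURST_IOPS - base) / 60


if __name__ == "__main__":
    for size_gb in (100, 250):
        print(size_gb, baseline_iops(size_gb), round(burst_minutes(size_gb), 1))
```

Buying more disk helps twice: the baseline goes up, and a full bucket lasts longer because it drains more slowly.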
In our case, we decided to upgrade to 250 gigs of space and switch to a general purpose SSD drive. That gave us base IOPS of 750, somewhat less than our previous max of 1,000 IOPS, but now with the ability to burst to 3,000 IOPS for up to 40 minutes at a time. Not only that, but our bill for storage on our instance went from $220/mo to less than $60/mo, even though we now have two and a half times the capacity.
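For anyone who wants to redo the comparison with their own sizes, here is a small sketch of the before and after. It uses only the multi-AZ rates quoted in this post (which may well have changed since), and the helper names are ours for illustration.

```python
# Rates quoted in this post, in cents per month. Assumptions, not live prices.
GP2_CENTS_PER_GB = 23       # general purpose SSD, multi-AZ
PIOPS_CENTS_PER_GB = 20     # provisioned IOPS storage, multi-AZ
PIOPS_CENTS_PER_IOPS = 20   # per provisioned IOPS, multi-AZ


def gp2_monthly_cost(storage_gb):
    """Monthly dollars for general purpose SSD storage."""
    return storage_gb * GP2_CENTS_PER_GB / 100


def piops_monthly_cost(storage_gb, iops):
    """Monthly dollars for provisioned IOPS storage."""
    return (storage_gb * PIOPS_CENTS_PER_GB + iops * PIOPS_CENTS_PER_IOPS) / 100


if __name__ == "__main__":
    before = piops_monthly_cost(100, 1000)   # our old setup
    after = gp2_monthly_cost(250)            # our new setup
    print(f"before ${before:.2f}/mo, after ${after:.2f}/mo, "
          f"saving ${before - after:.2f}/mo")
```

With our numbers that works out to $57.50/mo against the old $220/mo.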
For our usage patterns, which involve long periods of very little happening punctuated by very heavy writes and reads, this is perfect, as we now can burst far faster than we could before and have plenty of time to recover credits for the next event. A quick graph of our RDS IOPS makes that clear.
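If you want to check whether your own pattern fits, a rough simulation of the credit bucket is enough. This is a simplified model under stated assumptions, not Amazon's exact accounting: credits drain at (demand - baseline) per second while you are above baseline, refill at (baseline - demand) per second while you are below it, and the 3,000 IOPS ceiling and 5.4 million credit bucket come from Amazon's documentation at the time of writing. The workload segments are made up for illustration.

```python
GP2_BASE_IOPS_PER_GB = 3
GP2_BURST_IOPS = 3000         # assumed burst ceiling
GP2_MAX_CREDITS = 5_400_000   # assumed credit bucket size


def simulate(segments, storage_gb, credits=GP2_MAX_CREDITS):
    """Walk a workload of (seconds, demanded_iops) segments.

    Returns (credits_left, throttled_seconds), where throttled_seconds
    is time spent wanting more than baseline with an empty bucket.
    """
    base = storage_gb * GP2_BASE_IOPS_PER_GB
    throttled = 0.0
    for seconds, demand in segments:
        demand = min(demand, GP2_BURST_IOPS)  # cannot exceed the ceiling
        if demand > base:
            drain = demand - base
            burst_seconds = min(seconds, credits / drain)
            credits -= burst_seconds * drain
            throttled += seconds - burst_seconds
        else:
            refill = seconds * (base - demand)
            credits = min(GP2_MAX_CREDITS, credits + refill)
    return credits, throttled


if __name__ == "__main__":
    # Hypothetical peaky day: a 20 minute flat-out burst, then 2 quiet hours.
    peaky = [(20 * 60, 3000), (2 * 60 * 60, 50)]
    print(simulate(peaky, storage_gb=250))
```

For a workload like the hypothetical one above, the bucket never empties and is full again before the next burst. A load that stays pinned above baseline for hours would end up throttled instead, which is where provisioned IOPS still make sense.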
So check your logs and your current usage. You might be able to save yourself a bundle by moving away from provisioned IOPS storage and using general purpose SSD storage for your RDS instances instead.