To Get Cloud Economics Right, Think Small, Very, Very Small
A startup that wishes to remain anonymous is delivering an innovative new business service from an IaaS cloud and, most of the time, pays next to nothing to do so. This isn't a story about pennies per virtual server per hour (sure, they take advantage of that) but about a nuance of cloud optimization any enterprise can follow: reverse capacity planning.
Most of us are familiar with the black art of capacity planning. You take an application, simulate load against it, trying to approximate the amount of traffic it will face in production, then provision the resources to accommodate this load. With web applications we tend to capacity plan against expected peak, which is very hard to estimate – even if historical data exists. You capacity plan to peak because you don't want to be overloaded and cause the client to wait, error out, or go to your competition for your service.
And if you can't accommodate a peak you don't know how far above the capacity you provisioned the spike would have gone. If you built out the site to accommodate 10,000 simultaneous connections and you went above capacity for five minutes, was the spike the 10,001st connection or an additional 10,000 connections?
But of course we can never truly get capacity planning right, because we don't actually know what peak will be, and even if we did, we probably wouldn't have the budget (or data center capacity) to build out to that degree.
The downside of traditional capacity planning is that while we build out the capacity for a big spike, that spike rarely comes. And when there isn't a spike a good portion of those allocated resources sit unused. Try selling this model to your CFO. That's fun.
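To put rough numbers on that idle capacity, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than data from a real deployment: the hourly load figures, the 1,000-connections-per-instance capacity, and the function names are all made up for the example. It simply compares the instance-hours you pay for when you provision for peak all day against the instance-hours the load actually required.

```python
import math

def instances_needed(load, per_instance_capacity):
    """Smallest number of instances that can serve `load` connections."""
    return math.ceil(load / per_instance_capacity)

def peak_provisioned_hours(hourly_load, per_instance_capacity):
    """Instance-hours when capacity is fixed at the highest hourly load."""
    peak = instances_needed(max(hourly_load), per_instance_capacity)
    return peak * len(hourly_load)

def elastic_hours(hourly_load, per_instance_capacity, min_instances=1):
    """Instance-hours when capacity follows the load hour by hour."""
    return sum(
        max(min_instances, instances_needed(load, per_instance_capacity))
        for load in hourly_load
    )

# Hypothetical day: 22 quiet hours and a 2-hour spike (assumed numbers).
hourly_load = [500] * 22 + [10_000] * 2
capacity = 1_000  # assumed connections one instance can handle

fixed = peak_provisioned_hours(hourly_load, capacity)
elastic = elastic_hours(hourly_load, capacity)
idle_share = 1 - elastic / fixed
print(f"peak-provisioned: {fixed} instance-hours, elastic: {elastic}, idle share: {idle_share:.1%}")
```

With these assumed figures, the bulk of the peak-provisioned instance-hours go unused, which is the number your CFO will ask about.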
Cloud computing platforms – platform as a service (PaaS) and infrastructure as a service (IaaS) offerings to be specific – provide a way to change the capacity planning model. For scale-out applications you can start with a small application footprint and incrementally adjust the number of instances deployed as load patterns change. This elastic compute model is a core tenet of cloud computing platforms, and most developers and IT infrastructure and operations professionals get this value. But are you putting it into practice?
To get the most value from the cloud, you should approach capacity planning in the opposite way, because the objective in the cloud is to give the application as small a base footprint as possible. You don't preallocate resources for peak – that's the cloud's job. You plan for the minimum hourly bill because you know you can call up more resources when you need them, provided you have the appropriate monitoring regime and can provision additional IaaS resources fast enough.
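As a rough illustration of what "plan for the minimum, scale on demand" might look like in a monitoring loop, here is a small Python sketch. The function name, the 25% headroom default, and the capacity figure are assumptions for the example, not any provider's API. The headroom stands in for the time it takes new instances to come online: the slower your provisioning, the more of it you would want.

```python
import math

def desired_instances(current_load, per_instance_capacity,
                      base_footprint=1, headroom=0.25):
    """Instance count to request for the load seen by monitoring.

    base_footprint: the minimum you always keep running (assumed 1 here).
    headroom: extra fraction of capacity held to cover provisioning delay.
    """
    target_load = current_load * (1 + headroom)
    needed = math.ceil(target_load / per_instance_capacity)
    return max(base_footprint, needed)

# Hypothetical readings from a monitoring system (assumed numbers).
for load in (0, 800, 4_000, 4_100):
    print(load, "->", desired_instances(load, per_instance_capacity=1_000))
```

The point of the design is that the only number you plan in advance is `base_footprint`; everything above it is requested when the load shows up.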
In the case study we published on GSA's use of Terremark's The Enterprise Cloud, this is the approach GSA took for USA.gov. It determined the minimum footprint the site required and placed this capacity into traditional Terremark hosting under a negotiated 12-month contract. It then tapped into The Enterprise Cloud for capacity as traffic loads went up from there. The result: a 90% reduction in hosting costs.
You can do the same but don't stop once you've determined your base capacity. Can you make it even smaller? The answer is most likely yes. Because if there is a peak, then there is also a trough.
That's what the startup above has done. They tweaked their code to get this footprint even smaller, then determined that the minimum footprint they really needed persistently was their service's home page, and that could be cached in a content delivery network. So when their service has no traffic at all, they have no instances running on their IaaS platform. Once the cache has been hit, they fire up the minimal footprint (which they tuned down to a single virtual machine) and scale up from there.
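The startup hasn't shared its implementation, so the following is only a guess at the shape of the logic, written in Python with hypothetical names and numbers. It assumes the CDN can report recent hits on the cached home page and that you can count active sessions; neither detail comes from the startup.

```python
import math

def next_instance_count(cached_page_hits, active_sessions, per_instance_capacity):
    """Scale-to-zero sketch: how many instances should be running next.

    cached_page_hits: recent hits on the CDN-cached home page (assumed signal).
    active_sessions: sessions currently served by the IaaS instances.
    """
    # Trough: nobody has touched the cached page and nobody is logged in.
    if cached_page_hits == 0 and active_sessions == 0:
        return 0
    # Someone hit the cache: keep at least the single-VM minimal footprint,
    # and add instances as sessions grow.
    return max(1, math.ceil(active_sessions / per_instance_capacity))

print(next_instance_count(0, 0, 500))      # idle: nothing running
print(next_instance_count(3, 0, 500))      # first visitors: minimal footprint
print(next_instance_count(3, 1_200, 500))  # traffic builds: scale out
```

In practice the first visitor has to be held on the cached page long enough for that single virtual machine to start, so how fast your platform boots an instance decides whether this trick is workable for you.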
Now that's leveraging cloud economics. If you can find the trough, then you can game the system to your advantage. What are you doing to minimize your cloud footprint? Share your best tips here and we'll collect them for a future blog post and the forthcoming update to our IaaS Best Practices report.
Mike Gualtieri contributed to this report.