Monday, May 30, 2011

Elasticness and Clouds

Amazon pretty much claimed the word "elastic" in computing when they launched their Elastic Compute Cloud (EC2) years ago. One of its key features is, unsurprisingly, that it is elastic: you use an API (ideally) or a web interface to provision resources on demand.

This works nicely, except when it doesn't. After years of experience with it, I have noticed that, in general, at the first sign of trouble the API starts failing: requests get dropped or time out. (Sometimes that is even an early warning sign of impending doom.)
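One common way to cope with a flaky provisioning API is to retry with exponential backoff. Here is a minimal sketch, assuming a hypothetical `provision` callable that raises a hypothetical `ProvisioningError` when a request is dropped or times out. Neither name comes from a real cloud SDK; swap in whatever your client library actually raises.

```python
import random
import time


class ProvisioningError(Exception):
    """Hypothetical error: the provisioning call failed or timed out."""


def provision_with_backoff(provision, attempts=5, base_delay=1.0,
                           max_delay=30.0, sleep=time.sleep):
    """Call provision() until it succeeds, backing off between failures.

    `provision` is any zero-argument callable (an assumption for this
    sketch) that returns a resource handle or raises ProvisioningError.
    """
    for attempt in range(attempts):
        try:
            return provision()
        except ProvisioningError:
            if attempt == attempts - 1:
                # Out of attempts: re-raise so the caller can fall back
                # to capacity it already holds.
                raise
            # Exponential backoff with random jitter, capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, delay))
```

Retries like this only smooth over short blips. If the API stays down for the length of an outage, the final exception is the cue to stop asking for new resources and make do with what is already running.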

It is worth noting that most clouds seem to have similar issues with their APIs: the APIs often don't get the same quality of service (QoS) that your servers get. This wouldn't normally be a problem, but a common strategy with infrastructure clouds is to rely on this elasticness (duh!) for day-to-day operations as well as for recovery. Frustratingly, this behaviour means you have to either accept the QoS limitation or plan around it by consuming extra resources ahead of time. The latter approach somewhat undoes the benefit of having a highly elastic API, but here we are anyway.
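To make "consuming extra resources ahead of time" concrete, here is a rough sketch of sizing a pre-provisioned buffer. The function name and the 20% default headroom are assumptions for illustration only, not a recommendation; the right figure depends on how long you expect the API to be unavailable and how fast your load can grow in that window.

```python
def instances_to_prewarm(peak_demand, currently_running, headroom_percent=20):
    """Return how many extra instances to start ahead of time.

    peak_demand: instances needed at the expected peak.
    currently_running: instances already up.
    headroom_percent: spare capacity to hold on top of the peak
        (hypothetical default, tune for your own workload).
    """
    # Integer ceiling of peak_demand * headroom_percent / 100,
    # avoiding floating-point rounding surprises.
    spare = (peak_demand * headroom_percent + 99) // 100
    target = peak_demand + spare
    # Never return a negative count if we are already over target.
    return max(0, target - currently_running)
```

With a peak of 10 instances and 8 already running, the 20% default gives a target of 12, so 4 more get started up front. Those are instances you pay for whether or not the spike ever arrives, which is exactly the trade-off described above.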

Somewhere there is a balance, but at the moment the big users of the public clouds are increasingly treating them as a less-than-elastic resource pool (look up Netflix and their use of Amazon for an example of this). I can't help but wonder whether this means APIs will fall out of favour for highly elastic workloads, or whether the QoS of these APIs will improve over time...

Posted via email from Michael's posterous