No, scalability may not be rocket science, but it is computer science, and it is not nearly as easy as it might appear.
In what might be considered an ironic statement, scalability in cloud computing environments is as much about decreasing capacity as it is increasing capacity.
I know, puts my knickers in a twist, too.
The description of “scalability” associated with cloud computing in almost every definition that’s put forth [1], however, clearly indicates the need for elastic scalability, and it is that modifier that makes all the difference in the world.
See, in the past we’ve just been concerned with managing growth, with addressing the need to increase capacity to match an increase in usage. It may have been slow and steady or explosive and instantaneous, but it was always about an increase in usage. We never really considered how to deal with a decrease and we certainly didn’t take away capacity once we’d allocated it.
Cloud computing, however, does assume that the latter is something we will – and should – do.
Cloud isn’t just about “pay for what you use”; it’s also about “use only what you need”, and thus transitive logic tells us [2] it should be “pay for what you need”. It’s about efficiency of utilization as much as it is about cost. And efficiency means using what you need, when you need it, but no more. Do not tie up resources when they aren’t needed; let someone else use them. It’s about the allocation of capacity in a way that makes sense, without waste.
The key to managing your elastic scalability is to employ a strategy that leverages minimums and maximums. To successfully implement such a strategy, however, you must first know what those are in terms of application capacity. You need to do some monitoring and capturing of data to determine what combination of resource utilization – at a minimum – is required for your application to (a) perform up to expectations while (b) being always available. The minimum is sometimes overlooked, but it deserves more scrutiny than the maximum. It’s easy to automatically launch new instances to increase capacity, but it’s not so easy to decrease capacity while simultaneously maintaining performance and availability. So you’ll need to know how “low you can go” and ensure that lower bound is not ignored when scaling down resources, just as you’ll want to know your financial limitations (if you have any) on launching new instances to ensure you don’t exceed them when scaling out.
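To make the minimum/maximum idea concrete, here is a minimal sketch in Python. It assumes a deliberately simple model in which every instance handles the same amount of load; the names `ScalingBounds`, `desired_instances`, and `capacity_per_instance` are hypothetical and do not come from any particular cloud provider's API.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalingBounds:
    # Assumption: both values come from your own monitoring and budgeting.
    min_instances: int  # fewest instances that still meet performance and availability goals
    max_instances: int  # most instances your budget allows


def desired_instances(current_load: float,
                      capacity_per_instance: float,
                      bounds: ScalingBounds) -> int:
    """Return the instance count for the current load, clamped to the bounds."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    # Instances needed if capacity were unconstrained.
    needed = math.ceil(current_load / capacity_per_instance)
    # Never scale below the floor or above the ceiling.
    return max(bounds.min_instances, min(needed, bounds.max_instances))
```

With bounds of 2 and 10 instances and 100 requests per second per instance, a load of 50 requests per second still yields 2 instances: the floor wins even though a single instance could nominally carry the load.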
You’ll also need to prioritize the factors in the equation. Is performance number one, or is it budget? Can you run over budget as long as your application is always available? Can you sacrifice performance in the name of staying under budget? This is important because even though a management framework or application delivery solution may provide the means by which you can implement such a strategy, you have to implement that strategy. Only you can set the priority of the variables required by the infrastructure and frameworks to act on the results. Only you can decide what’s important to you and your organization. That’s part of what’s meant by “control” and “management” of the cloud, and why some organizations are balking at its use: a lack of management and control.
When looking at the drawbacks of cloud, management (54%) and security (27%) are seen as the biggest inhibitors.
Bjarne Rasmussen [Chief Technology Officer & Senior Vice President EMEA, CA] went on “Forward thinking companies are today beginning to understand how to dynamically manage virtualised environments to deliver infrastructure as a service, giving greater flexibility and the ability to provision computing power on demand. This creates the internal cloud – ‘unleashing the power of virtualization.’”
-- European businesses take first steps toward Cloud Computing
The other part, the part you can’t control, is simply a matter of maturity.
You’ve got to ask some questions because this may not be rocket science, but it is computer science, and there are multiple factors that need to be considered. Consider that if availability is your number one priority and the cloud provider (or your own load balancing solution) can’t support quiescence (bleeding off connections), then you can’t decommission instances until all current sessions have concluded. That also means you can’t (or shouldn’t) direct new requests to an instance that is being retired, so that the users already connected to it can finish out their sessions. Which means you need to understand the way in which a cloud provider directs requests to scaled-up application instances: the very act of scaling down an application impacts the way in which it scales. This is why it’s critical that providers be more transparent about the capabilities of their infrastructure; that the infrastructure be composed of black boxes, not invisible boxes.
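The bleeding-off behavior can be sketched in a few lines of Python. This is an illustrative in-memory model only, not how any specific load balancer or provider implements it; `Instance`, `Pool`, and their methods are hypothetical names, and new requests are assumed to go to the least-loaded instance.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    active_sessions: int = 0
    draining: bool = False  # True once the instance is marked for removal


class Pool:
    def __init__(self, instances):
        self.instances = list(instances)

    def route(self) -> Instance:
        """Send a new request to the least-loaded instance that is not draining."""
        candidates = [i for i in self.instances if not i.draining]
        if not candidates:
            raise RuntimeError("no instance available for new requests")
        target = min(candidates, key=lambda i: i.active_sessions)
        target.active_sessions += 1
        return target

    def start_drain(self, instance: Instance) -> None:
        """Stop sending new requests; existing sessions are left alone."""
        instance.draining = True

    def end_session(self, instance: Instance) -> None:
        instance.active_sessions = max(0, instance.active_sessions - 1)

    def reap(self) -> list:
        """Remove and return draining instances whose sessions have all ended."""
        done = [i for i in self.instances
                if i.draining and i.active_sessions == 0]
        self.instances = [i for i in self.instances
                          if not (i.draining and i.active_sessions == 0)]
        return done
```

The point of the sketch is the ordering: an instance is first excluded from routing, and only removed once its session count reaches zero. While it drains, the remaining instances absorb all new requests, which is why scaling down changes how the application scales.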
It’s an intricate dance, this ability to monitor and manage application availability while maintaining performance all within the bounds of a budget, but it can be done with the right tools and with a strategy for managing applications that doesn’t stop at the virtual container. That’s one of the hallmarks of Infrastructure 2.0 and its collaborative nature – enabling the sharing of actionable data across the infrastructure and management frameworks to enable the dynamism folks expect (by definition) from cloud computing and automated virtualized data centers.
[1] If they don’t include elastic or rapid scalability in their definition of cloud, then someone is trying to sell you something akin to rack space in their Okefenokee Data Center.
[2] Assume pay=a and use=b and need=c, then a=b and b=c, therefore a=c.








