Automation, the Cloud and the Quest for Greater Reliability

By Arthur Cole

The rule of thumb at most enterprises is to keep mission-critical data and applications on safe, secure, internal, even non-virtual infrastructure. Back-office processing, bulk storage and other low-level activities can certainly benefit from the scalability and high utilization that the cloud offers, but some things are just too important to put at such risk.

Keeping your most valued possessions close to the vest is a time-honored tradition, of course, but these days most people still put their money in the bank rather than under the mattress. For IT, then, it might be time to reassess what we think we know about cloud reliability.


Is it possible that the cloud is more dependable than private infrastructure? CloudTweaks.com’s Walter Bailey claims recent studies show that the cloud industry overall is delivering 99.95 percent reliability, compared to 98.5 percent for traditional data centers. In downtime terms, that is roughly a 30-fold reduction: about 130 hours per year versus fewer than five. The impression that the cloud is unreliable stems largely from the high-profile outages at Amazon and other top providers that put numerous services in the dark at once. But like media coverage of plane crashes, these outages tend to skew public perception because they highlight the consequences of failure, not the fact that it is exceedingly rare.
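The arithmetic behind those figures is easy to check: an availability percentage maps directly to expected annual downtime. A quick sketch:

```python
# Convert an availability percentage into expected downtime hours per year.
HOURS_PER_YEAR = 24 * 365  # 8,760 hours in a non-leap year

def downtime_hours(availability_pct: float) -> float:
    """Expected annual downtime for a given availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)

cloud = downtime_hours(99.95)       # about 4.4 hours per year
traditional = downtime_hours(98.5)  # about 131 hours per year
print(f"cloud: {cloud:.1f} h/yr, traditional: {traditional:.1f} h/yr, "
      f"ratio: {traditional / cloud:.0f}x")
```

At 98.5 percent availability, 1.5 percent of 8,760 hours is lost each year, which is where the 130-hour figure and the 30x ratio come from.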

The ideal, of course, is 100 percent uptime. But as with most ideals, this one is completely unrealistic, says Spanning Cloud Apps’ Mike Pav. In the first place, there are no absolute guarantees for anything in life, and secondly, the time and expense needed to produce perfect or even near-perfect performance would outweigh the losses incurred by a few hours’ downtime per year. A much better strategy, then, is not to avoid outages at all costs, but to streamline the recovery process as much as possible for when the inevitable does happen.

Of course, that would require greater reliance on that perpetual bane of IT staff: automation. As tech blogger Jasmine McTigue points out, computers are much more adept at keeping the data flowing than humans, particularly in highly dynamic, virtual environments. But rather than look at automation as the enemy, IT would do better to recognize that job responsibilities would shift from management and maintenance of actual infrastructure to establishing policy and governance up front. In this way, IT still plays a vital role in the enterprise, and overall performance improves because much of the recovery work has been done ahead of time.
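That shift — IT writes the policy ahead of time, the machine executes the recovery at 3 a.m. — can be pictured in a few lines. This is only an illustrative sketch: `check_health` and `restart` are hypothetical stand-ins for whatever monitoring and orchestration hooks a given environment actually provides, and the policy values are made up.

```python
# Sketch of policy-driven automated recovery: the policy (retry limit,
# check interval) is defined up front by IT; the loop performs the
# routine restart-and-verify work a technician would otherwise do by hand.
import time

# Hypothetical policy, set ahead of time during governance planning.
POLICY = {"max_restarts": 3, "check_interval_s": 30}

def recover(service: str, check_health, restart) -> bool:
    """Restart a failing service up to the policy's limit.

    Returns True once the service reports healthy; returns False when
    the policy is exhausted, signaling that a human should be paged.
    """
    for _attempt in range(POLICY["max_restarts"]):
        if check_health(service):
            return True  # healthy again, nothing more to do
        restart(service)
        time.sleep(POLICY["check_interval_s"])
    return False  # automation gives up; escalate to on-call staff
```

The point is not the specific loop but the division of labor: the judgment calls (how many retries, when to escalate) are encoded once, in advance, while the repetitive recovery steps run without waking anyone up.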

Indeed, automation is no more a threat to process control than encryption is to security, echoes UC4 CEO Jason Liu. In fact, the entire data environment can only be enhanced through a robust automation platform, and that will result in superior business performance. Without question, this will change some long-standing roles in the data center as IT operations evolve into development operations (DevOps) and ultimately business operations (BizOps). As countless industries have discovered before, though, automation is least disruptive when it is welcomed and leveraged rather than resisted.

Service interruptions are never fun. They usually happen in the middle of the night, forcing overworked, bleary-eyed technicians into quick action to make things right. No amount of automation or preparedness will change that, but with a little forethought it is possible to minimize the damage.

And ultimately, that will allow the industry to view things like reliability and uptime in terms of performance, rather than the consequences of a single event.

This article was originally published on 2013-02-22