Data Warehousing: At Home or in the Cloud
The most significant challenge when dealing with Big Data is scale. How can the enterprise possibly build out the infrastructure needed to compile and analyze all that information without sending the budget into a tailspin?
This conundrum has been the primary driver of data warehousing solutions over the past decade or so, with HP, IBM, Teradata and others hoping to arrive at the magic formula that accommodates both IT and the financial office.
Earlier this month, however, Amazon threw its hat into the ring with the Redshift data warehouse service, a cloud-based solution that the company says costs only a tenth as much as traditional on-premises platforms. On the surface, it would seem that the cloud is an ideal place to locate a warehouse. Not only is the scalability question relatively easy to solve, but the design, deployment and even integration issues that plague traditional systems can be greatly alleviated. Redshift is operated completely through the AWS Management Console, which many enterprises already employ for the platform’s multitude of services, and can be had for less than $1,000 per TB per year with a three-year contract.
Traditional platforms employ the cloud as well, however, and firms like Teradata say there are many operational benefits to building in-house solutions. The company has teamed up with Alteryx, for example, to enable easily customizable analytics applications that function within the Unified Data Architecture and Big Data analytics framework. Alteryx specializes in software that pinpoints competitive advantages and revenue opportunities that executives would otherwise miss, offering enterprises the ability to turn Big Data from a liability to a strategic asset.
At the moment, data warehousing is still a new phenomenon to the IT industry, but that is likely to change relatively quickly, according to Gartner. The company’s latest Magic Quadrant for data warehousing and database management estimates that Big Data will become the normal data environment by 2015 and be completely subsumed into workaday IT processes by 2018. Within that time, expect to see warehousing platforms, whether on-premises, appliance-based or in the cloud, learn not just to handle larger data loads, but to handle them faster.
Of course, no one can tell the individual enterprise exactly what it needs to analyze its own data, and vendors, service providers, systems integrators and the like are all pressured to sell products rather than devise elegant solutions. However, new software releases like WhereScape 3D can help organizations devise their own solution based on real-world requirements. The program features a range of modeling, scoping and sizing tools that use the enterprise’s own data as base metrics. In this way, it helps avoid the risk of over- or under-provisioning the warehouse before the project has hit the deployment phase.
If Big Data analytics and warehousing are to become standard operating procedure within the next five years, as Gartner contends, then the time is certainly now for system specs and general requirements to be hashed out. Analytics is one of those things that can be difficult to quantify on paper but prove to be invaluable when pressed into service.
And since most organizations have already devoted huge sums of money to the care and feeding of data, it would be a shame not to take full advantage of it.