Delivering Hadoop as a Service
There's clearly a lot of interest in big data applications in general, and in Hadoop in particular as the vehicle through which those applications will be delivered. In the age of the cloud, however, solution providers may want to consider how much Hadoop expertise they really need to play in the market.
In general, database as a service (DBaaS) delivered via the cloud is gaining momentum, so it should come as no surprise to find that Hadoop is also being delivered as a service.
Case in point is big data specialist Qubole, which recently raised $13 million in new funding.
Employing Hadoop as a service eliminates the need for solution providers and their customers to invest heavily in mastering Hadoop infrastructure and the skills needed to manage it, and that should free up more money for the building of big data applications, said Qubole CEO Ashish Thusoo.
Of course, Qubole is not the only vendor addressing this need. IBM and Oracle both provide Hadoop as a service alongside other database management platforms.
One thing that differentiates Qubole is that it takes advantage of public clouds, such as Microsoft Azure, Google Compute Engine and Amazon Web Services (AWS), to keep costs down, while at the same time allowing IT organizations to apply policies that regulate how their data is actually used.
In addition, the service itself supports applications written in both Python and Java that are readily accessible via RESTful application programming interfaces (APIs). Thusoo said that Qubole is already processing more than 83 petabytes of data per month.
Given that volume of data, what solution providers across the channel must keep an eye on in 2015 is the degree to which databases move into the cloud. Once a database moves to the cloud, the management of that database becomes the responsibility of the cloud service provider. That means customers no longer have to acquire that infrastructure and they don't have to hire a database administrator (DBA) to manage it, especially if that data resides inside Hadoop or some other type of NoSQL database.
Given data compliance issues, it's unclear just how much data management software will move into the cloud. But not having to hire a DBA is a powerful motivator for a lot of organizations that are trying to reduce their total cost of IT.
As such, solution providers should be paying attention to just how much data is going into the cloud, because once the laws of data gravity take hold, it won't be long before everything else in the IT solar system starts to orbit around where that data is located. In effect, as more data moves into the cloud, so too will the applications that need to access that data without being penalized by the network overhead and latency of reaching it remotely from an on-premises system.
Michael Vizard has been covering IT issues in the enterprise for 25 years as an editor and columnist for publications such as InfoWorld, eWEEK, Baseline, CRN, ComputerWorld and Digital Review.