If you think you can do big data in-house, get ready for a lot of disappointment. If the data you want to analyze runs to terabytes, comes from multiple sources (streaming in from customers, devices, or sensors), and the insights you need are more complex than basic trending, you are probably looking for a data scientist or two. You probably have an open job requisition for a Hadoop expert as well, and have hit the limit on what your capital budget will let you buy to house all this data and the insights drawn from it. Thus you are likely taking a hard look at some cloud-based options to fill your short-term needs.
Well, get ready for your desire for in-house capacity and a staff of experts to go unfulfilled over the long term. Data scientists are a rare breed, and the bulk of Hadoop experts in the market are being snapped up quickly by companies building and offering MapReduce services, like Hortonworks, Cloudera and GoGrid. Unless you have IPO stock to offer or are an attractive acquisition candidate, you will likely find it hard to win this rare breed of employee, let alone hold onto the ones you already have. The financial realities and skills shortage around big data will push much of your pursuit of better customer insight and predictive applications toward pre-built big data services that reside in the cloud.
This growing reality is what prompted Mike Gualtieri and me to take a look at the burgeoning market of cloud-based big data services and help you make smarter decisions about which are best for your business. Through customer inquiries, we find that conventional wisdom suggests there are two types of big data cloud services: high-level SaaS-based reporting tools and deeply technical do-it-yourself platforms. The market is far more nuanced than this. Even at this early stage of market evolution, there are ample solutions aimed at a wide range of data analytics skill sets. Want to set up the environment yourself, exactly to your specifications? Yes, those services exist. Don’t want the hassle of setting up and managing the cluster but want to choose and tune the algorithms? Another set of services does this. Whether you are a scientist, DBA, coder, rapid application designer, or business intelligence professional, there are cloud-based solutions suited to your skill set that you can leverage in a pay-per-use fashion right now.
And nearly all of these services can be consumed via cloud economics: pay for what you use, only when you use it. How can you best choose the right solution?
Start with the business problem you are trying to solve before selecting a cloud service. The type of insights you need will dictate whether you must reinvent the wheel by building your own solution, or whether you can leverage best practices and already-conducted analysis that needs only a little customization to get to the answers you seek. Then map the workflows among your big data professionals that will be necessary to meet these business objectives.
Your analysis objectives may entail the DevOps team setting up an initial analysis that it then passes to coders or business professionals for further work. As such, don’t make this workflow a painful integration, migration, or transformation process. Where your data can reside and how easily it can be integrated with the cloud will dictate the final selection. Be mindful that you, not the cloud provider, ultimately control data placement, privacy and protection. This actually opens up more opportunities than it limits.
What solutions fall into which of these categories? Read the report today.