IDG News Service
Hey, it must be hard to be the only person on the planet who doesn’t understand big data.
Actually, that’s far from true: You’re in good company. While Gartner finds that 64 percent of enterprises are investing in big data, a similar chunk (60 percent) don’t have a clue as to what to do with their data.
The real problem isn’t one of technology, but of process. The key to succeeding with big data, as in all serious IT investments, is iteration. It’s not about Hadoop, NoSQL, Splunk, or any particular vendor or technology. It’s about iteration.
Big data, big confusion
Though the number of companies embracing big data projects has grown since 2012 — from 58 percent of enterprises surveyed to 64 percent — the level of understanding of exactly what to do with that data hasn’t kept pace, as the Gartner data suggests.
This isn’t all that surprising, given how hard it is to pull money from data. It’s easy to say “actionable insights,” but far harder to glean them. That’s why data scientists currently outearn people in most other professions, with an average salary of $123,000, a figure that continues to go up.
Those who do data science well blend statistical, mathematical, and programming skills with domain knowledge, a tough combination to find in any single person. Of these, I’d argue that domain knowledge matters most as it leads to the process of getting value from data, as Gartner analyst Svetlana Sicular hints:
Organizations already have people who know their own data better than mystical data scientists …. Learning Hadoop is easier than learning the company’s business. What is left? To form a strong team of technology and business experts and supportive management who create a safe environment for innovation.
That “safe environment for innovation” is one that affords data practitioners room to iterate.
Innovation is iteration
There are at least two major problems with big data projects. The first is that many companies consider them, well, projects. Big data isn’t a one-off project: It’s a culture of collecting, analyzing, and using data. As Phil Simon, author of “Too Big to Ignore: The Business Case for Big Data,” told me: “Do you think that Amazon, Apple, Facebook, Google, Netflix, and Twitter do? Nope. It’s part of their DNA.”
How it becomes part of a company’s DNA, however, points to the second problem that trips up companies getting into big data: They think it’s a technology issue. While most great big data technology is open source, building out a big data application isn’t as simple as downloading Hadoop or the NoSQL database of your choice. As IDC analyst Carl Olofson highlights:
Organizations should not jump too quickly into committing to any big data technology, whether Hadoop or otherwise, as their solution to a given problem, but should consider all the alternatives carefully and develop a strategy for big data technology deployment.
Such careful consideration happens by iterating. Rather than paying a mega-vendor a mega-check to get started (do this, and you are absolutely doing big data wrong), the right approach is to start small. The trick is to fail fast or, as Thomas Edison put it, “I have not failed. I’ve just found 10,000 ways that won’t work.”
Big data is all about asking the right questions, hence the importance of domain knowledge. But in reality, you’ll probably fail to collect the right data and to ask pertinent questions — over and over again. The key, then, is to use flexible, open data infrastructure that allows you to continually tweak your approach until it bears real fruit.