Data surfing with big data

The usual big-data story leaves out crucial bits. We hear about the “what” — big, huge data of all kinds. We hear about the “when” — now and coming soon. We hear about the “how” — Hadoop with helpers. But we almost never hear about the “who” and the “why.” Who’s bothering to analyze all this data, and why?

If we believe big data’s usual, small-bore spokespeople, the whole thing is little more than getting a big enough machine to crunch mountains of data. But if that’s all it is, then all we have is warmed-over business analytics. As I’ve endured minutes upon minutes of Hadoop-speak, I’ve often grumbled to myself that if there’s anything more to this story, I sure wish someone would cough it up.

Finally, someone has. In July, Colin White, president of BI Research, and Harriet Fryman, director of business analytics at IBM, gave a refreshing presentation at the annual Pacific Northwest BI Summit, held in Grants Pass, Oregon. Yes, big data’s a big deal.

Though Colin and Harriet listed nine conclusions, I derived two of my own. First, big-data analytics can become more than a cost; it can become a profit center and an asset. Second, high resolution is a better way to think of big data’s function than any other I’ve heard.

Unlike most speakers, they supported their thoughts with actual cases — the most interesting of which was Sears, the stumbling brick-and-mortar chain. In April its big data operation opened its doors to non-competing retailers as a subsidiary, MetaScale, whose sole purpose is to help find meaning in big data. A cost center has become a new business.

Like most big deals in business, this one echoes the past: the story of Southern Pacific railroad’s phone system. During its long domination of the West and Southwest, the railroad had built a vast telephone system. By the mid-’70s, a few train crews actually had early mobile phones, each bigger than a pork loin. Then cash tightened in the late ’70s, and that cost center became an asset when it became Sprint’s foundation.

The data itself should be thought of differently. Big data’s per-unit value is lower; volume and per-unit value are inversely related. While we groom and pore over transactional data, with big data we throw the stuff around with shovels. Its value lies in bulk, because the value shows up in patterns. Much of that big data, in fact, may end up discarded.

Dare I compare big data to TV viewing? Faced with either one, we may glance, evaluate, and in a blink decide to discard one sample and dwell on the next. With a remote in hand, we say we’re “channel surfing.” A comparable willingness to load data and discard it could change the whole game of analytics. “‘Here’s the data. Go play.’ ‘Because I can’ isn’t a good reason in data warehousing,” Jill Dyché said, “but in big data it’s perfectly OK.” That, she added, is a “game changer.”

To sense big data’s potential, we may again think of television. The early, vacuum-tube powered TV was monochrome and a little weird. A novelty, but a poor substitute for radio. But by the ’60s, the picture had cleared up and by the mid-’60s shows were seen “in living color.” People hurried home to catch new episodes. Today, we’ve got HD on iPads, and the effects are still unfolding. It’s been video all along, but each improvement changed applications profoundly.

If all today’s experts can do is describe big data in terms of tools they know, of course it will sound like little more than new and improved BI. Big data dares us to think much, much bigger than that. It may challenge our tools for the moment, but in the long run it’s a bigger challenge to our imagination.


  1. I still remember our first color television set. My dad built it. It was a Heathkit, and he must’ve labored over it for more than a year. He finally finished it…in 1977, I believe? I remember — that very afternoon — going from Tom & Jerry in black and white to Tom & Jerry in color.

    It blew my mind.

    But we didn’t get our color TV until the mid-’70s. Color CRT (to say nothing of color broadcast) technology had been available for a couple of decades (and more) at that point. Hell, Hogan’s Heroes had switched over to color almost a decade beforehand. (I say this because, had either Hogan’s Heroes or F-Troop still been on the air in the early to mid-’70s, we likely would’ve gotten color TV much more quickly. My dad, you see, was a big fan.)

    The allure of color TV was can’t-miss: to see it was to want it. But as I understand it — I was a wee toddler at the time, so I’ve had to do a bit of reconstruction — there were oodles of (mostly practical/economic) market barriers to adoption: the prohibitive cost of early color television sets, the (comparatively) slow adoption of color film by the major networks, the even slower adoption of color broadcast technology by network affiliates, and the lack of international color broadcast standards. The upshot, then, is that even though color TV was that most authentically compelling of creations, it took more than 20 years for it to gain mass acceptance.

    This isn’t to take issue with your column at all. I think you’re spot on. Things likewise move much faster these days than they did 40 years ago. So the practical/economic barriers that functioned to limit color TV adoption will rapidly get paved over — or bulldozed. The Sears case is likewise a compelling one. It’s one of the few such compelling cases we’ve yet heard. It also gets at what’s likely to be big data’s biggest revenue-generating role for the foreseeable future: i.e., a boon(doggle) to the services and integration industries. In this case, Sears’ internal efforts with Hadoop did target a critical line-of-business need. They did result in substantial ROI, according to Colin. Their long-term revenue-generating function, as you point out, is in the services or integration space. It’s in helping other big data tyros get up to speed on Hadoop.

    So it’s kinda circular, in this respect. We’re starting to hear more about big [data] analytics as a more imaginative or compelling — actually transformative — proposition: e.g., the idea of matching events in transaction data with signals — with events, trends, anomalies or other kinds of information — in the big data broadcast (to play upon the term in its original sense). But this is a relatively new development. And I haven’t yet seen — or spoken to — anyone who’s actually doing this. This is analogous to mainstream adoption (at both the network affiliate and consumer levels) of color television. This is the use case that — should it pan out (and it does have its detractors) — has the broadest and potentially most compelling applicability.

    But I’d wager that even in this age of bulldozing down barriers, it’s going to take some time.

    In the interim, we’ve got noise. Lots and lots of noise. And not simply in our (accumulating, agglomerating) volumes of big data! Which is to again underscore your point: there’s an almost stupefying herd-think associated with big data: it’s l’big data pour l’big data: you need to do big data because you have big data. Well, yeah. Except I’ve had big data for decades now. The typical big data discussion — as discussion — completely ignores the historicity of big data itself. That’s one reason, I think, you’ve thus far been deaf to its charms: because — as it’s commonly described — it doesn’t have so much as a stitch of charm.

    I think we’re likewise being too insular. What we’re calling big data is going to change everything, as I’ve said to you. It’s going to transform how we know and understand — how we perceive — the world. But it won’t do so on the basis of what we today understand by (or intend to designate with) the term “big data.” As Clint Eastwood’s William Munny tells Gene Hackman’s dying Little Bill Daggett in the best damn scene in **Unforgiven**: “Deserve’s got nothing to do with it.”

    Big data, as many of us in the BI/DW space describe it, has got little to do with What’s Happening; it’s got almost nothing to do with What Is to Come.

  2. Dan Murray says:

    The key to the value of big data is the ability to probe it in ways that yield new insights that become information to be acted upon. Without that – we are just playing with ourselves.

    The tough part about selling the idea is that you really need to have made insights with “little data” before you can confidently invest in big data platforms. Most people haven’t experienced this type of discovery.

  3. Tarun Loomba says:

    Ted – nice article…the piece that struck me on your analogy was not so much time to adoption, but the utilization of the technology. Initially, TV (b&w) was merely replicating what was on the radio – and indeed it was a poor substitute as you mention. It was only after the content (and to some extent the technology) ramped to leverage the new nature of the medium that it was differentiated and worth the cost…

    In DW/BI/Analytics, the analogue here is that the new technologies should be focused not on replacing what the existing infrastructure does well, but what it (the new stuff) does uniquely well – and as you note, in use case-speak, not techno-speak. This will enhance what businesses can leverage and thereby deliver value much beyond capabilities today (as Colin and Harriet’s examples showed).

    Of course, this leads us to another one of the topics we discussed at the Summit – now that you have these tools with unique and specialized capabilities, how do you unify them to the user so they only see the integrated value…likely a different blog topic, though ☺
