Month: August 2014

Got analytics? Who will promote the industry?

Business people have everything. They’ve got data, and often it’s clean. They’ve got tools, and many are easy to use. They’ve got visualizations, many of which help. They’ve got domain knowledge, at least most do. What some front line observers find they lack is analytical thinking.

Given descriptive data, few of the business users BI icon Claudia Imhoff sees ask even the simplest follow-up questions, like “why?” Is what the data shows good for the business or not? If not, what can be done?

Claudia, along with IBM marketing director Harriet Fryman, raised the question this summer at the Pacific Northwest BI Summit. This week I caught up with Claudia on the phone.

“The big elephant in the room is that they [business people] don’t know what analytics is,” she said.

What will it take to solve that? Education, of course. But by whom, I asked. Some education takes place within some organizations, but the quality and reach vary. Isn’t a broad, industry-wide program necessary? Doesn’t the industry need the equivalent of a “Got milk?” campaign?

You may remember the ads. To a variety of problems, milk was always the answer. The “Got data?” campaign would promote analytical thinking.

“I like that idea!” she joked. But really, perhaps analytics itself could use a boost.

The big question is, who can do it? As Claudia pointed out, we should count on no help from tool makers. The only answer they know is that you’re not trained on their interface.

“This isn’t a technical problem,” Claudia said. “It’s a business problem.”

Assuming her premise of analytical disinclination is valid — I can think of one BI anti-icon who would disagree — the ideal organization to lead such education and advocacy would have several characteristics: First, it would be well known already within the industry for training. Second, it would have relationships with vendors eager to support research. Third, it would have relationships with industry experts, both technical and business.

Will anyone step up?

First sip of Context Relevant hints at a winner

Most product vendors can talk for half an hour without even one specific case. But Stephen Purpura, CEO of the predictive analytics startup Context Relevant, actually had a story about something useful — the time his system used scant data to correctly point to a barely known winemaker.

Spotting the wine started with a sip, of course. But as you might imagine, Purpura then fed it to his prediction machine.

Context Relevant’s critical edge is speed, he explained. That comes from good caching; the better you cache, the faster you retrieve, and the faster you can work through questions and arrive at answers you can act on.

“I can almost guarantee the first question is wrong,” said Purpura. “The trick is how fast you adapt.” It’s analogous to arriving at a baseball game: Find the stadium, find the section, find the seat. You aim to do it all before the game begins.

Purpura said that the Context Relevant technology is finally, after two years, fulfilling the vision: to let the user ask a question and get an answer in the time it takes to take a sip of coffee. Today, “no one else comes even close to this,” he asserted.

Who is it that actually sits there asking and sipping? For new customers, it’s Context Relevant staff. But he said that with training provided, the customer quickly takes it on. Users on the customer side include anyone from analysts to project managers, quants, traders, data scientists, and IT pros. CR hides the details by default, but statisticians and others can dig in at will.

Context Relevant’s marketing website is lightweight, which I assume is deliberate. Purpura’s trying to reach upper level executives, who are best reached through social contacts. He said the best way to pique their interest is with stories told orally. He goes where they go, and so he serves on boards and attends lots of social events.

Wine is often part of it. At one event produced by the famous Cornell University restaurant school, he said, “this kid had incredible wine.”

The “kid” was Aaron Pott, years later named “winemaker of the year” by several publications. But back then, there was little data on him. He didn’t even have the metric many wine buffs look to for guidance, the Parker score.

Did the sparse data mean game over? Apparently not. Purpura’s machine found plenty in wine forums, including the Parker forum. Comments about wine that did have a Parker score often showed up alongside mentions of Pott’s wines. Mentions by certain individuals also indicated quality. So did Pott’s having worked at wineries that had strong scores.

To a non-technical type like me, sifting through all that sounds easy. But in fact it was, as Purpura put it, “a massive simulation problem embedded within a massive graph problem embedded within a regression problem with millions of parameters and millions of rows of data.” He said that it would be a struggle for most other companies today.

His system inferred great potential for Pott. Years later, in 2012, Food and Wine magazine named Pott winemaker of the year.

To put money on the company, I would need to know more, starting with what the system got wrong. What I do know is that this CEO does a much better job than most telling the story, talking about practical results and not just technology. (Very few industry analysts who like that stuff have any depth there anyway!) What his system did, in fact, makes sense. It’s what I would have tried to do manually.

Another good sign is that on May 20 Context Relevant scored $21 million in new funding. See The Wall Street Journal for details.

I look forward to more from Context Relevant.

BI industry builds tools for itself: Yellowfin CEO

Why aren’t the data industry’s tools more widely adopted? Data-industry experts have fretted for years over the estimated 5 percent penetration. Yellowfin CEO Glen Rabie has an explanation.

“We never contextualize applications,” he said at the recent Pacific Northwest BI Summit. “We always talk about the homogeneous product. We don’t know the consumer. We don’t tailor.”

We don’t talk about the “who.” Who uses our tools? And how are they used? The marketing rarely distinguishes one industry from another. In fact, even when you read closely, you can’t find a unique selling proposition.

Perhaps worse, says Glen, the tools don’t accommodate different modes of comprehension. Contrary to the market buzz, not everyone is basically visual. There’s also audio and, yes, words.

Take the group of lawyers he works with, for example. “A chart means nothing,” he says with a shrug. “But give them 1000 words, and that means something to them.”

When people in the data industry look around for reasons to explain the disappointing penetration into business, they should instead look within the industry. “We’re building the tools for ourselves,” he says, and that leaves out a lot of users. Tools assume too much analytical skill or at least too much motivation when the value can’t be demonstrated.

Instead, he looks to tools that aren’t even labeled “analytical.” When you plan a trip on Google Maps, for example, you see which route is fastest.

No, you’re not a data storyteller yet

The designer Stefan Sagmeister has a message for those who would presume to call themselves a “storyteller” when they’re not really: “No, fuckhead, you’re not a storyteller.” I agree, except for one word: yet. Many could become one.

Some of us in the data industry can identify. We’ve shouted something like this at our monitors as we’ve read about supposed “big data.” No, fuckhead, it’s just data! Now comes data storytelling, and we find ourselves shouting again.

Another annoyance is the idea that stories occur only in certain media. Sagmeister makes that mistake. “People who actually tell stories,” he says, “meaning people who write novels and make feature films…”

No, Sag, a story is just a structure. It has a few essentials that many subjects can satisfy. A data story — dismissed by some as vigorously as Sagmeister dismisses the roller-coaster designer — is a good and useful method for portraying analysis, involving non-analysts, and making conclusions stick.

Data storytelling is not an easy form to master — and there I agree with Sagmeister. “I’ve seen a number of films,” he says, “so I must be able to do one.”

“Of course this is the most stupidest thought ever. It’s like, ‘Oh, I watched the philharmonic. That’s why I’m a virtuoso violin player.’ You know, well, I’m not, even though I’ve watched a lot of philharmonic concerts.”

I’ve seen a lot of things called “data stories” that aren’t stories at all. Many have nice visualizations, and many are well annotated, and some have a useful narrative. The author loves data. But most are not stories.

To paraphrase Sagmeister, data storytelling — still a hopeful updraft for the data industry — has taken on “the mantle of bullshit.”

Beware: Not everyone who claims to be a storyteller is one.

See the video here.