Stephen Few: data’s “harmful ways”

Visualization guru and data-industry skeptic Stephen Few has a worthwhile review of Weapons of Math Destruction by Cathy O’Neil.

Data can be used in harmful ways. This fact has become magnified to an extreme in the so-called realm of Big Data, fueled by an indiscriminate trust in information technologies, a reliance on fallacious correlations, and an effort to gain efficiencies no matter the cost in human suffering.

Read his review here.


Qlik road goes past white coated smart guys

An earlier version of this post, with a different conclusion and minor differences, appeared in late November.

Qlik CTO Anthony Deighton was drying his hands on the thick, almost cloth-like paper towels in the men’s room at the Miami Edition hotel at the recent Qlik analyst event. He heard another man in the room comparing the towels to some at a past hotel. There, the man said, the towels were thin. Deighton replied, “They were useless, like Tableau.”

Back in 2012, the T word was barely uttered at the gathering that year of industry analysts. Not long before, the bright and playful Tableau had just stung the plodding, script-laden Qlik in what felt like a surprise attack. This year, Qlik seemed to have regained its poise — and two dozen or so industry analysts gathered at the hotel with good paper towels to hear about the progress.

First, the analysts wanted to know about the buyout. As of late 2016, Qlik’s no longer publicly traded. CEO Lars Björk introduced Chip Virnig, a principal at the private equity firm that bought Qlik, Thoma Bravo, and now a Qlik board member. The buyout is “a very big bet” for the firm, he said, but it felt not only “safe” but also well positioned to thrive. Deighton, speaking afterward, praised the new “cloak of darkness” that frees management from an old distraction, public scrutiny.

Decline of “white-coated smart guys”

Qlik sees the end of BI “as a destination,” in which “white-coated smart guys” serve hapless data consumers. This is the beginning of BI “as a platform,” Deighton said, that feeds on a wide variety of data sources, whether on the cloud or under a desk, which then supplies bits of analysis to vertical applications.

You might imagine BI disappearing into everyday business. Applications will serve specific needs and embedded apps will weave into “real work” throughout the day. Deighton cited the Uber app, which is at first glance hardly a data-analysis tool. Only under the hood does the analysis show itself.

During a break afterward, a few analysts grumbled about Qlik’s road toward the cloud: “They’re late,” said one person. Later, others seemed to agree.

Does lateness matter? As if in response to the grumbles outside, Deighton declared, “I don’t care what competitors do. What really matters is ‘know thyself.’” Imitations usually compare poorly with the original. You’re better off knowing what you do best and doing it for all you’re worth. I agree.

Sticking to the be-who-you-are strategy, Qlik keeps to three known differentiators. One is the platform, and a second is its traditional fondness for governance, which has given Qlik an edge on Tableau.

The third differentiator is the troublesome one: the “associative experience.” The concept is easy: It answers not only the direct question, “What’s in this set?” but also the implied question, “What’s not in this set?” Hey boss, it might say, I know what you asked to see, but did you notice this over here?

Actual examples of the feature at work seem scarce. Many of the supposed proofs don’t prove anything: money saved, decisions made, and other fine outcomes that fail to demonstrate the feature at work. I can recall only one example that truly illustrates the value: IT consultant Don Marks, one of four Qlik customers flown in for this year’s UnSummit, told me in a one-to-one meeting about a fraud-prevention project at a bank. They had managed to suppress fraud in areas where it had occurred. But then Qlik Sense let them see it pop up in areas they hadn’t thought to look.

Tableau users I’ve heard from seem to think little of the feature. They compare it poorly to Tableau filtering, though Qlik argues that by definition filtering would have hidden the rebounding fraud.
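For readers who want the distinction spelled out, here is a minimal sketch of the idea, not Qlik’s engine or Tableau’s filter. The records, the region names, and the function names `filtered` and `associated` are all invented for illustration. A plain filter returns only what was asked for, while the associative-style view also reports what the selection left out.

```python
from collections import Counter

# Invented fraud records as (region, amount). Not real data.
RECORDS = [
    ("North", 120), ("North", 80), ("South", 40),
    ("East", 300), ("West", 15),
]

def filtered(records, selection):
    """Plain filter: keep rows in the selection; everything else disappears."""
    return [row for row in records if row[0] in selection]

def associated(records, selection):
    """Total amounts per region, split into selected and excluded regions."""
    included, excluded = Counter(), Counter()
    for region, amount in records:
        target = included if region in selection else excluded
        target[region] += amount
    return dict(included), dict(excluded)

# The regions the (hypothetical) bank was already watching.
watched = {"North", "South"}
inside, outside = associated(RECORDS, watched)
print("asked for:", inside)
print("not asked for:", outside)
```

In this toy data the largest total sits in a region nobody selected, which is the kind of thing the bank in Marks’s story says it would otherwise have missed.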


Deighton asked the assembled seers, “Does this resonate?” Well, sure. If you squint, you might even see his trends coming true already.

But what does it matter? What’s it matter that Qlik is, as some say, “late” to the cloud? What’s it matter that it can do some things better than Tableau or any other tool? Each constituency insists that its chosen tool is more useful. Each side trivializes the other’s advantages. Each one’s pitch to industry analysts assumes roughly the same trends. Only the emphasis varies.

My impression: Generally, Qlik seems to be building out to a bigger, bolder ecosystem. Its three differentiators — platform, governance, and the “associative” feature — contrast boldly with Tableau’s differentiators, which seem best expressed as flow, art, and expression.

Overall, Tableau looks like fun, and Qlik looks like work. Both are useful, but each tells a different story.

Which one would go on my short list depends on the type of organization. Qlik if users wanted a routine and relatively limited set of analyses to be used in a mature organization. Tableau if analyses had to be more free-ranging and used by more intellectual or creative people than the average business user, in a dynamic and creative organization.

Useful stories

Any paper towel is useful depending on location and intent. To compare brands, some might turn to the Handle-O-Meter, an actual machine developed by Johnson & Johnson to measure surface friction and flexibility — the same way that industry analysts like to add up features.

But the Handle-O-Meter is useless for judging a towel’s most important aspect: its message to the user. Does its plush, silky finish tell you that you’re a treasured guest worthy of comfort? Or does it say with its cheap, rapidly disintegrating fiber that you’d better hurry up and get out?

What do Qlik and Tableau tell the user? Tableau says, “Ask! Explore! Play!” which appeals to some cultures. Qlik says, “Be serious!” which appeals more to other cultures.

Deighton’s quip is fair enough. But whichever is more useful depends on who’s asking and why.


Malcolm Gladwell: why oral data’s different

Why would you present data orally instead of in print? You might think that if all you have is data, why bother with the sweaty palms? Just post the paper online and let people read it!

Not if you want to test your conclusions. Oral and written renditions have different effects, and elicit different responses.

Malcolm Gladwell told how he realized this on an always interesting podcast, the Ezra Klein Show.

Here’s Gladwell’s explanation almost verbatim, starting at 57:25.

My father taught math and would be constantly going off to conferences. I was always very skeptical.

I thought, What possible value is there for him to go and present a paper when they could just send someone the paper? It’s equations! Isn’t it just easier to just read the equation! My father’s going to Istanbul this summer because they all want vacations in Istanbul.

But now I realize that there is a reason there’s so much emphasis in academia on person-to-person oral presentation of arguments, data, etc. When it’s presented in oral form it’s so much easier to honor the conditionality of the work. To argue with it, to fix, to backtrack, to amend, to do all those things. The minute it’s on paper it has a kind of permanence and authority that maybe it doesn’t deserve.

…Whenever I read an academic paper, I try to imagine the author presenting it orally. And that helps me to not jump in head over heels with some of the conclusions. Just to imagine their voice when they came to the conclusions! In every conclusion in an academic paper, there’s a point where they recognize all the potential problems with the conclusion. This may not be true because of A, B, and C.

When you’re reading it, you skip over that. You know, “yeah, yeah, I’m not interested. I just want to know what the conclusion is.” But if they were presenting it, you know that would be a crucial point of the presentation. Everyone in the room is waiting for you to go through all the reasons it might not be true. That’s where the discussion’s going to begin.

So the very thing that’s of almost principal importance in an oral presentation is an afterthought when it’s on the page.


The six genres of data stories

This appeared originally on the TDWI site in September behind a paywall. It’s still there, but today they’ve had the 90 days of exclusive use that I agreed to.

Survey after survey reveals that about 80 percent of business users don’t use data analysis—despite all the marketing and “easy to use” tools.

As if in response to this sad showing, renowned author and academic Tom Davenport proclaimed that data scientists should know “data storytelling.” He’s right. Storytelling has transmitted knowledge and motivated action in every medium we’ve ever known. Stories around a fire, stone tablets, Gutenberg’s books, news, and e-books have all made use of stories. Data is a natural.

The data community lost no time swarming all over it. Trouble is, most of them seem to have heard “data” but not “story.” Even now, years into the data story trend, they still play mostly to each other with the only genre they seem to know, the parade of visualization—a waste of time for all but the already initiated.

It’s not so hard to reach non-data users with other genres, which are just sets of conventions that satisfy different audiences and moods. War movies, for example, deliver noise, action, and beefy male heroes. Romantic comedies deliver jokes, pastel scenery, and romance. Each genre satisfies different needs.

Here are six data story genres. The “naked data” genre seems to have become the default; search Google for “data story” and that’s what you find. Although the other five genres are barely recognized as data stories—I’ve never found any labeled “data story”—that is what they are.

Genre 1: Naked Data

The naked data genre lets data march alone. It is ideal for those who find data exciting. Search Twitter for #datastorytelling and this is the type you’ll find.

The naked data storyteller is like the host of a stone soup lunch. “Here’s the data,” guests are told. “Now make of it what you will.” The data-loving guests unpack the sack of knowledge they carry with them and apply it with their own curiosity and determination.

Genre 2: Narrated Data

Naked data transforms easily into the narrated data genre. The mother of them all is Hans Rosling’s 2006 rendition on childhood mortality around the globe. Rosling’s animated bubble chart now seems dated, but his presentation is timeless. His passionate narration explains the movement of the bubbles like a sports announcer at a football match. The data is more than interesting—it is thrilling. In a later instance, he told another story, this time not with computer visualizations but with pebbles on parking lot pavement. A parking lot space never looked so good.

Genre 3: Explainer

The explainer genre consists almost entirely of words. It uses one or two visualizations, if any.

The Upshot column in the New York Times makes frequent and effective use of this genre. A recent story on the U.S. economy, for example, runs about 800 words with a single, simple visualization. In “GDP Better Than It Looks,” the author explains that, although growth in the second quarter of 2016 was just 1.2 percent, this was almost entirely the result of a contraction in business inventories. That’s not a good predictor of future growth, according to the author. A much better rate of 2.4 percent shows up when looking at GDP excluding inventories, because final sales are a better measure of underlying growth. The bad news comes in shrinking investment and poor growth in productivity.

That offers plenty of data about GDP and is about as much as many people want to know or have time to think about.
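The arithmetic behind that explainer is short enough to check by hand. As a rough sketch, and assuming the contributions to growth simply add up (an assumption of mine, not a claim from the article), the two figures quoted above imply how much inventories dragged on the headline number. The function name is made up for illustration.

```python
# Figures quoted in the story above, in percent.
HEADLINE_GROWTH = 1.2
GROWTH_EX_INVENTORIES = 2.4

def implied_inventory_drag(headline, ex_inventories):
    """Percentage points inventories added (+) or subtracted (-).

    Assumes contributions to growth are additive, which is a simplification.
    """
    return round(headline - ex_inventories, 1)

drag = implied_inventory_drag(HEADLINE_GROWTH, GROWTH_EX_INVENTORIES)
print(f"Implied inventory contribution: {drag} percentage points")
```

On those two numbers alone, inventories account for a drag of about 1.2 percentage points, which is why the headline figure looks so much worse than final sales.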

Genre 4: Executive

This is for the executive suite. It is brief, perhaps just a minute long, and it may contain little data—sometimes none at all aside from footnotes that cite the underlying data.

A monthly report at financial services firm Charles Schwab, for example, is compressed into a 60-second story in several steps. First, a data analyst dives into the period’s data and comes up with questions and preliminary conclusions. Then a bigger group with representatives from marketing, HR, and other functions joins the discussion. Each person has a take on the period’s events and results.

From there, John F. Carter, senior vice president of analytics and business insight, distills the story further, as he described it to me. The presentation begins with the main conclusion, similar to news reports. Less is more, he explains, “but the right less.” Executives don’t have time to get into the weeds.

Another executive I spoke with—a veteran Silicon Valley CFO who requested anonymity—dismissed “the illusion of certainty that numbers provide…. Execs, at least the good ones, know they are dealing with a messy and uncertain world.”

Genre 5: Detective Story

This one starts out as an explainer but ends with a question. We have a mystery, the storyteller says, but we don’t know what it is. As in a traditional detective story, the audience gets all the facts—nothing’s hidden. Yet this is no game and the storyteller needs help.

Take the declining balances case that longtime TDWI instructor Dave Wells and I have used in our data storytelling class at TDWI conferences. Bank executives have come to recognize declining balances across multiple account types. Why? What can be done to reverse it? The keys are the customer stories behind this behavior, many of which weave into a bigger story. It all leads to the answers.

Genre 6: Scenarios

With scenarios, storytellers start with data and imagine a reality that may develop from it. The data sets the stage and imagination takes it from there. German data scientist Joerg Blumtritt approvingly described to me this kind of data story as “fiction.”

Fiction shouted to an audience of data people empties the room—even though many of them already create stories that are actually fiction. They are extrapolations of data to imagine future events. For example, credit scores are based on data of past behavior to predict default.

An even more sophisticated kind of fictional data story underlies scenario planning. When presented, scenarios may offer mere crumbs of the underlying data. In the 1970s, Royal Dutch Shell famously predicted several trends that competitors hadn’t foreseen. Scenario planning helped warn Shell’s leadership of the 1973 energy crisis, the late ’70s oil shock, the fall of the Soviet Union, and the rise of Islamic radicalism.

More Stories Ahead

There’s a genre for every business user and more than a few I haven’t thought of. The missing 80 percent are waiting.


Qlik finally set to leapfrog Tableau?

Who’s your rival? I carelessly asked a Qlik person at the company’s annual analyst reception Monday night in Miami if she hadn’t once worked for Tableau. Her revulsion was immediate. “No! Never!” she said.

We smiled. There was so much more to talk about. For one thing, how will private equity change things? Qlik wasn’t doing so well at the public-equity thing, you may recall, and over the last few months they went private.

Knowledgeable Qlikkers assure me with apparent sincerity that “good things” will ensue. I can think of no reason to doubt them. It must be nice to have the riffraff off your back, which one experienced business person described to me as “having ants in your pants.”

Tableau’s still public, though not quite as shiny as it was. It has that well-worn feel of a recently plush restaurant. No one notices in the mood lighting and boozy good vibes, but the cleaning crew sees it plainly enough when the bright lights go on after closing.

To be rid of ants might just set Qlik on the way to leapfrogging Tableau. Old-timers will recall that Tableau was once the upstart that caught Qlik by surprise. Now Qlik might show off what it’s learned.

We’ll see on Tuesday and Wednesday.