Do data and stories make an odd couple?

Who’s the boss around here? Is it data? Or is it stories? I say stories. But a friend and fellow writer in the data industry sees special status for data. It’s not just another kind of fact that humanity has used as ready fodder for stories, he seems to say. It is like a compass, pointing us to truth, just truth.

We spent a significant part of a Sunday afternoon — him in Nashville and me near Berkeley, and sunny in both places — slinging text messages back and forth. Then it began again on Monday.

Steve Swoyer worries, in short, that stories demand drama of data that data may not be able to provide with integrity. “If we are interpreting data and using data to tell a story with it,” he texted, “the story will not comport with the conventions that we require from our stories.”

I replied, “Who said data storytelling has to conform to traditional forms? Every new medium twists storytelling its own way … I think you’re defining storytelling too narrowly.”

“Bah,” he replied. “I’m talking about what we expect from stories. Do we intentionally tell uninteresting stories? More to the point, how will storytelling be used by folks who aren’t careful how they analyze and interpret information?”

He continued, “If I’m thinking of it too narrowly, I’d encourage you to see it as a problem. Why do we like to tell stories? Why do we like to hear stories? And wouldn’t we prefer to hear some kinds of stories as against others? There’s the problem right there.”

I protested, “But we’re still human. We always look for meaning.” We’re going to hear meaning in any phenomenon, whether it’s from the storyteller, from someone else, or from our own fertile little minds. We will always find or invent a story to explain anything.

I do see a problem. It’s in the belief that data is produced in a virginal birth. Data is, after all, a human creation that springs not from some divine source but from everyday mortals who tell each other stories about the world. From those stories come questions. Questions beget measurement, which begets data.

He offered what sounded like a solution. “We need to learn a new kind of storytelling, a probabilistic storytelling,” which is “indifferent to the conventions of traditional human storytelling … Without controlling for the human appetite for drama, etc, we can bend and break data into unhelpful ways.” The next day, he praised Nate Silver for treating data sensitively in his New York Times analysis of polls during the 2012 campaign.

He urged me to reread his article from last June. In “Storytelling Reconsidered,” published in the Radiant Advisors publication Rediscovering BI, his point finally scored a hit.

In sales- or marketing-speak, to tell a story is to make a pitch. Life doesn’t so much make pitches as throw pitches … Some of these pitches – like Cliff Lee’s baffling curveball – can be hard to spot. They’re breaking balls, with lots of late movement, as distinct to big fat fastballs, fired right down the middle of the plate…

Which leads him to Sabermetrics, “a kind of storytelling.”

[It] doesn’t simply explain but is also able to predict the performance of pitchers…We must develop and hone a new kind of business storytelling: a statistics-informed storytelling. This isn’t going to be easy … Business, like life itself, has an infinite set of parameters, not all of which are well understood, and some of which have yet to be identified. But the success of Sabermetrics … provides an encouraging example.

Well, why didn’t he say so? I’ll settle for that “kind of storytelling.” As I said, data storytelling isn’t necessarily a kind of storytelling we’ve seen before. But it’s still storytelling.


  1. I can’t think of anyone better to write about this issue, Ted. Sure, I’m a friend, and, sure, I have an interest in promoting you, especially inasmuch as you owe me a couple of six-shot cappuccinos, or a few bourbon six-shooters, or a six-shot combo bourboccino, which I’d really like to collect on one day – [Aside: as poet T.S. Eliot might put it: “HURRY UP PLEASE IT’S TIME”] – but you grok the problem (or the problem field) as few in the industry do.

    I’ll be interested in seeing where you go with/how you develop this. I say this somewhat tongue-in-cheekedly but also somewhat seriously: this storytelling thing could become for you what the field of “Hitler Studies” ultimately became for protagonist Jack Gladney in Don DeLillo’s **White Noise**.

    And you won’t even have to learn German to master it!

    That said, I want to expand (philosophically) on your invocation/equivalence of truth and data in the first graf. The term “data” is for me a fraught concept. I tend to distinguish between what we mean by “data” and what we mean by something like “information” in that “data” is, explicitly, *a problem*; “data” is that which has become problematized; e.g., it’s “information” *that’s of interest to us*. Put another way, data is structure *that we identify-in/impute-to* a manifold of noise, it’s signal that’s been identified-selected-reduced, and (this, the important step) *appropriated* – in this case (with respect to analysis) as schema. To speak of “data” is to assume appropriation and representation: to speak of “data” is always to imply schematization. This is a problem, as I see it, because the world of events … is indifferent to, or happens without regard for, schema. We might be able to *impute* schema – as we do when we develop ontologies or taxonomies or classifications – and, for all intents and purposes, the schemata we identify/impute might correspond to the events/phenomena of the world, but *it doesn’t have to*. There’s no implicit or inherent necessity to our ontologies, taxonomies, classifications, morphologies, etc.

    When we’re using data to tell a story, then, we’re working with a pre-determined structure: with a canned view or representation (a model) of the world. We must be mindful of this, even as we’re mindful of (and try to control for) the tensions or tendencies or temptations to which the conventions of storytelling … tether us. And if you want to read in what I’ve said an absolute rejection of absolute truth, then, yeah, have at it. Because (as with Francesco Rinaldi) it’s in there, too. I am by no means an acolyte of philosopher Stanley Rosen, but a line from one of his texts (**The Ancients and the Moderns: Rethinking Modernity**) has stayed with me lo these last 20 years:

    “There is … a kind of Goedel’s Theorem in human affairs. Every attempt to systematize life or to govern it by a set of axioms rich enough to encompass the totality of experience leads to a contradiction.”

    So there’s that.

  2. Thanks for this article Ted.

    I agree with both of you. The reality in most organizations is that they can’t even understand what has happened and is currently happening. Adding a probabilistic approach is ideal, but this can’t happen until you have the data about the process and the process is understood on a historic basis.

    Sports analytics, as in baseball, is a very well-understood topic that is often studied by experts with lots of references. Additionally, the complexity of the data for sports problems is often much lower than the complexity of data in many business and government organizations that we deal with (data from 5-20 sources, of which only small parts are well-managed, much less managed for ease of analysis).

    So, simple story-telling is often desperately lacking in the business world. We must enable teams to have this capability before worrying about more sophisticated, smarter story-telling that would also add significant value.

    I would note that well-understood, mission-critical processes such as customer acquisition and pipeline analyses are areas where both simple and advanced storytelling are most likely to thrive and appear in the near future.

