Month: October 2010

How to analyze unfamiliar data: circle, dive, and riff

When you come face to face with unfamiliar data, how do you proceed? How do you avoid sending yourself and your shiny “speed of thought” tool slamming into a dead end? Dan Murray’s got a routine — and he’s also got certain music and right-brained books to go along.

Dan’s first rule: “Don’t pre-think.” It’s the hardest thing for people to learn, he says. “If you go into [data analysis] thinking you know where you’re going, you easily miss the granule of gold.”

He’s the chief operating officer and heavy-hitting data analyst at InterWorks, Inc., an Oklahoma-based business consultancy. What seems to me like an unending stream of mid-size businesses from all different industries has kept him running days, nights, and weekends to make sense of each one’s data and unravel old data knots.

From an airport somewhere in the South, he explains, “You have to think like a writer thinks. You don’t know where the story’s going to go.” Screenwriters and novelists often say in interviews that their characters veered off in directions the writer hadn’t anticipated.

He’s been analyzing data ever since spreadsheets first became available in the early ’80s. “I was a huge spreadsheet guy.” Now his tool of choice is Tableau.

The routine goes something like this.

First, get the big picture. Grasp the general outline. How many records do you have? What are the highest and lowest values? For example, if you’re looking at a company’s sales, how many sales were there, how many units were sold, and so on?
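For readers working outside Tableau, the big-picture pass can be sketched in a few lines of plain Python. This is only an illustration, not Dan's method: the `sales` records, their field names, and the `big_picture` helper are all made up for the example.

```python
# Hypothetical sales records; the field names are illustrative only.
sales = [
    {"product": "widget", "units": 3, "amount": 120.0},
    {"product": "gadget", "units": 1, "amount": 950.0},
    {"product": "widget", "units": 10, "amount": 400.0},
    {"product": "gizmo", "units": 2, "amount": 35.5},
]


def big_picture(records, measure):
    """Return the record count plus the lowest, highest, and total of one measure."""
    values = [record[measure] for record in records]
    return {
        "records": len(values),
        "lowest": min(values),
        "highest": max(values),
        "total": sum(values),
    }


summary = big_picture(sales, "amount")
units = big_picture(sales, "units")
print(summary)
print(units)
```

Running the same helper over each numeric column answers the "how many, how high, how low" questions before any chart is drawn.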

Look for what pops out. Trends often make themselves obvious right away.

Find groups. Build a bar chart to see how it all breaks down. If you’re looking at sales, for example, group them by product or by division.
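As a rough stand-in for that bar chart, the sketch below totals a measure per group and prints a crude text bar for each one. The `rows` data, the column positions, and both helper functions are assumptions made for the illustration.

```python
from collections import defaultdict

# Hypothetical rows of (division, product, amount).
rows = [
    ("North", "widget", 100.0),
    ("South", "gadget", 300.0),
    ("North", "gadget", 200.0),
    ("East", "widget", 50.0),
]


def totals_by(rows, key_index, value_index):
    """Sum the value column per group, largest group first."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key_index]] += row[value_index]
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


def text_bars(totals, width=40):
    """Render each group as a row of '#' scaled to the largest total."""
    top = max(totals.values())
    return [
        f"{name:<8} {'#' * round(width * value / top)} {value:.0f}"
        for name, value in totals.items()
    ]


by_division = totals_by(rows, 0, 2)
by_product = totals_by(rows, 1, 2)
bars = text_bars(by_division)
print("\n".join(bars))
```

Swapping the key column from division to product is the code equivalent of dragging a different dimension onto the chart.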

Lay out timelines. Build time series to see any long-term trends. Start simply with years, then break them down into finer detail.
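The "years first, then more detail" idea can be sketched as a rollup at two levels. The `orders` data and the `rollup` helper are hypothetical, and the sketch only covers years and months.

```python
from collections import defaultdict
from datetime import date

# Hypothetical (order date, amount) pairs.
orders = [
    (date(2009, 3, 14), 200.0),
    (date(2009, 11, 2), 150.0),
    (date(2010, 1, 20), 300.0),
    (date(2010, 1, 25), 100.0),
]


def rollup(rows, level):
    """Total the amounts per year, or per (year, month), in date order."""
    if level not in ("year", "month"):
        raise ValueError("level must be 'year' or 'month'")
    totals = defaultdict(float)
    for day, amount in rows:
        key = day.year if level == "year" else (day.year, day.month)
        totals[key] += amount
    return dict(sorted(totals.items()))


by_year = rollup(orders, "year")    # the coarse view first
by_month = rollup(orders, "month")  # then the finer breakdown
print(by_year)
print(by_month)
```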

Make maps. If the data contains locations, throw it on a map and see what clusters appear.

Go on tangents. Try making some measures into dimensions. For example, if you have a million invoices, with a range of up to a million dollars, where do most invoices fall? Try cycling through every type of chart. Remember, the cost of any view is just one click.
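One way to read "making a measure into a dimension" is to bucket a continuous number, such as invoice amount, into ranges and then count records per range. The sketch below does that with made-up amounts and arbitrarily chosen bucket edges; `bucket_label` is a hypothetical helper, not a Tableau feature.

```python
from collections import Counter


def bucket_label(amount, edges):
    """Return the label of the first bucket whose upper edge is >= amount."""
    lower = 0
    for upper in edges:
        if amount <= upper:
            return f"{lower}-{upper}"
        lower = upper
    return f"over {edges[-1]}"


# Assumed bucket edges in dollars, up to the million-dollar top of the range.
edges = [100, 1_000, 10_000, 100_000, 1_000_000]

# Hypothetical invoice amounts.
amounts = [45, 80, 120, 999, 1_000, 5_400, 250_000, 75]

counts = Counter(bucket_label(amount, edges) for amount in amounts)
for label, count in counts.items():
    print(f"{label:>16}: {count}")
```

Once the amounts are labels rather than numbers, they can be grouped and charted like any other category, which shows where most invoices fall.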

Look into outliers. Outliers may be just bad data, or they may be interesting. A good place to find them is in scatterplots. “Most of my interesting discoveries are in scatterplots,” says Dan. Seemingly unrelated numbers sometimes have some kind of interesting correlation.
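Dan finds his outliers by eye in scatterplots. As a numeric complement, and not something the interview describes, a common rule of thumb flags values more than 1.5 interquartile ranges outside the middle half of the data. The sample values and the `iqr_outliers` helper below are assumptions for the sketch.

```python
import statistics


def iqr_outliers(values, factor=1.5):
    """Return the values lying beyond factor * IQR from the quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low_fence = q1 - factor * spread
    high_fence = q3 + factor * spread
    return [value for value in values if value < low_fence or value > high_fence]


# Hypothetical daily order counts with one suspicious spike.
daily_orders = [10, 12, 11, 13, 12, 11, 95, 10]

outliers = iqr_outliers(daily_orders)
print(outliers)
```

Whether a flagged value is bad data or a real discovery still takes a human look at the underlying record.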

Combine. Put all the charts done so far into one dashboard. Filter all the views based on [things I highlight]. There you can see it all at once. Brains don’t remember more than one or two things at one time, but here you see it all together.

Repeat. Good tools make false steps easy to back out of.

Keep an open mind. He plays music, often the piano music of Frank Kimbrough, such as “The Spins.” He emails, “The lyrical and circular notions of this song reflect how I do analysis. He circles, he dives, he riffs, and then he comes back and does it again in a slightly different way.”

Present and persuade. Jazz, right-brain thinking, motivation, surprise: it all leads to discoveries, and those discoveries have no value until they are communicated persuasively. Dan recommends the two books by Dan and Chip Heath, Made to Stick and Switch.

Three hours of analysis will show you plenty. “You’ll know just as much as the insiders know.”

Do you have a routine for analyzing unfamiliar data? I’d especially like to hear from users of many different tools, from the most advanced to pencil-and-paper. Please introduce yourself here.