The word got out last year: data scientist is the “sexiest job,” a late-2012 declaration by the renowned Tom Davenport of “Competing on Analytics” fame. Trouble is, “sexy” goes bad faster than fish.
“Data scientist,” still fresh, is my word of the year. In 2013, the data analysis industry discovered it, many loved or hated it, but most of all, we repeated it. Google Trends shows mentions of it soaring like the 1990s Dow Jones Industrial Average, and you know what happens next.
Alert as data scientists are to patterns, I wonder if many don’t shudder at the “sexy” label. If so, they might have had some comfort from a discussion around the big table at the Pacific Northwest BI Summit. There, calm conversation displaced the industry’s noise around the topic for nearly two hours last summer.
Consider the onerous job description. “A data scientist has to be a statistician, has to be a storyteller, they have to be a liaison, understand visualization tools, have to understand various modeling techniques and algorithms. The list goes on and on,” said discussion co-leader and SAS vice president of best practices Jill Dyché. She also co-authored, with Davenport, the 2013 report “Big Data in Big Companies.” One person can’t master all of it.
Some of them can’t even go on a normal date. Dyché found that out when her data scientist date was too busy to share a Sambuca. Read about that experience in her now-famous blog post “Why I wouldn’t have sex with a data scientist.”
Expectations hit data scientists from all sides. “One of the biggest errors executives make,” Dyché said, “is to bring in data scientists too early, before they understand where the gaps are. You can’t model data you can’t find. You can’t discover 'unknowns' until you understand the 'knowns.' And you can’t expect someone to recommend new business actions to people who don’t want to change.”
That’s more than data science. That’s organizational politics or even social work. Most people in other jobs just make the best of it. But at current, “sexy” prices, dysfunction is expensive.
If so much of the job is knowing the domain, knowing where the data’s hidden, and knowing how to pull the levers of power, the best data scientist may be homegrown. The right organizational structure can help.
Dyché described two basic structures. The functional structure embeds data scientists with the line of business. At one large bank, for example, two data scientists in marketing devote themselves to creating marketing campaigns. The data is shared across the organization, but the processes and tools belong to marketing. The shared-service structure finds more economies of scale. At a large entertainment provider, for example, a management science and integration group works across the whole organization and knows the customer in all their faces, from shopper to fan.
Each organization shapes the role according to its own type of business and peculiar needs. Some need heavy quants while others need knowledge of a particular subject. “At the end of the day,” said Dyché, “this has to be defined by the business, not by the industry.”
“We can’t go looking for data scientists like a hammer looking for a nail,” she said. It’s smarter to just fill the organization’s voids.
Speaking of voids, what’s the rest of the gang up to? Should all the poor schlubs on lower rungs sit around waiting for insight? Discussion co-leader Simon Arkell, CEO of Predixion Software, doesn’t think so. “The system is broken if we rely on [data scientists] for end-to-end service.”
There’s value to be found down below. Though you can’t replace a data scientist’s judgment and know-how with software (you wouldn’t dare rely on it for truly important decisions), it is possible to set up self-service that moves some functions down the ladder. The data scientist then works on the really difficult, complex problems.
The solution probably starts with getting rid of the familiar method of distribution: emailed PowerPoint or Excel files. Arkell saw it, of all unlikely places, at one global consulting firm. The organization’s hundreds of data scientists model data and extract results — and then paste them into PowerPoint for distribution. “This seems so 1985,” he said, “yet this big firm does this to this day.”
Collaboration finds real leverage, though, when a domain expert can put a data scientist’s pressure-tested models into production. Let the data scientist do the modeling, and let the domain expert apply it.
It’s when Arkell talks about medical applications that I relate especially well. Almost every day, I hear a snippet from an intensely busy labor-and-delivery unit at a Bay Area hospital: the blue newborn, the C-section, the drug-addicted mother. In an emergency, there’s no time for data modelers. As Arkell has found, any data has to be expressed in immediately useful terms for in-the-moment analysis.
Sexy as the title may be, and as beneficial in glamorizing data analysis, “data scientist” has also been a distraction and annoyance. The data scientist function “has been around since before the abacus,” said Lyzasoft founder Scott Davis, who also sat at the big round table at the BI Summit. “It [was made up] by someone who is selling something, either a book or a training regimen or a system or a degree or a tool or …” Or something.
Sexy is for fish when it’s fresh, high school football stars when they’re winning, musicians when they’ve got a hit. But data analysis should be a team sport in which everyone, everywhere, takes part, from the most-skilled to the least. Data analysis makes everyone sexy.