How Slack uses big data to grow its business
Famously born from the ashes of a failed video game venture, Slack launched its enterprise collaboration tool in 2014. By the end of that year, it was deemed the fastest-growing startup ever. Today, Slack has 6.8 million weekly active users (5 million daily), of whom more than 1.5 million pay to use the messaging service.
One of the keys to Slack’s growth has been its focus on three critical areas:
- Making sure that the first adopters of Slack in an organization succeed and expand the tool’s presence within that enterprise.
- Motivating users of the free service to upgrade to the paid version.
- Ensuring that the collaboration platform remains useful to its biggest customers, including Capital One, IBM and NBC, where Slack deployments run into the thousands of users.
The way the company completes those tasks is by identifying the behaviors of successful customers and then coming up with ways to encourage the same behaviors in others. And behind that effort is the company’s cloud-based, big data platform and a team of highly skilled data engineers.
Slack hired its first data engineer in the summer of 2015, just as platform uptake was exploding, and Josh Wills, Slack’s head of data engineering, joined in the fall. Slack’s current data engineering team numbers about a dozen. There’s a similar headcount on the machine learning team and around 20 data scientists on its analytics team.
Those data experts sort through user behaviors to find the patterns most likely to result in greater usage and upgrades to paid services. “From a data perspective,” says Wills, “it's simply about providing the infrastructure that allows both of those very disparate use cases to be optimized — to be done better and better using data.”
That data includes tracking how many new adopters of the free version of Slack upgrade to the paid version, the performance impact of a new release or feature, and the first-24-hour behaviors of customers using Slack for the first time — such as how many people are invited to join, how many channels, or chatrooms, are created for various teams, and how many files are uploaded.
The team also looks at platform issues, trying to figure out how users decide to integrate other applications with Slack, and how they engage with the automated messages and tips posted to their channels by bots. And in terms of helping users get more out of Slack, the company has a dedicated Learning and Intelligence team that looks at which channels users interact with most, the order in which they read backlogged notifications and how they interact with search results.
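The kind of first-24-hour tracking described above can be sketched as a simple aggregation over event logs. This is a minimal illustration in pure Python; the event record fields (`team`, `type`, `ts`) and event names are assumptions for the example, not Slack's actual schema.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical event records; field names are illustrative, not Slack's real schema.
EVENTS = [
    {"team": "acme", "type": "invite_sent", "ts": "2018-03-01T09:15:00"},
    {"team": "acme", "type": "channel_created", "ts": "2018-03-01T10:02:00"},
    {"team": "acme", "type": "file_uploaded", "ts": "2018-03-01T11:30:00"},
    {"team": "acme", "type": "file_uploaded", "ts": "2018-03-02T12:00:00"},  # outside first 24h
]

def first_day_metrics(events, team, signup_ts):
    """Count key activation events in a team's first 24 hours after signup."""
    cutoff = signup_ts + timedelta(hours=24)
    counts = Counter(
        e["type"]
        for e in events
        if e["team"] == team and signup_ts <= datetime.fromisoformat(e["ts"]) < cutoff
    )
    return {k: counts.get(k, 0) for k in ("invite_sent", "channel_created", "file_uploaded")}

print(first_day_metrics(EVENTS, "acme", datetime(2018, 3, 1, 9, 0)))
```

In a real pipeline these counts would be computed at scale by a query engine rather than in application code, but the metric definition itself stays this simple: filter to the activation window, then count by event type.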
Slack uses machine learning models to improve how it integrates data in its production system, and to learn from the way people use Slack. Learning from its users helps Slack improve user productivity by delivering more intelligent recommendations and better search results.
Big data feed
Most of Slack’s data comes from customers’ actual use of its collaboration platform, rather than from a disparate array of internal systems and external sources. Slack fields tens of billions of database queries from users per day, Wills says.
“There are other small data sources we use for standard things like geolocation, firmographic data, that kind of thing,” Wills notes, “but the vast majority of our data is from our platform.”
To harness its information resources, Slack relies on the Hadoop big data framework. Its data is housed in Amazon’s cloud-based S3 storage service. Information from Slack’s application, servers and clients is routed to the S3 data warehouse by Apache Kafka, an open-source stream-processing platform that handles real-time data feeds. Kafka, originally created at LinkedIn, is ubiquitous among San Francisco tech companies, Wills notes.
When data arrives at the S3 hub, an ETL (extract, transform, and load) process converts it from raw log format into Parquet, a columnar data format optimized for fast queries. Airflow, a workflow scheduler originally developed at Airbnb, manages the pipeline.
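The heart of that ETL step is the row-to-column pivot: raw logs arrive one record per line, while Parquet stores each field as a contiguous column so queries can read only the fields they need. A minimal sketch of that transform, using made-up log fields and plain Python dictionaries in place of a real Parquet writer:

```python
import json

# Hypothetical raw log lines; field names are illustrative.
RAW_LOGS = [
    '{"user": "u1", "action": "message_sent", "channel": "general"}',
    '{"user": "u2", "action": "file_uploaded", "channel": "design"}',
]

def to_columnar(raw_lines, columns):
    """Parse raw JSON log lines and pivot them into a column-oriented layout —
    the same row-to-column transform a Parquet writer performs internally."""
    rows = [json.loads(line) for line in raw_lines]
    return {col: [row.get(col) for row in rows] for col in columns}

table = to_columnar(RAW_LOGS, ["user", "action", "channel"])
print(table["action"])  # ['message_sent', 'file_uploaded']
```

A query that only needs the `action` column can now scan one list instead of deserializing every full record, which is why columnar formats are so much faster for analytics.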
Aiming to use the best tool for each task, Slack relies on three primary querying tools: Hive, Spark and Presto. Apache’s Hive framework is best for working with larger datasets. The faster, more robust Apache Spark framework is used for machine learning and for any ETL pipeline involving particularly complex data formats. Presto, a distributed SQL query engine, is used to quickly answer ad-hoc questions, work with smaller datasets and create visualizations.
“On top of Presto, we have an internal dashboarding system we wrote called XY,” Wills adds. “It works very similarly to tools like Mode or Periscope, where you enter a SQL query and then you turn the results into a chart. And that's really what it does: SQL query, data, chart, dashboard.”
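That "SQL query, data, chart" loop can be sketched end to end in a few lines. Here sqlite3 stands in for the Presto-backed warehouse and a text bar chart stands in for the dashboard rendering; the table and column names are invented for the example.

```python
import sqlite3

# In-memory stand-in for the data warehouse; schema is made up for illustration.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE signups (day TEXT, teams INTEGER)")
con.executemany("INSERT INTO signups VALUES (?, ?)",
                [("Mon", 12), ("Tue", 18), ("Wed", 9)])

# Step 1: SQL query -> data
rows = con.execute("SELECT day, teams FROM signups ORDER BY day").fetchall()

# Step 2: data -> chart (a text bar chart standing in for the dashboard widget)
for day, teams in rows:
    print(f"{day} {'#' * teams} {teams}")
```

Tools like Mode and Periscope (and, per Wills, Slack's internal XY) wrap exactly this loop in a shareable UI: the analyst writes the query, the system handles execution, charting and dashboard placement.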
And the tool Slack uses to disseminate data?
“As you can imagine,” Wills says, “everything here is done through Slack. Everything.”
For Wills, the big task around big data breaks down to a simple equation: “I think the challenge that any company has in doing data right now—and I've felt this acutely for a long time—is time versus money,” he says. “At any given moment, I’m either wasting money because I’ve over-provisioned my Hadoop cluster, my data warehousing infrastructure, whatever systems I have, to have extra capacity online in case there's a burst of events…. And then the rest of the time, my clusters are maxed out.”
So either the company is paying for resources it’s not using, or its thriftiness becomes a bottleneck. Part of the solution, Wills says, is better tools, such as Dataflow, a programming model for batch and stream data processing, and Amazon's Athena, a querying tool that works directly with S3.
In meeting two of Slack’s big business goals (driving adoption and paid conversions, and making the platform more effective for large teams), Wills says the big data effort has paid off.
On the first challenge, the company dove into the traits of successful and failed Slack deployments to learn what works best when someone introduces the platform into their organization. Key learnings include getting new users to quickly upload content—a file, a presentation, a link—to foster interaction on the platform. Also, more successful users are those who quickly adopt the mobile app and fill out their Slack profiles, the data showed.
These and other learnings have been incorporated into Slack’s design and onboarding messages to encourage these high-success behaviors. Wills says the team has helped improve metrics around new users, creation of new teams and integrations with other applications. His team uses A/B testing to identify potential improvements and measure their impact.
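The A/B testing mentioned above typically comes down to comparing a conversion rate between a control and a variant and asking whether the difference could be chance. A minimal sketch using a two-proportion z-test in pure Python; the sample numbers are invented, and Slack's actual experimentation tooling is not described in this article.

```python
from math import sqrt, erf

def conversion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test for an A/B experiment (e.g., onboarding variants).
    Returns the absolute lift of B over A and a two-sided p-value."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under the null hypothesis
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # normal-tail probability
    return p_b - p_a, p_value

# Hypothetical numbers: 120/1000 control conversions vs. 150/1000 for the variant.
lift, p = conversion_z_test(120, 1000, 150, 1000)
print(f"lift={lift:.3f} p={p:.4f}")
```

With these toy numbers the 3-point lift comes out just barely significant at the conventional 5 percent level, which is exactly the kind of borderline result that makes disciplined sample-size planning matter in practice.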
Wills offers a few examples of data-driven improvements, one which leverages the automated chatbot helper, Slackbot: “When you create a new Slack team, Slackbot offers suggestions for ways you can make the team successful based on the behaviors we’ve seen on other new successful teams,” he says. “We also experiment with free credits that enable the paid features of the platform for teams so they can try out the more advanced functionality to see how useful it can be.”
Data also helps Slack maintain its commitment to making rapid improvements to the platform. Perceiving a need for a media player option that would let users play shared media files right in the Slack application, rather than having to download them to a desktop, the data team looked at what kind of files were most shared (mp4 video and mp3 audio dominated) and planned the quickest way to add such a feature in a way that would be most useful to a significant number of users. Similar efforts helped prioritize and shape a PDF preview feature within the platform.
On the question of better serving enterprise-scale teams, there have been significant learnings, too. If early success is dependent on getting people more involved, large-scale success involves helping people step back from the noise generated by hundreds or thousands of users across scores of Slack channels. For example, Slack looked at usage patterns and started recommending relevant team channels to specific users. With these suggestions, users aren’t overwhelmed by communications from less integral teams within a large organization. The resulting channel suggestions feature was popular from the start: Initial testing with 10 percent of teams drew a 22 percent click-through rate for the recommendations—a successful start that is continually refined.
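One simple way to recommend channels from usage patterns, as described above, is co-occurrence: suggest channels that are popular among people who already share channels with you. This is a toy sketch of that idea, not Slack's actual model, and the membership data is invented.

```python
from collections import Counter

# Hypothetical membership data: user -> channels they are active in.
MEMBERS = {
    "ann": {"general", "design", "launch"},
    "bob": {"general", "design"},
    "cat": {"general", "launch"},
}

def suggest_channels(user, members, top_n=2):
    """Recommend channels popular among users who share channels with `user` —
    a minimal co-occurrence heuristic."""
    mine = members[user]
    scores = Counter()
    for other, channels in members.items():
        if other == user or not (mine & channels):
            continue  # skip the user themselves and users with no overlap
        for ch in channels - mine:
            scores[ch] += 1
    return [ch for ch, _ in scores.most_common(top_n)]

print(suggest_channels("bob", MEMBERS))  # ['launch'] under this toy data
```

A production system would weight overlaps by activity and recency rather than raw counts, but the shape of the problem (score candidate channels by how strongly they co-occur with a user's existing ones) is the same.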
Cutting through the noise is a big challenge, Wills says — not only in terms of finding relevant groups to communicate with, but also in digging out the information needed right now, information that is only half-remembered from a week-old conversation thread.
“And so, one of our most recent efforts was dramatically improving the quality of our search results.” Results can be toggled by recency or relevancy, but the machine learning algorithms have added a “top hits” feature to better balance those two extremes. The machine learning further refines what those top hits might be based on usage, to more quickly and reliably surface relevant information. Specifically, Slack has seen a 9 percent increase in searches that resulted in clicks, and a 27 percent hike in clicks on the top search result—both indicating greater accuracy and usefulness of search results for users.
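A "top hits" ranking that balances recency and relevance can be sketched as a weighted blend of a text-relevance score and an exponential recency decay. The half-life and weighting here are assumptions for illustration; the article does not describe Slack's actual scoring function.

```python
from math import exp

def top_hits_score(relevance, age_days, half_life_days=7.0, weight=0.5):
    """Blend a text-relevance score with an exponential recency decay —
    one simple way to sit between 'sort by relevance' and 'sort by recency'."""
    recency = exp(-age_days * 0.6931 / half_life_days)  # halves every half_life_days
    return weight * relevance + (1 - weight) * recency

# Hypothetical results: (label, relevance score, age in days)
docs = [("old exact match", 0.9, 30), ("fresh partial match", 0.6, 1)]
ranked = sorted(docs, key=lambda d: top_hits_score(d[1], d[2]), reverse=True)
print([name for name, _, _ in ranked])
```

With these weights, a day-old partial match outranks a month-old exact match; tuning `weight` slides the ranking between the two extremes users can already toggle between. In practice such weights would themselves be learned from click data, which is what the usage-based refinement described above amounts to.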
As for where Wills and the big data team are headed, he has a somewhat surprising answer to Slack’s overall challenge of making knowledge workers more productive.
“It's somewhat counter-intuitive, but one of the most important things for us to do is to find ways for people to use our product less,” he says. That means, he says, that Slack would remain a robust communication platform, but one that people can use with maximum efficiency, then get back to work.
Sounds simple, but he admits that he may be biting off more than anyone can chew. “Nobody really understands productivity. Economists have been struggling with this for decades,” he says. “And so, the big hairy audacious goal for me is to really, really understand productivity, really understand how all of this software, how all of these tools, can make us more productive.”