How GoDaddy powers its team with big data analytics

Published April 5, 2017, 2:45 p.m. EDT

GoDaddy, with its web hosting and Internet domain name registration businesses, ingests more than 13 terabytes of new, uncompressed data every day: everything from web site traffic and usage metrics to server management and customers’ e-commerce statistics.

GoDaddy’s staff uses all that data to configure products and provide client services to its 14.7 million customers, which range from major corporations to mom-and-pop shops. The company’s in-house staff also taps the data to conduct analyses of its operations in order to better meet customers’ needs.

Until a few years ago, however, when GoDaddy’s staff wanted to dig a little deeper into its operations, they often dumped data into Microsoft Excel spreadsheets. GoDaddy wanted to give them something better, according to Sharon Graves, a systems administrator at GoDaddy.

Today, the company is delivering online self-service analytics with Tableau as the primary visualization tool.


“By creating a self-service environment,” Graves says, “GoDaddy product managers and business users can leverage data to create a better customer experience, and find and design the product that will meet their needs by identifying trends and anticipating issues.”

Expanding on Big Data

GoDaddy was founded in 1997 as Jomax Technologies, and launched its first web site a year later. By 2005 the company, which was renamed GoDaddy in 1999, had 2 million customers, 10 million domain registrations and $100 million in annual revenue. The company went public in 2015, when it had reached 4 million international customers. By 2016 GoDaddy had achieved $1 billion in domain bookings and $1.8 billion in annual revenue.


GoDaddy, according to Graves, has a big data platform that includes Hadoop Hive, a data warehouse system for query and analysis; Microsoft SQL Server, MySQL and Teradata relational database management systems; Cassandra, an open-source distributed database management system; Apache Pig, a high-level dataflow language and execution framework for Hadoop; Apache Spark, a processing engine for cluster-computing environments; and third-party online analytics tools such as Google Analytics.

To arm its staff with something better than spreadsheets with which to look at its data, GoDaddy, in 2013, implemented Tableau Server, an enterprise-wide visual analytics platform, first for its business intelligence team and then for other users within the organization.

These power users, most with some query knowledge and at least some exposure to analytic measures, were the first to serve up their own, somewhat limited, analytics.

But with so much data coming into the organization every day, GoDaddy had only partial documentation on where its data originated, how it was used and how some of the calculated fields were derived.

“Our power users who may not be intimately familiar with the data didn’t know where to find the appropriate data for their analytics, or, if they did know where it was, they were not sure how to use it to meet their needs,” Graves says.

Analysts were spending much of their time trying to figure out which fields in which data sources they should use, and not enough time conducting actual, meaningful analysis.

To help its product managers, business users, data scientists and other data consumers gain better insights, GoDaddy ramped up its effort. It gave 1,400 users access to Tableau and, at the same time, deployed Alation’s Data Catalog, which reinforces data governance for self-service analytics at scale and allows users to find better insights without intervention from the technical staff. Alation uses machine-learning algorithms to automatically inventory data and enrich it with the context necessary to find, understand and trust the data, creating a single source of reference for an organization’s data. Alation runs on a Linux server and uses machine learning to profile the connected data platforms across all of GoDaddy’s data.

“By putting data in our end users’ hands, they then had the ability to quickly pull together their own base-level reporting,” Graves says. “These individuals were closest to the products/application changes and could quickly identify where something may need adjustment.”

Their feedback is relayed to the BI team for a deeper look.

“From there our data stewards can add additional details to provide a full picture of the data, its origins and its intended usages, as well as provide samples of how our power users are utilizing the data,” she says. It enables data stewards to comment on particular data sources, letting other analysts know factors such as which data source is most popular and which analyst within GoDaddy’s huge organization knows the data best, Graves says.

“Much of the metadata is captured automatically using machine learning in Alation, scanning query log files and profiling data on GoDaddy’s servers,” Graves says.
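The kind of query-log scanning Graves describes can be sketched in a few lines of Python. This is an illustrative stand-in for what a data catalog automates, not Alation’s actual implementation; the table names and queries are invented:

```python
import re
from collections import Counter

def profile_query_log(queries):
    """Count how often each table appears in FROM/JOIN clauses.

    A toy stand-in for the query-log profiling a data catalog
    automates; real SQL needs a proper parser, not a regex.
    """
    table_refs = Counter()
    pattern = re.compile(r"\b(?:FROM|JOIN)\s+([\w.]+)", re.IGNORECASE)
    for query in queries:
        for table in pattern.findall(query):
            table_refs[table.lower()] += 1
    return table_refs

log = [
    "SELECT * FROM sales.orders o JOIN sales.customers c ON o.cid = c.id",
    "SELECT cid FROM sales.orders WHERE total > 100",
]
# sales.orders is referenced in both queries, sales.customers in one
print(profile_query_log(log).most_common())
```

Aggregated over millions of logged queries, counts like these are what let a catalog surface which data sources are most popular and which analysts use them most.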

The Alation tool profiles Tableau as well, looking at the various reports and data sources within the platform. This allows users to search for existing reporting that might meet their needs. “Together, Alation and Tableau empowered GoDaddy's IT team to … catalog and tag data, to examine the lineage of a table, to search multiple data sources for a field, and to increase visibility and control.”

One example of how users are leveraging this data is monitoring and understanding how changes to web page design and workflows impact customer experience. “If we see shoppers dropping out of the process at a given point, we can go in and relook at that flow to see if there’s a better approach,” Graves says. Another example is the ability to look at email campaigns to see if shoppers are responding to them.
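The drop-off analysis Graves describes boils down to comparing visitor counts between consecutive steps of a flow. A minimal Python sketch, with hypothetical step names and counts:

```python
def funnel_dropoff(step_counts):
    """Given ordered (step_name, visitor_count) pairs, return the
    percentage of visitors lost at each transition."""
    drops = []
    for (prev_name, prev_n), (name, n) in zip(step_counts, step_counts[1:]):
        lost = 100.0 * (prev_n - n) / prev_n
        drops.append((f"{prev_name} -> {name}", round(lost, 1)))
    return drops

# Invented checkout-flow numbers for illustration
checkout = [("cart", 1000), ("shipping", 620), ("payment", 580), ("confirm", 540)]
for step, pct in funnel_dropoff(checkout):
    print(f"{step}: {pct}% drop-off")
```

Here the cart-to-shipping transition loses 38% of shoppers, far more than the later steps, which is exactly the kind of outlier that would prompt a team to relook at that part of the flow.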

The new setup is a big improvement over GoDaddy’s previous environment, which ran over multiple platforms with no clearly defined system of record. Data was replicated to multiple locations with varying business rules for processing.

With GoDaddy’s explosive growth and acquisitions over the past 20 years, including 10 acquisitions in the last three years (among them web services provider Host Europe Group), there has been a need to ingest data quickly to allow for proactive analytics. “This did not always allow for best practices to be followed,” Graves says. “We had data on multiple platforms, in various structures, containing custom business rules. Because of this, our BI [team] spent a lot of time in data management as opposed to analytics.

“We needed to centralize our data, define clear lineage and ensure our end users had appropriate access,” she says.

Last Piece of the Puzzle

Because GoDaddy’s internal users can now conduct much of their own standard reporting, the company’s BI experts have more time available to perform deep dives into the data and create further business value. For example, if there is a sudden traffic spike, they can find out whether that spike is valid, what caused it, whether there is a problem on the network or the site or whether it was simply a great day for web site traffic, Graves says.

“Our end users are producing their own monthly monitoring metrics: orders, revenue etc.,” Graves says. “Our BI analysts may be asked to look into a recent Web site traffic spike to ensure its accuracy and cause. This analysis may require joining a number of data sources to deep dive into the cause.”
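Validating whether a traffic spike is real or anomalous typically starts with a simple statistical check before any deeper joins across data sources. A minimal sketch using a z-score over daily visit counts (the numbers are invented):

```python
import statistics

def flag_spikes(daily_visits, threshold=2.0):
    """Return the indices of days whose visit count sits more than
    `threshold` population standard deviations above the series mean."""
    mean = statistics.mean(daily_visits)
    stdev = statistics.pstdev(daily_visits)
    return [i for i, v in enumerate(daily_visits)
            if stdev > 0 and (v - mean) / stdev > threshold]

# A week of hypothetical daily visit counts with one obvious spike
visits = [98, 102, 101, 99, 100, 350, 103]
print(flag_spikes(visits))
```

A flagged day is only the starting point; as Graves notes, determining the cause (a campaign, a bot, a network problem, or simply a great day) still requires joining the traffic data with other sources.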

But GoDaddy isn’t done. The company continues to look for new ways to leverage big data and analytics capabilities. “We are currently working on stabilizing our environment, ensuring best practice implementation, enforcing data management,” Graves says. “Once this is a bit more stable, we will start to look at other end points such as whether we can add some additional reporting tools and [provide] them to our external customers. We’re always looking at ways to make our customers more successful, knowing that big data plays a significant role in this effort.”


Bob Violino

Bob Violino is a freelance technology and business writer who covers a variety of topics, including big data and analytics, cloud computing, information security and mobile technology.