AUG 26, 2008 2:51pm ET

The Velocity of eBay

SAN DIEGO - Of all the lessons and presentations offered at last week’s well-attended conference held by The Data Warehousing Institute, none generated more buzz than TDWI’s Executive Summit session with Oliver Ratzesberger, senior director, Architecture & Operations at eBay Inc.

What Ratzesberger offered the audience was a glimpse of the future, reflected through the internal requirements of eBay, a corporation that does not allow six-, 12- or 18-month data projects or dwell excessively on what has gone before. The policy is dictated purely by velocity at the massive online reseller: an automobile sold every minute, a diamond ring sold every two minutes, more than three watches and five women’s handbags sold every minute, on and on across 50,000 categories of goods.

As data architects know, there’s scale, and then there’s scale. At eBay, there is yet another dimension. No fewer than 5,000 business users and analysts turn over a terabyte of data every eight seconds. eBay takes in 40 terabytes of new incremental data every day and processes 25 petabytes in the same 24 hours. This happens all day, every day, across millions of queries parsed in near real time with more than 99.9 percent availability.

This necessarily precludes some traditional data warehouse practices (which we’ll get to), yet eBay is most certainly an analytics-driven business, considering that it hosts approximately 113 million listings worldwide at any given time and adds 6.7 million per day.

“Analytics are in our DNA from the bottom-up and from the top down,” Ratzesberger told the audience. So are KPIs, which he described as the metrics used to measure teams and individuals and tied to compensation.

Fair enough, but at eBay, KPIs also roll up into trees and bigger trees full of subtrees that track organizational metrics such as visitors, engagement, and buyer and seller retention, as well as the various formats available to auctioneers. They are meant to align individual and departmental performance objectives with corporate goals. The comprehensiveness of this would certainly be a topic all by itself.

Consider eBay’s technology operations, where a specific KPI measures the efficiency of distributing large workloads over pools of tens of thousands of servers. A simplified KPI of parallel efficiency states that, while 100 percent efficiency is good, less than 70 percent efficiency is bad. By raising parallel efficiency from 50 percent to 80 percent, eBay can realize millions of dollars saved in operational spending.
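The talk did not spell out how eBay computes this KPI. As a rough illustration only, the sketch below uses the textbook definition of parallel efficiency (speedup divided by the number of workers) and shows why moving from 50 percent to 80 percent matters. The function names and the 10,000-server workload are hypothetical and are not eBay figures.

```python
def parallel_efficiency(serial_seconds: float, parallel_seconds: float, workers: int) -> float:
    """Textbook parallel efficiency: T_serial / (workers * T_parallel).

    This is an assumed definition, not eBay's published formula.
    """
    if workers <= 0 or parallel_seconds <= 0:
        raise ValueError("workers and parallel_seconds must be positive")
    return serial_seconds / (workers * parallel_seconds)


def servers_needed(ideal_servers: float, efficiency: float) -> float:
    """Servers required for a workload that would fit on `ideal_servers`
    machines if every machine were used at 100 percent efficiency."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return ideal_servers / efficiency


if __name__ == "__main__":
    # Hypothetical workload sized at 10,000 perfectly efficient servers.
    ideal = 10_000
    for eff in (0.5, 0.8):
        print(f"{eff:.0%} efficiency -> about {servers_needed(ideal, eff):,.0f} servers")
```

Under these assumptions, the same workload needs roughly 20,000 servers at 50 percent efficiency but about 12,500 at 80 percent, which is the kind of gap that translates into operational savings.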

 

Beyond standard measures, Ratzesberger offered up a dozen kinds of analytics as just a sample of an attitude that is open to measuring pretty much everything possible. Eighty-five percent of eBay’s analytical workload is new and unknown, meaning that exploration is at the core of its delivery philosophy.

 

“The metrics you know are cheap,” he said. “The metrics you don’t know are expensive but also high in potential ROI.” As a result, design can’t be static or dependent on specific questions or dimensions. It calls for a decentralized model that doesn’t hinge on project TCO or on the multiple databases, inconsistencies, complexities and redundancies of data marts. At eBay, says Ratzesberger, “A data mart cannot be cheap enough to justify its existence.”

Take that … uh, everybody. eBay thinks differently, if for no other reason than that it must, and it has summoned the means to do so.

The alternative lies in massive-scale analytical utility computing across thousands of boxes, where users and analysts bring their own data and perform their own analytics in a prototyping environment or “sandbox” accessed through a Web portal.

The goals include improved time to market (days vs. months) through quick and agile prototyping that allows users to “fail fast” and makes it easy to try new ideas without the burden of long timelines and dedicated programs. Even the mantra of data quality does not precede inquiry at eBay, since bad data is assumed and can be dealt with as the project is delivered. It is an attitude that seems to say “let’s move on,” which is the general impression I got of life at eBay.
