CATEGORY: Data Acquisition, Transformation and Replication

REVIEWER: Henri Asseily, CTO and co-founder of

BACKGROUND: Founded in 1996 and based in Los Angeles, the company is the number one retail mall on the Web (Media Metrix, September 2000), connecting millions of buyers to thousands of sellers in a single, organized location. Designed as a comparison-shopping hub, the site aims to make shopping efficient for both buyers and sellers. Through the company's unbiased consumer rating system, compiled exclusively from point-of-sale and fulfillment surveys of participating merchants, shoppers can compare merchants across 10 service dimensions.

PLATFORMS: Application console running on a Microsoft Windows NT server, being upgraded to a Sun Solaris server. Product used on NT/SQL Server, Solaris/Sybase, Solaris/Oracle, Linux/Sybase and Linux/Oracle.

PROBLEM SOLVED: As the company's business has expanded at a 200 percent annual growth rate over the past four years, so has its need for a robust data warehousing solution to support its business processes. During 1999 alone, the volume of raw data processed grew twentyfold. We needed to build a data management solution that could convert survey information from online shoppers into valuable market research data and deliver those results to online merchants, all at Web speed. To meet this challenge, we chose DataStage and went to work designing an information architecture, developing a meta data plan for our processes and creating a software architecture plan for the extraction, transformation, loading and relating of jobs that had previously been hand coded. With DataStage in place, our turnaround time has improved: mission-critical jobs that once took eight hours and required writing a Perl script now take only twenty minutes.

PRODUCT FUNCTIONALITY: Today DataStage runs all of our data movement processes and has replaced all of the older Perl scripts and stored procedures used in the past. DataStage has been highly successful at helping us efficiently run over 600 different jobs that support our business operations. Before DataStage, we ran a daily batch process that took all night and part of the next day. If there was a breakdown in the process, it was difficult to catch up during the day. And, if those jobs did not run, we could not update the merchant ratings on our site and push information out to our partners via our extranet. DataStage has ensured that we can meet our data processing requirements on time.

STRENGTHS: We are very pleased with the meta data management capabilities, the flexibility and the power of DataStage. As expected, DataStage is database-agnostic: we run it successfully against MS SQL Server, Sybase and Oracle on NT, Solaris and Linux. This extraction, transformation and loading (ETL) solution integrates very well within a DBA's toolbox.

WEAKNESSES: DataStage is lacking in its automated error-handling and recovery features. It provides no way to automatically restart jobs, time out zombie jobs or kill locking processes in a 24x7 unattended environment. However, on the operator level, all these problems can be easily resolved through the graphical user interface (GUI).
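
To make the missing features concrete, the following is a minimal sketch of the kind of external watchdog an unattended environment might wrap around a batch job: it kills a job that exceeds a time limit and restarts it a limited number of times. It is not part of DataStage and uses no DataStage interface; the function name `run_with_watchdog` and its parameters are hypothetical, and the job is assumed to be launchable as an ordinary command line.

```python
import subprocess
import time
from typing import Sequence


def run_with_watchdog(
    command: Sequence[str],
    timeout_seconds: float,
    max_attempts: int = 3,
    retry_delay_seconds: float = 0.0,
) -> int:
    """Run `command`, killing it on timeout and retrying on failure.

    Hypothetical helper for illustration only.
    Returns the 1-based number of the attempt that succeeded.
    Raises RuntimeError if every attempt fails or times out.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            # subprocess.run kills the child process if the timeout expires,
            # which covers the "zombie job" case.
            result = subprocess.run(command, timeout=timeout_seconds)
        except subprocess.TimeoutExpired:
            print(f"attempt {attempt}: timed out after {timeout_seconds}s")
        else:
            if result.returncode == 0:
                return attempt
            print(f"attempt {attempt}: exit code {result.returncode}")
        if attempt < max_attempts:
            # Pause before the automatic restart.
            time.sleep(retry_delay_seconds)
    raise RuntimeError(f"job failed after {max_attempts} attempts")
```

A wrapper like this handles only timeouts and restarts; releasing database locks held by a killed job would still need database-specific handling.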

SELECTION CRITERIA: Because the company built its reputation on delivering product data in a timely fashion, it was crucial that we build a data management architecture that supported our need to rapidly process enormous amounts of data on a nightly basis without sacrificing the integrity of our ratings. DataStage provided us with the speed and accuracy we were looking for, and its scalability allowed us to store, analyze and compile the millions of records we receive daily into reports that are sent to online merchants. We also wanted to have our data management architecture in place as soon as possible. We were looking for an off-the-shelf solution that would integrate easily into our system, with a short implementation cycle, that was easy to learn yet powerful enough to handle our data volumes. DataStage allows us to do low-level programming as well as high-level graphical prototyping, without sacrificing flexibility.

DELIVERABLES: On the input side, the GUI is very user-friendly. On the output side, DataStage has solid reporting and logging features that help to diagnose problems quickly.

VENDOR SUPPORT: From the beginning, we worked very closely with the DataStage product team, which listened to our business requirements and worked to ensure that they were met. This has been a satisfying relationship, and we expect the partnership to continue as the company completes its implementation of Axielle.

DOCUMENTATION: All of the written documentation, both printed and online, is very well written and easy to understand.
