BACKGROUND: Carfax is the nation's leading online information resource for buyers and sellers of used cars. The Carfax Vehicle History Service is the only comprehensive nationwide title search database and contains over 1.2 billion records on more than 300 million vehicles. Simply by supplying the 17-character vehicle identification number (VIN), consumers can pull up a detailed vehicle history report via the Web on virtually any used car or light truck in seconds. A Carfax report can uncover odometer fraud as well as hidden problems such as junk, salvage, flood damage and manufacturer buyback titles that have been "washed" from the vehicle's paperwork. Consumers may also receive the assurance of a clean title history guarantee.

PLATFORMS: Data Junction 7.0 and the embeddable DJEngine are running on Windows NT 4.0.

PROBLEM SOLVED: Carfax's comprehensive database contains more than a billion records and continues to grow rapidly with the regular addition of new information sources throughout the United States and Canada. Data submissions are sent regularly via multiple protocols from more than 150 sources including motor vehicle departments, emissions inspection stations, auto auctions, fire and police departments, rental/fleet vehicle companies and extended warranty companies. Files are submitted in a wide variety of disparate formats and sometimes contain multiple tables or hierarchical structures requiring many-to-one or one-to-many transformations, validation and cleansing before they can be integrated into the Carfax Web application. We needed a robust integration tool that would enable us to accept all these files in their native formats and automatically transform and cleanse the data for regular updating of our online vehicle history database. Prior to implementing Data Junction, the addition of each new source required a month of coding. With Data Junction, our IT staff creates complete transformation routines in just a few hours.
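
To make the one-to-many case concrete, here is a minimal Python sketch of flattening a hierarchical vehicle record into one row per event, with basic cleansing and a VIN plausibility check. It is an illustration only and not Data Junction's expression language or our production logic; the field names (vin, events, date, type) and both function names are hypothetical.

```python
from typing import Iterator

VIN_LENGTH = 17
# Standard 17-character VINs do not use the letters I, O or Q.
INVALID_VIN_CHARS = set("IOQ")


def is_plausible_vin(vin: str) -> bool:
    """Cheap structural check; it does not verify the VIN check digit."""
    return (
        len(vin) == VIN_LENGTH
        and vin.isascii()
        and vin.isalnum()
        and not (set(vin) & INVALID_VIN_CHARS)
    )


def flatten_vehicle(record: dict) -> Iterator[dict]:
    """Turn one nested vehicle record into one flat row per event.

    The input layout (a "vin" plus a list of "events") is assumed
    for illustration; real source files vary widely.
    """
    vin = record.get("vin", "").strip().upper()  # cleanse
    if not is_plausible_vin(vin):                # validate
        return                                   # reject the whole record
    for event in record.get("events", []):      # one-to-many transform
        yield {
            "vin": vin,
            "date": event.get("date", "").strip(),
            "type": event.get("type", "").strip().lower(),
        }
```

A record with two events yields two rows that share the same cleansed VIN, while a record whose VIN fails the check yields nothing.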

PRODUCT FUNCTIONALITY: Data Junction has greatly simplified and accelerated the cleansing, validation and transformation of large amounts of critical data. We process, on average, between 48 and 62 million records a month. These source records may range from 512 bytes with 20 to 30 fields of information to 4,000 bytes containing 300 to 400 fields of information, and files can be as large as a gigabyte. Our data providers submit files in a variety of formats including Goldmine, ASCII, Access, dBASE, SQL Server, Oracle, SAS and others. With the arrival of a new source file, our data integration staff employ Data Junction to quickly analyze the data and reconcile differences in structure and format between the source and our online database. A script was developed around DJEngine to automate the execution of these user-designed transformations. When source data arrives, DJEngine automatically begins to restructure, cleanse and validate data that is then passed on to Carfax's Web database.
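
The automation pattern described above, in which each source has its own saved transformation and arriving data is routed to it, can be sketched in Python as follows. This is a generic illustration under stated assumptions and does not use or represent the DJEngine interface; the source name "auction_csv", the column names VIN and MILES, and every function name here are hypothetical.

```python
import csv
import io
from typing import Callable, Dict, List

Transform = Callable[[str], List[dict]]

# Registry of per-source transformations (hypothetical; stands in for
# the saved, user-designed conversions that the script executes).
TRANSFORMS: Dict[str, Transform] = {}


def register(source: str) -> Callable[[Transform], Transform]:
    """Decorator that records a transformation under a source name."""
    def decorator(func: Transform) -> Transform:
        TRANSFORMS[source] = func
        return func
    return decorator


@register("auction_csv")
def auction_csv(text: str) -> List[dict]:
    """Map an assumed auction CSV layout onto the target field names."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"vin": row["VIN"].strip().upper(), "odometer": int(row["MILES"])}
        for row in reader
    ]


def process_arrival(source: str, text: str) -> List[dict]:
    """Route newly arrived data to the transformation for its source."""
    try:
        transform = TRANSFORMS[source]
    except KeyError:
        raise ValueError(f"no transformation registered for {source!r}") from None
    return transform(text)
```

Adding a new source then means registering one more transformation rather than changing the dispatch logic, which mirrors why onboarding a provider became a matter of hours instead of a month of coding.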

STRENGTHS: Data Junction's ability to accept any source type and its expression-builder and lookup functionality facilitate quick analysis and cleansing of newly received data. The ability to script the DJEngine for automatic execution of data manipulation and validation speeds the integration of information to our Web database.

WEAKNESSES: The product has been excellent; however, we are looking forward to the increased functionality that will be provided in the upcoming release, such as the ability to write to multiple file outputs in a single conversion and Data Junction's multithreaded engine. Data Junction's online knowledge base could be more comprehensive and current.

SELECTION CRITERIA: We chose Data Junction because of its ability to accept any source type and quickly flatten multilevel files. Data Junction's lookup functionality also made the decision to use this product simple.

DELIVERABLES: Data Junction saved us months of programming time and tens of thousands of dollars in resources, and it greatly accelerated the time to market with our Web database.

VENDOR SUPPORT: Data Junction provides outstanding vendor support. Our IT staff is always creating new expressions to extend the product's capability, and Data Junction's staff is eager to help. Data Junction's tech support is second to none.

DOCUMENTATION: The documentation could be a little more descriptive but overall was very helpful. In fact, we initially learned to use the product with only the help files.
