REVIEWER: Greg Sitzman, business intelligence director at

BACKGROUND: As the online arm of KB Toys, the nation's largest combined mall-based and online specialty toy retailer, offers a huge selection of popular toys, hard-to-find collectible toys and just-released video games. launched the new eToys Web site in October 2001 with a wide variety of nationally advertised toys, learning toys, video games and consoles.

PLATFORMS: Data Junction is being used in a Sun Solaris operating environment.

PROBLEM SOLVED: At, we use a variety of ERP systems to manage information throughout our business operations. Information from all of these systems, such as customer data, order and product data, financial information and Web traffic, is aggregated into the central data warehouse that supports our B2C operations and business reporting. This process requires integrating large volumes of data from disparate systems into an Oracle database. We needed to replace our existing integration tool with a solution that would improve load performance and give us more flexibility in how we manipulate data in the ETL process. We also wanted an integration tool with a much lower cost of ownership than the product we had been using. Data Junction met all our functionality requirements and exceeded our expectations in mapping and parallel processing capabilities. Since we deployed Data Junction's multithreaded engine for Sun Solaris, our data loading time has been noticeably reduced, and Data Junction's event-driven programming environment has given us much more flexibility in data manipulation. Data Junction has provided us with a robust, feature-rich solution while delivering a very low total cost of ownership.

PRODUCT FUNCTIONALITY: Data Junction has delivered excellent performance. We initially ran 8 to 10 mappings through the Data Junction Integration Engine and saw throughput improve by approximately 200 rows per second over our previous solution. In addition, when we ran those maps as a parallel process, the improvement held with no degradation. In other words, we achieved a true linear throughput gain when processing in parallel across eight CPUs. We will be expanding our Data Junction mappings significantly, to several hundred, in the coming months, and the volume of data processed through Data Junction will increase to as much as 30 million rows per day by the end of the year. We are also considering using Data Junction's Content Extractor to parse our Web traffic log to see whether it improves on our current process.
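To put the volume figures above in per-second terms, the arithmetic can be sketched in a few lines of Python. This is only an illustration and not part of Data Junction: the function names are hypothetical, and the four-hour load window and the 1,000 rows-per-second single-stream rate are assumed values chosen for the example, not measurements from our environment.

```python
SECONDS_PER_HOUR = 60 * 60


def required_rows_per_second(rows_per_day: int, load_window_hours: float = 24.0) -> float:
    """Average rows per second needed to load rows_per_day within the window."""
    return rows_per_day / (load_window_hours * SECONDS_PER_HOUR)


def linear_parallel_throughput(single_stream_rows_per_second: float, streams: int) -> float:
    """Aggregate throughput if scaling is perfectly linear (no degradation)."""
    return single_stream_rows_per_second * streams


if __name__ == "__main__":
    # 30 million rows per day, spread evenly over 24 hours.
    print(round(required_rows_per_second(30_000_000), 1))

    # The same volume squeezed into an assumed four-hour nightly window.
    print(round(required_rows_per_second(30_000_000, load_window_hours=4), 1))

    # Assumed 1,000 rows/second per stream, scaled linearly across eight CPUs.
    print(linear_parallel_throughput(1_000.0, 8))
```

Spread evenly over a full day, 30 million rows works out to roughly 347 rows per second; a shorter load window raises the required rate proportionally, which is why linear scaling across CPUs matters.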

STRENGTHS: The scalable parallel processing capabilities of Data Junction are important for our data loading process because we move very large volumes of data. Data Junction's event-driven environment also gives our developers a far more flexible tool for manipulating data in the mapping process.

WEAKNESSES: We rely heavily on execution reports to monitor conversions when we run ETL processes. While it is possible to write a progress callback routine in Data Junction to monitor a transformation process, it would be nice to have that real-time functionality provided out of the box. Data Junction has promised this capability in its next product release.
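For readers unfamiliar with the idea, a progress callback can be sketched generically as follows. This is plain Python and does not use Data Junction's scripting language or API; `transform_with_progress`, `on_progress` and `report_every` are hypothetical names introduced only for this illustration.

```python
from typing import Callable, Iterable, Iterator

# A progress callback receives the number of rows processed so far.
ProgressCallback = Callable[[int], None]


def transform_with_progress(
    rows: Iterable[dict],
    transform: Callable[[dict], dict],
    on_progress: ProgressCallback,
    report_every: int = 1000,
) -> Iterator[dict]:
    """Apply transform to each row, reporting progress every report_every rows."""
    processed = 0
    for row in rows:
        yield transform(row)
        processed += 1
        if processed % report_every == 0:
            on_progress(processed)
    # Report the final count when the last batch was only partially filled.
    if processed % report_every != 0:
        on_progress(processed)


if __name__ == "__main__":
    source = ({"n": i} for i in range(2500))
    doubled = transform_with_progress(
        source,
        lambda r: {"n": r["n"] * 2},
        on_progress=lambda count: print(f"{count} rows processed"),
    )
    total = sum(1 for _ in doubled)
    print(f"done: {total} rows")
```

The point of the pattern is that the caller decides what to do with each progress report (log it, update a dashboard, raise an alert), which is the kind of real-time visibility we would like to have without writing the routine ourselves.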

SELECTION CRITERIA: A major consideration in our selection of Data Junction was its combination of robust, flexible integration capability and an event-driven development environment with a very reasonable cost of ownership. We conducted a thorough evaluation of Data Junction, which not only met our functionality requirements but also exceeded our expectations in performance.

DELIVERABLES: Data Junction has given us the capability to process 20 million to 40 million rows of data per day and an easier method for moving data on a daily and hourly basis. Data Junction helps build the critical operational data store that we use to run our business each day. It will continue to be a key component in our business process as we move forward with internal customer analysis.

VENDOR SUPPORT: Overall, the support has been good. Data Junction's help desk has been able to respond with solutions to the questions we've had.

DOCUMENTATION: Data Junction does a good job of documenting the functionality of its solution. However, we would like to see more documentation or examples devoted to an entire application process.
