CATEGORY: Data Acquisition, Transformation and Replication
REVIEWER: Robin Whyte, lead consultant for LG01.
BACKGROUND: LG01 is a U.K. consultancy specializing in data warehousing solutions. Although relatively new to the market, we are experienced developers with a deep understanding of the technical issues associated with data movement and management in large enterprises. We are based in York, Northern England.
PLATFORMS: CoSORT runs on a Sun E3000 server (Solaris 7) with four CPUs. A mixture of local and network (4 x 100MB/s) drives was used.
PROBLEM SOLVED: During our implementation of a large-scale data warehouse for a major automotive company, we ran into a number of problems. The client had originally stipulated that we query directly against their operational database from the ETL tool, including complex joins, sorts and filters. We were then to "twist" the data using the ETL tool to provide both the warehouse and a number of report-ready denormalized tables suitable for paper and Web reporting. The first problem was one of speed. The extraction from source to ETL tool was taking far too long and placing unacceptable strain on the operational servers. The second issue was one of complexity. The ETL tool, although well suited to simple transforms, soon required hand scripting to do anything even modestly complex. We would have been forced to hard code much of the logic due to the tool's poor parameterization capabilities. By adopting a radically streamlined method, using CoSORT to process flat file dumps of the operational data (very quick to extract), we could drastically cut our run times. We could do all the joins and filters we needed on a separate server, away from the operational systems; and because of CoSORT's ability to run in parallel across all CPUs using all the RAM, we got an excellent return on our client's hardware investment.
PRODUCT FUNCTIONALITY: As an additional bonus, we found that we could pipe CoSORT's output straight through some simple Perl scripts to do our transformations. This was a great solution, as Perl gave us portability across operating systems and the ability to parameterize the business logic and avoid hard coding. The ETL tool simply became a method of defining the process flow, with all the work being done under the hood by CoSORT. LG01 now recommends CoSORT to all its clients as a means of achieving high-speed, high-volume warehousing; it has become a highly prized and reliable part of our toolset.
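To illustrate what parameterizing the business logic of a piped transformation can look like, here is a minimal sketch. It is not the Perl we used on the project; it is written in Python purely for illustration, and the rule table, the field names (region, price) and the pipe-delimited layout are all hypothetical.

```python
import csv
import io

# Hypothetical rule table. In practice this would be loaded from a
# configuration file so the business logic is not hard coded.
RULES = {
    "region": {"op": "upper"},
    "price": {"op": "scale", "factor": 1.5},
}


def apply_rule(value, rule):
    """Apply one parameterized rule to a single field value."""
    if rule["op"] == "upper":
        return value.upper()
    if rule["op"] == "scale":
        return f"{float(value) * rule['factor']:.2f}"
    raise ValueError(f"unknown rule: {rule['op']}")


def transform(in_stream, out_stream, rules, delimiter="|"):
    """Read delimited records with a header row, apply the rules, write them out."""
    reader = csv.DictReader(in_stream, delimiter=delimiter)
    writer = csv.DictWriter(
        out_stream,
        fieldnames=reader.fieldnames,
        delimiter=delimiter,
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        for field, rule in rules.items():
            row[field] = apply_rule(row[field], rule)
        writer.writerow(row)


# Small in-memory demonstration of the filter.
demo_in = io.StringIO("id|region|price\n1|york|10\n")
demo_out = io.StringIO()
transform(demo_in, demo_out, RULES)
```

Used as a pipe stage, the same function would be called as transform(sys.stdin, sys.stdout, RULES), so that the upstream sort or join output flows through it without an intermediate file. Changing a business rule then means editing the rule table, not the script.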
STRENGTHS: The ability to perform flat file SQL-like processing, particularly joins, is very useful indeed; and the speed with which it executes makes it a must-have tool, in our opinion. The script language is rich and appears to carry no cross-OS differences.
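For readers unfamiliar with the idea of joining flat files without a database, the following is a minimal sketch of a sort-merge inner join over two sets of records. It is written in Python for illustration only and says nothing about CoSORT's own script syntax or internals; the function name merge_join and the sample order and customer records are hypothetical.

```python
from operator import itemgetter


def merge_join(left, right, left_key, right_key):
    """Inner-join two lists of tuples on the given column positions.

    Both inputs are sorted on their key column, then walked in step,
    which is the classic sort-merge approach to joining flat files.
    """
    left = sorted(left, key=itemgetter(left_key))
    right = sorted(right, key=itemgetter(right_key))
    joined = []
    i = j = 0
    while i < len(left) and j < len(right):
        lkey, rkey = left[i][left_key], right[j][right_key]
        if lkey < rkey:
            i += 1
        elif lkey > rkey:
            j += 1
        else:
            # Find the run of right-hand records sharing this key.
            j_end = j
            while j_end < len(right) and right[j_end][right_key] == lkey:
                j_end += 1
            # Pair every left-hand record with that run.
            while i < len(left) and left[i][left_key] == lkey:
                for r in right[j:j_end]:
                    joined.append(left[i] + r)
                i += 1
            j = j_end
    return joined


# Hypothetical sample data: (order id, customer id, part) and (customer id, city).
orders = [("o2", "c1", "wheel"), ("o1", "c2", "door"), ("o3", "c9", "seat")]
customers = [("c2", "Leeds"), ("c1", "York")]
result = merge_join(orders, customers, left_key=1, right_key=0)
```

Here the order for customer c9 is dropped because no matching customer record exists, which is the behavior of an SQL inner join.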
WEAKNESSES: CoSORT, as used, had no GUI to speak of, but I understand a Java one is available or imminent. Joins across more than two tables require some extra work within the scripts.
SELECTION CRITERIA: Having evaluated a number of similar products, we found that none seems to offer the functionality that CoSORT does. CoSORT encompasses 70 percent of the functionality for only 20 percent of the cost, which made the decision an easy one.
DELIVERABLES: We delivered an architecture that delighted the client. It was under budget and ran well within the time frames required.
VENDOR SUPPORT: CoSORT/IRI has proven to be very quick and responsive. They seem genuinely eager to assist in any way they can. During evaluation, they offered excellent support and advice. A U.K. support base would be useful, if only to avoid the time difference issues, but CoSORT is a straightforward tool to use as it is.
DOCUMENTATION: The PDF documentation we received was very good.