AUG 21, 2007 1:00am ET

Complex Joins and Lookups Now Run Outside a DBMS

Innovative Routines International Inc. (IRI), maker of the CoSort (www.cosort.com) data processing software for UNIX, Linux and Windows, announced two more IT productivity breakthroughs in Version 9: multifile joins and multidimensional lookups. These new functions integrate disparate data and create newly actionable information.

By defining intersections in database extracts and legacy files, CoSort users can discover, transform and report on related data in a single pass. And by performing join and lookup functions on flat files, CoSort users can (1) relieve the DBMS of query overhead and (2) incorporate mainframe and index files, spreadsheets and other data into the process.

Multifile Joins

Joining large tables to satisfy queries taxes DBMS performance. There has also been no efficient way to compare large files and identify field changes (inserts, updates, deletes) over time. "In addition to offloading DBMSs, multifile joins offload data integration tools, by merging data before it hits the tool," said Philip Russom, senior manager at The Data Warehousing Institute. "At the high end, this is useful with the distributed architectures that many users apply to scaling up their data integration solutions. At the other extreme, multi-file joins may eliminate the need for a data integration tool."
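The change-detection idea mentioned above (identifying inserts, updates and deletes between two versions of a file) can be illustrated with a minimal Python sketch. This is not CoSort syntax and does not reflect how CoSort works internally; the function names, the `id` key field and the sample rows are all hypothetical, chosen only to show the comparison logic on small in-memory flat files.

```python
import csv
import io


def load_keyed(text, key):
    """Read delimited text into a dict mapping the key field to the full row."""
    reader = csv.DictReader(io.StringIO(text))
    return {row[key]: row for row in reader}


def diff_snapshots(old, new):
    """Classify keys as inserted, updated or deleted between two snapshots."""
    inserts = sorted(k for k in new if k not in old)
    deletes = sorted(k for k in old if k not in new)
    updates = sorted(k for k in new if k in old and new[k] != old[k])
    return inserts, updates, deletes


# Hypothetical sample data: two versions of the same flat-file extract.
OLD = "id,city\n1,Boston\n2,Austin\n3,Denver\n"
NEW = "id,city\n1,Boston\n2,Dallas\n4,Miami\n"

inserts, updates, deletes = diff_snapshots(
    load_keyed(OLD, "id"), load_keyed(NEW, "id")
)
print("inserts:", inserts)
print("updates:", updates)
print("deletes:", deletes)
```

The sketch holds both files in memory, which only works for small inputs. For files too large for that, a sort-then-merge comparison over key-ordered inputs would be the more likely design.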

Multidimensional File Lookups

Data cleansing, multitable joins and complex computations that produce discrete solutions are resource-intensive operations. Where a simple lookup can replace a runtime computation (e.g., a mathematical expression or pseudonymization), the performance gain is significant because retrieving a value from memory is faster than computing it. To achieve these fast retrievals, CoSort users specify lookups against set files. By referencing multicolumn files, users get faster answers to discrete questions, such as the correct ZIP code for a given city and state. Russom added that "when multicolumn files are sources for a data warehouse, multidimensional file lookups can generate cubes and other multidimensional structures for the warehouse and analysis tools."
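As a rough illustration of a lookup against a multicolumn file, the Python sketch below loads a small delimited table into memory and retrieves a ZIP code by the combination of city and state. It is not CoSort's set-file format or syntax; `build_lookup`, the column names and the sample rows are assumptions made for the example.

```python
import csv
import io

# Hypothetical multicolumn lookup file; rows are illustrative sample values.
SET_FILE = (
    "city,state,zip\n"
    "Springfield,IL,62701\n"
    "Springfield,MA,01103\n"
    "Melbourne,FL,32901\n"
)


def build_lookup(text, key_fields, value_field):
    """Index rows by a tuple of key columns so each lookup is one dict access."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        tuple(row[f] for f in key_fields): row[value_field] for row in reader
    }


zip_for = build_lookup(SET_FILE, ("city", "state"), "zip")

# The same city name resolves differently depending on the state dimension.
print(zip_for[("Springfield", "IL")])
print(zip_for[("Springfield", "MA")])
# A missing combination returns a default instead of raising an error.
print(zip_for.get(("Springfield", "TX"), "unknown"))
```

Keying on a tuple of columns is what makes the lookup multidimensional here: neither the city nor the state alone identifies the value. Values are kept as strings so that leading zeros in ZIP codes survive.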

This piece is brought to you by the Information Management editorial staff.
