CATEGORY: Data Quality

REVIEWER: Don Waskiewicz, chief information officer for InfoLab Inc.

BACKGROUND: InfoLab, founded in 1994, specializes in marketing solutions that leverage information to increase the profitability of our clients' marketing efforts. This is accomplished through the use of customer and third-party data as well as a variety of enabling technologies. Our clients are primarily Fortune 100 companies with large databases of customer records. They serve consumers in the automotive, insurance, financial and retail industries, and their files range from four million to more than 30 million customer records.

PLATFORMS: Our platform consists of 14 servers running Windows 2000, using approximately five terabytes of RAID storage for housing our clients' databases and processing data prior to entry into a database.

PROBLEM SOLVED: I was working with a national customer database that contained approximately 10 million consumer records, and I needed a tool that could sort and reformat these big files. SyncSort for Windows/NT let me meet the challenge of processing this large database. It sorted and reformatted the files, which were larger than 4GB, without my having to split them up and process the pieces in multiple passes.

PRODUCT FUNCTIONALITY: SyncSort is one of our main tools for general data processing, and we're continually finding new things to do with it. For instance, in a recent job I discovered that each record in a file of 11 million records had an extra byte in it. I used SyncSort to reformat each record, and it finished within 10 minutes. Before SyncSort, I would have had to use a product such as SAS. Because I don't program in SAS, I would have needed a programmer for this task. SyncSort eliminated the need to involve a programmer, which saved a great deal of time.
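
To illustrate the kind of reformat described above, here is a minimal Python sketch that copies fixed-length records and drops one unwanted byte from each. It is not SyncSort syntax. The function name drop_byte, the record length and the byte position are all assumptions made for the example, and the sketch assumes every record has the same length.

```python
import io

def drop_byte(src, dst, record_len, drop_at):
    """Copy fixed-length records from src to dst, omitting the byte at offset drop_at.

    Hypothetical helper for illustration; returns the number of records written.
    """
    if not 0 <= drop_at < record_len:
        raise ValueError("drop_at must fall inside the record")
    count = 0
    while True:
        record = src.read(record_len)
        if not record:
            break  # end of input
        if len(record) != record_len:
            raise ValueError("trailing partial record")
        # Keep everything before and after the unwanted byte.
        dst.write(record[:drop_at] + record[drop_at + 1:])
        count += 1
    return count

# Made-up sample: three 4-byte records whose last byte ("x") is the extra one.
src = io.BytesIO(b"ABCxDEFxGHIx")
dst = io.BytesIO()
n = drop_byte(src, dst, record_len=4, drop_at=3)
```

Because the records are read one at a time, the same loop works on a file opened in binary mode regardless of the file's total size.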

STRENGTHS: SyncSort has allowed me to replicate mainframe-type processing on the NT platform with a small staff of people. It's an easy-to-use tool that lets me handle operations that previously would have required writing complex programs. I've been very pleased to find that I can now move processes that I used to perform in COBOL to an NT platform, and on a large scale. So many of the tools that I've used in NT have constraints such as two- or four-gigabyte file size limits. I don't have that with SyncSort, which is a real benefit.

WEAKNESSES: A feature of SyncSort enables you to run a sample, or test set, when you do a sort or a copy. Before you sort or reformat 100 million records, you can set it to run 1,000 records and stop. This functionality would be really nice with the join feature as well. I understand this is currently under consideration at Syncsort.
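
The idea of a sample run can be sketched in a few lines of Python: cap the input at a fixed number of records, then run the same processing step on that small set. This is only an illustration of the concept, not how SyncSort implements it, and first_records is a hypothetical name.

```python
from itertools import count, islice

def first_records(records, limit=1000):
    """Yield at most `limit` records from any iterable, then stop reading."""
    return islice(records, limit)

# Stand-in for a very large input: an endless stream of record numbers.
# Only the first five are ever read, then the trial sort runs on them.
trial = sorted(first_records(count(1), limit=5), reverse=True)
```

The same cap could be placed in front of each input to a join to get a quick trial before committing to the full files.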

SELECTION CRITERIA: In the early 1980s, I was responsible for completing similar projects while working as a programmer on the mainframe, where I used SyncSort, a high-performance data sort and manipulation product. We wrote many programs with SyncSort in which we would pre-process and post-process data. I hoped Syncsort had developed something similar for use on an NT platform. When I investigated, I found SyncSort for Windows/NT.

DELIVERABLES: In our business, our clients count on us to solve whatever problems arise. Many of those problems are ones we can solve only because we have SyncSort. It is one of the tools that enables us to stay in business and continue to grow. Whether it's as an integral component, part of debugging, ad hoc processing or quality checking, everything we do includes SyncSort.

VENDOR SUPPORT: I was also working on a project where I needed to perform merge/purge operations. We were required to merge data from different sources, processing matches one way and non-matches another way. I started testing some of the merge/purge tools in the marketplace but wasn't happy with any of them. Syncsort was just releasing a product with the join feature. When I received the release, Syncsort's tech support team taught me how to do joins, which essentially allowed me to do merge/purge operations.
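
For readers unfamiliar with the technique, the Python sketch below shows one common way a join can separate matches from non-matches: walk two inputs that are already sorted on the same key and route each record to a matched or unmatched list. It is an assumption-laden illustration rather than SyncSort's join. The function join_sorted and the sample records are made up, and the sketch assumes each key appears at most once per input.

```python
def join_sorted(left, right, key):
    """Merge two key-sorted record lists.

    Returns (matches, left_only, right_only). Hypothetical helper;
    assumes both inputs are sorted by key and keys are unique per input.
    """
    matches, left_only, right_only = [], [], []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = key(left[i]), key(right[j])
        if lk == rk:
            matches.append((left[i], right[j]))  # key present in both sources
            i += 1
            j += 1
        elif lk < rk:
            left_only.append(left[i])  # no partner in the right input
            i += 1
        else:
            right_only.append(right[j])  # no partner in the left input
            j += 1
    # Whatever remains on either side has no partner.
    left_only.extend(left[i:])
    right_only.extend(right[j:])
    return matches, left_only, right_only

# Made-up sample records: (customer_id, value), sorted by customer_id.
house = [("1001", "Avery"), ("1003", "Chen"), ("1004", "Diaz")]
vendor = [("1001", "auto"), ("1002", "retail"), ("1004", "insurance")]
matches, house_only, vendor_only = join_sorted(house, vendor, key=lambda r: r[0])
```

Each of the three output lists can then go through its own processing step, which is the "matches one way and non-matches another way" pattern described above.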

DOCUMENTATION: The documentation is very good and meets all our needs.
