CATEGORY: Data Mining and Visualization
REVIEWER: Lelia Morrill, data mining architect for Retrograde Data Systems.
BACKGROUND: Retrograde Data Systems has been involved in data mining consulting since 1994. This includes building business mining analyses and applications for clients. The company has worked with several large consulting firms and hardware vendors to promote and establish data mining methodologies, services and architectures. The data mining project work referenced in this review was completed for a leading provider of online information.
PLATFORMS: Retrograde Data Systems worked with NCR consultants in their data mining lab in Rancho Bernardo, California, loading the client's customer warehouse onto a four-node 4700 NCR WorldMark server running NCR's Teradata database V2R4.01 on UNIX.
PROBLEM SOLVED: We typically start our client data mining services with a pre-analysis phase to understand the quality of data before committing to a corporate-level data mining project. This process is always labor-intensive, time-consuming and difficult. We worked with the NCR data mining lab to load our client's warehouse into the Teradata database. Once there, we used TeraMiner Stats to quickly assess the quality, richness and historical completeness of the data within the warehouse. Our usual cumbersome process of moving data in and out of warehouse and statistical environments was unnecessary, making the process much faster and more efficient. Our client's data contained limited, basic usage information. Because TeraMiner Stats takes advantage of the parallel features of Teradata, our iterative experimentation with the raw data was simpler and more straightforward than usual. TeraMiner assisted us in the intelligent selection of data as well as the derivation of more than 15 rich metrics that were used in our subsequent clustering analysis, all completed in a fraction of the usual time.
PRODUCT FUNCTIONALITY: TeraMiner Stats generates SQL that performs comprehensive statistical data analysis. Because Teradata was originally built for analytical processing, it lends itself to analyzing large amounts of data faster than processing in standard RDBMSs and statistical tools. There are 46 functions available to assess data inside of the warehouse.
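As a rough illustration of the generate-SQL-and-run-it-in-the-database idea, the sketch below builds one profiling query and executes it where the data lives. It is not TeraMiner output and does not use Teradata: SQLite stands in for the warehouse, and the table `usage_fact`, the column `minutes_online` and the helper `build_profile_sql` are all hypothetical names chosen for the example.

```python
import sqlite3


def build_profile_sql(table, column):
    """Return a SQL statement that profiles one numeric column in-database.

    Hypothetical helper: table and column names are assumed to be trusted
    identifiers, since they are interpolated directly into the statement.
    """
    return (
        f"SELECT COUNT(*) AS n_rows, "
        f"COUNT({column}) AS n_non_null, "
        f"MIN({column}) AS min_val, "
        f"MAX({column}) AS max_val, "
        f"AVG({column}) AS mean_val "
        f"FROM {table}"
    )


# In-memory SQLite database standing in for the customer warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usage_fact (customer_id INTEGER, minutes_online REAL)")
conn.executemany(
    "INSERT INTO usage_fact VALUES (?, ?)",
    [(1, 10.0), (2, 20.0), (3, None), (4, 30.0)],  # one missing value
)

# Generate the SQL, inspect it, then run it without moving the data out.
sql = build_profile_sql("usage_fact", "minutes_online")
profile = conn.execute(sql).fetchone()
print(sql)
print(profile)
```

Comparing the row count with the non-null count gives a quick read on completeness, and the generated statement can be inspected before it runs, much as we checked the SQL in our own work.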
STRENGTHS: Without having a great deal of SQL expertise, we were able to leverage the power of Teradata through the TeraMiner Stats interface. We checked the SQL once it was generated, and the results of the data analysis were available to us within the customer warehouse. The data was prepped and ready for our chosen mining tool.
WEAKNESSES: TeraMiner Stats was fast, convenient and powerful, so it was a letdown to move the analytic model development into another tool and its environment. We are eager to try the beta version of TeraMiner Analytics, which will allow us to complete our cluster analysis directly in the database.
SELECTION CRITERIA: Retrograde faced a caveat with a potential client project: we were unfamiliar with their data and hesitant to commit to an enterprise-level mining project without knowing whether the data was of the quality required for mining. We jumped at the opportunity to evaluate the data through TeraMiner Stats at the NCR data mining lab. We quickly determined that a mining project was feasible, and we used the results from the TeraMiner Stats pre-analysis to justify and win the much larger mining project.
DELIVERABLES: TeraMiner Stats provided a fast, comprehensive data assessment for real-world, corporate customer data mining. The 46 functions available enabled us to rapidly understand, describe, transform and prepare our client data for subsequent mining, leveraging the Teradata database's parallel warehouse infrastructure. This is the most complete integration of enterprise-level warehousing and mining that we have encountered.
VENDOR SUPPORT: NCR worked with us closely, providing mining and technical expertise at their data mining lab. We were able to load and assess the data within days, giving us an enormous advantage in the overall project time frame.
DOCUMENTATION: The training and documentation for TeraMiner, which included a user's guide, a programmer's guide and online help, were ample and reliable.