BACKGROUND: Two Crows Corporation is the leading market analysis and consulting firm focusing exclusively on the data mining and data warehousing market. Two Crows works with Fortune 1000 companies to help them develop data mining strategies, select products and implement data mining solutions. It also works with vendors to help them understand their customers' requirements.

PLATFORMS: Darwin runs in a client/server environment with Windows NT/95 clients and UNIX servers. Supported UNIX servers include Sun Solaris and HP-UX, on both single-processor and multiprocessor (SMP) systems.

PROBLEM SOLVED: To compete successfully, companies need to understand their customers so that they can effectively acquire new, profitable customers; keep good customers; and increase the revenue generated by existing customers. Because industry studies show it costs 5-6 times as much to acquire a new customer as it does to retain an existing customer, companies are turning to data mining to improve customer relationship management. Many businesses are supplementing their data with demographic information about who a customer is and psychographic information about a customer's behaviors and preferences to build models that increase their ability to accurately predict customer behavior. The resulting data sets, with tens of millions of records and hundreds of variables about each customer, present a serious challenge for any analyst and any data mining software.

PRODUCT FUNCTIONALITY: Darwin can access ASCII and RDBMS data (using ODBC). Darwin provides a familiar Windows NT/95 user interface including extensive wizards to guide users through the model building process. A wide range of data transformations can be applied easily. Darwin Release 3.5 supports three data mining algorithms (neural networks, classification and regression trees, and k-nearest neighbor) based on the correct belief that there is no best algorithm and that a variety of algorithms is necessary to build accurate models. Darwin uses MS Excel for graphing data mining results and MS Internet Explorer for on-line help. Thinking Machines (TMC) has said that its forthcoming Release 4.0 will add clustering and naive Bayes algorithms.

STRENGTHS: Darwin has three key strengths. First, it is highly scalable and can mine large amounts of data because of the parallel implementation of its data mining algorithms. Second, models can easily be deployed as part of applications. This is especially important in building applications in which data mining is a fraction of the end-user value. The third strength is Darwin's greatly improved ease of use through a more intuitive Windows client. There are now wizards to guide the user through the model development process. Darwin's Model Seeker automatically builds multiple models and selects the best one(s) for the user to review. Workflow and scripting features provide a visual depiction of the data mining steps and automate the data mining process. Darwin has retained a high degree of control available to expert users through tuning options.
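The idea behind building several models and keeping the best can be illustrated with a minimal sketch. This is not Darwin's Model Seeker or its API; the names rank_models and accuracy, the toy threshold models and the use of holdout accuracy as the selection measure are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A "model" here is any function mapping a row of inputs to a predicted class.
# This type alias is hypothetical and unrelated to Darwin's model format.
Model = Callable[[Sequence[float]], int]


def accuracy(model: Model, rows: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Fraction of holdout rows whose predicted class matches the true label."""
    if len(rows) != len(labels) or not labels:
        raise ValueError("rows and labels must be non-empty and the same length")
    correct = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return correct / len(labels)


def rank_models(
    candidates: Dict[str, Model],
    rows: Sequence[Sequence[float]],
    labels: Sequence[int],
) -> List[Tuple[str, float]]:
    """Score every candidate on the same holdout data, best first."""
    scored = [(name, accuracy(model, rows, labels)) for name, model in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy holdout set: one input variable per customer, label 1 = responder.
    holdout_rows = [[0.1], [0.4], [0.6], [0.9]]
    holdout_labels = [0, 0, 1, 1]
    # Three stand-in "models" that differ only in their decision threshold.
    candidates = {
        "low_threshold": lambda row: int(row[0] > 0.3),
        "mid_threshold": lambda row: int(row[0] > 0.5),
        "always_one": lambda row: 1,
    }
    for name, score in rank_models(candidates, holdout_rows, holdout_labels):
        print(name, score)
```

Scoring every candidate on the same held-out data keeps the comparison fair. A production tool would likely use richer measures than accuracy, such as lift or expected profit.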

WEAKNESSES: Darwin lacks data visualization tools that help the user better understand the data prior to building data mining models. While external tools can meet this need, visualization would be more useful if it were integrated within Darwin. Also, while the addition of a workflow visualizer is welcome, future releases should allow the workflow to be edited in the visualizer.

SELECTION CRITERIA: Thinking Machines targets large, customer-focused companies that have vast amounts of customer data. Companies that want to build enterprise customer relationship management solutions should consider TMC's strengths in solving these large, complex data mining problems. Oracle has selected TMC as a data mining partner.

DELIVERABLES: Darwin's Model Seeker and Key Fields wizards, interactive tree display, lift charts, sensitivity analysis, ROI and margin graphs, error tables and decision tree rules are very straightforward and readable. Models can be exported as C, C++ and Java code to deploy the models for customer scoring, campaign management and real-time "intelligent agents" integrated in other enterprise applications such as call centers.
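For readers unfamiliar with lift charts, the following minimal sketch shows how lift is conventionally computed. It is not Darwin's code or output; the function name lift_by_bucket and the sample data are hypothetical. Customers are ranked by model score and split into equal-sized buckets, and each bucket's response rate is divided by the overall response rate.

```python
from typing import List, Sequence


def lift_by_bucket(scores: Sequence[float], outcomes: Sequence[int], buckets: int = 10) -> List[float]:
    """Return the lift of each bucket, highest-scored bucket first.

    lift = (response rate inside the bucket) / (overall response rate)
    outcomes are 1 for a responder and 0 otherwise.
    """
    if len(scores) != len(outcomes) or not scores:
        raise ValueError("scores and outcomes must be non-empty and the same length")
    if buckets < 1:
        raise ValueError("buckets must be at least 1")
    overall_rate = sum(outcomes) / len(outcomes)
    if overall_rate == 0:
        raise ValueError("lift is undefined when there are no responders")

    # Rank customers from highest to lowest model score.
    ranked = [o for _, o in sorted(zip(scores, outcomes), key=lambda p: p[0], reverse=True)]

    n = len(ranked)
    lifts = []
    for b in range(buckets):
        chunk = ranked[b * n // buckets:(b + 1) * n // buckets]
        if chunk:  # skip empty buckets when there are fewer rows than buckets
            lifts.append((sum(chunk) / len(chunk)) / overall_rate)
    return lifts


if __name__ == "__main__":
    # Hypothetical scores for ten customers and whether each one responded.
    scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
    outcomes = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
    print(lift_by_bucket(scores, outcomes, buckets=2))
```

A lift above 1 in the top buckets means the model concentrates responders among its highest-scored customers, which is what makes a model useful for targeting a campaign.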

VENDOR SUPPORT: Thinking Machines Corporation provides phone support and on-site training and professional services. We found TMC personnel to be knowledgeable and responsive.

DOCUMENTATION: Darwin's user manuals document all Darwin functionality but could be improved with better graphics and more examples. The on-line help was simple and easy to use.

