CATEGORY: Data Mining & Visualization

REVIEWER: Randall S. Collica, senior business analyst, HP Enterprise Systems Group Americas CRM Operations.

BACKGROUND: HP is a leading provider of products, technologies, solutions and services to consumers and businesses. The company's offerings span IT infrastructure, personal computing and access devices, global services, and imaging and printing. Our $4 billion annual R&D investment fuels the invention of products, solutions and new technologies so that we can better serve customers and enter new markets. We invent, engineer and deliver technology solutions that drive business value, create social value and improve the lives of our customers. The new HP is the result of a May 3, 2002, merger with Compaq Computer Corporation, with combined revenue of approximately $81.7 billion in fiscal 2001 and operations in more than 160 countries.

PLATFORMS: Compaq Tru64 UNIX and Microsoft Windows 2000. Hardware: Compaq SP700 series workstations with two Pentium III processors and 750MB of memory (SAS Enterprise Miner client, or desktop); Compaq ES40 Alpha Server with two Alpha EV6 CPUs running at 667MHz and a 350GB RAID array (SAS Enterprise Miner server). The desktop platform was used in Enterprise Miner client mode for the SAS Text Miner application.

PROBLEM SOLVED: There are two fundamental areas in which SAS Text Miner has helped solve business problems relating to CRM applications at HP. First, our inside call and contact centers have telesales representatives who routinely enter notes into a Siebel application while conversing with customers or prospects. These notes, which contain large volumes of textual information, are usually free-form text and sometimes contain portions of e-mail messages. While it is very useful for the telesales representatives to review these notes in ongoing CRM communications, our internal reporting teams could not come up with an effective report or OLAP application to analyze these notes in collective form. SAS Text Miner was used to effectively partition notes around common themes, which could then be used for subsequent analysis. A second application was one in which the combination of product hierarchies from pre-merger Digital, Compaq and Tandem caused difficulties in analyzing customer purchases in like product groups. Because of differing previous hierarchies among the companies, it was difficult to find all of the products in past invoice histories. SAS Text Miner was used to combine the product descriptions at various hierarchies and then create a classification model to determine which product part number belonged to a logical product group such as desktops, high-end servers, portables and the like. There were 25 product classifications, and the classification of more than one million part numbers (SKUs) was completed with an accuracy of slightly more than 90 percent.
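As a rough illustration of the second application, the sketch below assigns a product description to a product group with a small multinomial naive Bayes classifier. This is not the method SAS Text Miner uses, which the review does not describe; the training descriptions, the group labels beyond those named above, and the function names are all hypothetical stand-ins.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a description and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train(labeled_descriptions):
    """Count tokens per product group from (description, group) pairs."""
    token_counts = defaultdict(Counter)
    group_counts = Counter()
    for description, group in labeled_descriptions:
        group_counts[group] += 1
        token_counts[group].update(tokenize(description))
    vocabulary = {t for counts in token_counts.values() for t in counts}
    return token_counts, group_counts, vocabulary


def classify(description, model):
    """Return the group with the highest smoothed log-probability."""
    token_counts, group_counts, vocabulary = model
    total_docs = sum(group_counts.values())
    best_group, best_score = None, -math.inf
    for group in group_counts:
        counts = token_counts[group]
        # Add-one (Laplace) smoothing over the shared vocabulary.
        denominator = sum(counts.values()) + len(vocabulary)
        score = math.log(group_counts[group] / total_docs)
        for token in tokenize(description):
            if token in vocabulary:  # ignore tokens never seen in training
                score += math.log((counts[token] + 1) / denominator)
        if score > best_score:
            best_group, best_score = group, score
    return best_group


# Hypothetical, made-up descriptions; real invoice data would differ.
TRAINING = [
    ("desktop pc minitower pentium iii", "desktops"),
    ("business desktop small form factor", "desktops"),
    ("notebook portable 14in display", "portables"),
    ("notebook portable battery pack", "portables"),
    ("alpha server high-end rack", "high-end servers"),
    ("fault tolerant server cluster", "high-end servers"),
]

model = train(TRAINING)
print(classify("notebook battery", model))
print(classify("rack server", model))
```

In practice, accuracy such as the figure quoted above would be measured on part numbers held out from training, not on the training descriptions themselves.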

PRODUCT FUNCTIONALITY: The product is functioning well for us. Analysis time can be quite lengthy (several hours), depending on the number and size of the documents being processed. We are currently investigating the use of SAS Text Miner for analyzing product warranty notes, which are also free-form text and are written in multiple languages.

STRENGTHS: One of the main strengths of the product is that text mining can be performed in the same environment as the other data mining tool sets in SAS Enterprise Miner. This is a tremendous advantage that enables us to perform ordinary clustering of customer information and then mine the textual data for each cluster segment found.

WEAKNESSES: While analyzing warranty data, we discovered that some of the key words used to describe the text clusters did not actually appear in those documents, but were instead similar phrases. SAS has since issued a hot fix for this problem. Although SAS Text Miner does give a graphical representation of hierarchical text clusters, further graphics that help explain the textual taxonomy would be helpful.

SELECTION CRITERIA: Our major reason for selecting this product was that we needed to analyze our textual data in the same environment as other data mining projects. Only SAS Text Miner could meet this need. Other vendors could only offer text mining packages separate from their data mining packages.

DELIVERABLES: The current output is a key word listing that describes the main topics of each text cluster found in the analysis. This is useful because the cluster information can then be used for further processing such as predictive analysis.
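To make the idea of a per-cluster key word listing concrete, here is a minimal sketch that ranks the terms in each cluster by how often they occur there, weighted down when they also occur in other clusters. It assumes the notes have already been assigned to clusters; the notes, cluster names, stopword list and scoring formula are illustrative assumptions and are not taken from SAS Text Miner.

```python
import math
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard one).
STOPWORDS = {"the", "a", "to", "and", "of", "for", "is", "on", "in", "about"}


def tokenize(text):
    """Lowercase a note, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def cluster_keywords(notes_by_cluster, top_n=3):
    """Return the top_n highest-scoring terms for each cluster."""
    term_counts = {
        cluster: Counter(t for note in notes for t in tokenize(note))
        for cluster, notes in notes_by_cluster.items()
    }
    n_clusters = len(term_counts)
    # Number of clusters in which each term appears at least once.
    cluster_freq = Counter(t for counts in term_counts.values() for t in counts)
    keywords = {}
    for cluster, counts in term_counts.items():
        # Term frequency times a smoothed inverse cluster frequency.
        scores = {
            term: n * math.log(1 + n_clusters / cluster_freq[term])
            for term, n in counts.items()
        }
        # Sort by score, breaking ties alphabetically for stable output.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        keywords[cluster] = ranked[:top_n]
    return keywords


# Hypothetical call-center notes, already grouped into two clusters.
NOTES = {
    "pricing": [
        "Customer asked for a quote on server pricing",
        "Sent quote and discount pricing to customer",
    ],
    "support": [
        "Customer called about warranty repair",
        "Scheduled warranty repair visit for customer",
    ],
}

keywords = cluster_keywords(NOTES, top_n=2)
for cluster, terms in keywords.items():
    print(cluster, terms)
```

Because every candidate term is drawn from the cluster's own notes, each listed key word is guaranteed to occur in at least one document of that cluster. A listing like this could then be joined to customer records as an input to a predictive model.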

VENDOR SUPPORT: SAS' tech support is unparalleled, and the staff is always friendly and helpful.

DOCUMENTATION: SAS Text Miner's documentation is very good, but it includes few examples. More real-life text mining examples would benefit new users.
