PLATFORMS: The Clementine Data Mining System runs on UNIX and Windows NT, or on a Windows 95 client connected to a server.

BACKGROUND: Clementine's developer, Integral Solutions Ltd. (ISL), is working with Somerfield Stores and the Parallel Applications Centre (PAC) on the MARKET project, which is partially funded by the European Commission. Initially, Somerfield will benefit directly from mining its store data. As the project continues, MARKET will develop a generic, on-screen demonstration of how data mining can be used in retail.

PROBLEM SOLVED: Among the Clementine projects at Somerfield is a study of bread-buying patterns. Using hourly totals of different breads sold, the MARKET team is trying to predict how bread sales react to various price promotions. Although there is a general weekly cycle, there is a large variation for any given day of the week. The situation is further complicated by a "domino" effect, where one type of bread sells out and the sales are transferred to other brands, which can lead to confusing results. Because bread has a short shelf life, it is important that accurate predictions can be made about how much stock to bring in or bake in store in order to obtain optimum freshness and availability. Work with Clementine has already generated new insights in this area.
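The weekly cycle and the day-to-day variation described above can be illustrated with a small sketch. It is not the MARKET team's method or anything Clementine produces; the function name `weekday_profile` and the sales figures are invented for illustration. It simply groups sales by day of the week and reports the average and spread for each day.

```python
from collections import defaultdict
from statistics import mean, pstdev


def weekday_profile(daily_sales):
    """Group (weekday, units_sold) pairs and return {weekday: (mean, std_dev)}."""
    by_day = defaultdict(list)
    for weekday, units in daily_sales:
        by_day[weekday].append(units)
    # pstdev is the population standard deviation of the values for that day.
    return {day: (mean(vals), pstdev(vals)) for day, vals in by_day.items()}


# Invented illustrative figures, not Somerfield data.
sales = [("Mon", 100), ("Mon", 140), ("Sat", 300), ("Sat", 180)]
profile = weekday_profile(sales)

for day, (avg, spread) in profile.items():
    print(f"{day}: average {avg}, spread {spread:.1f}")
```

A large spread relative to the average for a given day is the kind of variation that makes a plain weekly average a weak predictor on its own.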

PRODUCT FUNCTIONALITY: Clementine supports the complete data mining process. It contains facilities for pre-processing and cleansing data and can read data from any ODBC-compliant database or from flat files. The data visualization tools are interactive, so any interesting regions or clusters in a graph can be isolated and modeled individually. One unusual tool is the Web Diagram, which shows the strength of associations between different fields in the data. The wide range of modeling algorithms includes predictive neural networks, rule induction and regression. The association rule models identify segments and clusters automatically.
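The idea behind a diagram of association strengths can be sketched in a few lines. How Clementine actually measures strength is not stated here, so this sketch assumes the simplest possible measure, a count of how often two field values occur in the same record. The function name `cooccurrence_counts`, the field names and the records are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(records, fields):
    """Count how often each pair of (field, value) items appears in one record."""
    counts = Counter()
    for rec in records:
        items = [(f, rec[f]) for f in fields]
        # Every pair of field values in the same record adds one to that link.
        for a, b in combinations(items, 2):
            counts[(a, b)] += 1
    return counts


# Invented records for illustration only.
records = [
    {"bread": "white", "promo": "yes"},
    {"bread": "white", "promo": "yes"},
    {"bread": "brown", "promo": "no"},
]
links = cooccurrence_counts(records, ["bread", "promo"])
strongest_pair, strength = links.most_common(1)[0]
print(strongest_pair, strength)
```

In a drawn diagram, each count would set the thickness of the line joining the two values.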

STRENGTHS: Clementine's innovative visual programming interface makes it very quick to test hunches and to generate new ideas. By reducing the need for technical expertise, it lets the data owners take a more direct and creative role. Connecting to external databases is straightforward.

WEAKNESSES: While Clementine is easy to use, new users need to learn how to think in "data streams." These consist of a series of nodes, each containing a tool, that are linked on screen. Another problem is that, at present, several intermediary stages are used to get the source data into the right form for Clementine. Doing this processing within Clementine can get complicated, although ISL consultants are developing new "data streams" to make the process easier. This will result in a direct connection, eliminating the middle stages.
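For readers unfamiliar with the "data stream" way of thinking, the following sketch shows the general idea of linking nodes so that each one passes its output to the next. It is an analogy in ordinary Python, not Clementine's own node set or file format; `run_stream`, `select_node`, `derive_node` and the sample rows are hypothetical names and data chosen for illustration.

```python
def run_stream(source, nodes):
    """Pass records through each node in order and collect the result."""
    data = source
    for node in nodes:
        data = node(data)
    return list(data)


def select_node(predicate):
    """Node that keeps only the records for which predicate(record) is true."""
    return lambda records: (r for r in records if predicate(r))


def derive_node(name, func):
    """Node that adds a new field computed from each record."""
    return lambda records: ({**r, name: func(r)} for r in records)


# Invented rows for illustration only.
rows = [
    {"bread": "white", "units": 10, "price": 0.5},
    {"bread": "brown", "units": 0, "price": 0.6},
]
stream = [
    select_node(lambda r: r["units"] > 0),
    derive_node("revenue", lambda r: r["units"] * r["price"]),
]
result = run_stream(rows, stream)
print(result)
```

Each node does one job, so a stream can be changed by adding, removing or reordering nodes without rewriting the rest.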

SELECTION CRITERIA: ISL has a strong background in data mining projects and developed Clementine in response to the problems that its staff experienced. Its consultants are experienced data miners who directly influence product development. Clementine was the first end-user data mining system when it appeared in 1994, and ISL has been using feedback from its user base and consultants to improve the product continually.

DELIVERABLES: Visualization and reporting tools mean that the business value of models can be clearly demonstrated. If required, effective models can be deployed in applications as C code.

VENDOR SUPPORT: ISL provides excellent training courses in Clementine and data mining for different levels of experience. Customers with a maintenance agreement also have access to telephone and e-mail support.

DOCUMENTATION: Clementine has a comprehensive manual and user guide. Additional documentation includes case studies, course notes and a guide to getting started on a data mining project.
