Mastering Product Data
Information Management Special Reports, August 2002
Products and services that are delivered to consumers play a central role in all enterprises, but this is no guarantee that data about products and services will be well managed. Indeed, a surprising number of enterprises fail to recognize what their products and services are when it comes to building their information systems. These problems are becoming more prevalent in the modern economy as new products, services and entirely new sectors continue to appear. Even individual enterprises seem to be offering a complex mix of products and services that are increasingly difficult to keep track of. IT professionals, especially data administrators, need to know how to identify and manage product and service data if they are to build an effective data architecture that really meets their users' needs. The first step is to understand what these products and services are.
What is a Product?
Most IT professionals can probably accept that the term product covers both traditional products and services. Yet, many of these professionals – particularly in service organizations – may be hard pressed to indicate where products are represented in their data architectures. Here are a few reasons:
- Some service providers do not think that they deliver products because the clients that consume their services are not the same people or organizations that pay for or fund these services. Thus they do not have "customers" in the traditional sense – and if you do not have customers, it may seem that you do not have products. An example could be a charitable organization that runs several distinct funds for different aspects of urban renewal projects. Each fund may focus on a different area, e.g., health, education, etc. Different kinds of projects are delivered using these funds. The clients that benefit from them do not pay for them, and the donors who support the charity never consume the services it provides. Yet the array of funds run by the charity represents its products in a very real way – a distinct array of services dedicated to implementing distinct aspects of urban renewal.
- Some service providers may think that what they deliver is not quantifiable beyond some general measure such as "client service hours," and this lack of classification or "packaging" of different services leads them to ignore the fact that they have an array of different services. Many consulting companies are set up like this. Yet, the reality is that consultants have expertise in a few narrowly defined areas, and it is these areas of expertise that should be considered as products.
- Another problem is that products are often represented by "surrogates" in service organizations. For instance, the charity’s funds are a surrogate for a distinct set of services, just as the area of expertise is a surrogate for what the consultant actually provides to a client. There is no need to provide an exhaustive description of the services the charity or consultant provides, since they mostly map one to one with the fund or area of expertise. The problem often comes when the surrogates for the products are recorded in information systems without them being recognized for what they are – perhaps because of difficulties with terminology and traditional ways of thinking. A training company may be prepared to think about its course offerings as products since they are distinctly packaged and repeatedly delivered. But a health insurance company may offer distinct benefits for each health problem signified by a diagnosis code. To the health insurer the diagnosis code really represents a set of benefits and a framework for delivering them, not the disease itself. Yet it is more difficult for the health insurance company to think of "diagnosis code" as a product than for the training company to see "course code" as such.
How Can Products be Recognized in a Data Architecture?
Hopefully most IT professionals can agree that there is no distinction between product and service, and that all enterprises deliver products in some way. Yet they may not always be able to clearly identify product data in their enterprise data architectures. There can be three ways in which product data is treated:
- Product data exists and is well defined and clearly recognized as product data. This is the case in most traditional manufacturers.
- Product data is not recognized as such in the data architecture. However, the product data, or surrogate data that represents it, does exist. This is the case for many service providers.
- Some product data is recognized, but other product data, or the surrogates that represent it, is not recognized as product data.
The point is that the product data must exist or no transactions representing the interactions of the enterprise with the world beyond its boundaries can be recorded in the data architecture. The issue is only being able to correctly recognize which data is product data.
Fortunately, there is a way to track down those entities in data models (or tables in implemented databases) that represent products. If products are represented at all in databases, they will be in the class of data called reference data. This is data that is rarely updated by an application, but is absolutely required by an application to make transactions flow. Reference data is often called "lookup tables" or "code tables," and generally consists of tables that have a code column as a key and a single non-key column that is a description of this code.
The giveaway for reference data tables that represent products is that they always have an unusually high number of non-key columns. This is because all enterprises have quite a lot of data to record about products. So these Product tables stand out from other reference data tables that have only two columns, a code and a description, e.g., currency, country, SIC code, credit rating, etc.
Another way in which product data differs from regular "lookup tables" is that quite a number of its non-key columns are foreign keys to real lookup tables. These tables are often used to categorize or classify the product, but may not be called Product Type, Product Group, Product Line, Product Classification, Product Family, etc. If they did have these names, the enterprise would already have recognized what its product data is comprised of.
Yet another characteristic of product tables is that business users care about how they are updated. Few, if any, business users care about the introduction of a new currency code, country code or SIC code – and updating these tables is often left to the data administration function. Yet business users probably have complex procedures and rules for updating product tables and are often personally involved in update actions.
By asking some simple questions, an IT professional can come up with a list of candidate tables that represent products. After that it is just a simple process to eliminate any of these tables that do not represent something that is being delivered by the enterprise.
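The three questions above (many non-key columns, several foreign keys to lookup tables, and business-user involvement in updates) can be turned into a rough screening pass over schema metadata. The following is a minimal sketch only: the `TableInfo` structure, the thresholds, and the sample table names are all assumptions made for illustration, and the resulting list still has to be reviewed by hand to eliminate tables that do not represent something the enterprise delivers.

```python
from dataclasses import dataclass, field


@dataclass
class TableInfo:
    """Hypothetical summary of one reference data table's schema."""
    name: str
    non_key_columns: list[str]
    # Non-key columns that are foreign keys to other lookup tables.
    lookup_foreign_keys: list[str] = field(default_factory=list)
    # True if business users, not only data administration, maintain the table.
    business_maintained: bool = False


def product_candidate_score(table: TableInfo,
                            min_non_key_columns: int = 4,
                            min_lookup_fks: int = 2) -> int:
    """Count how many of the three heuristics a table satisfies.

    Both thresholds are illustrative assumptions, not values from the article.
    """
    score = 0
    if len(table.non_key_columns) >= min_non_key_columns:
        score += 1
    if len(table.lookup_foreign_keys) >= min_lookup_fks:
        score += 1
    if table.business_maintained:
        score += 1
    return score


def find_product_candidates(tables: list[TableInfo], min_score: int = 2) -> list[str]:
    """Return names of tables worth reviewing as possible product tables."""
    return [t.name for t in tables if product_candidate_score(t) >= min_score]


tables = [
    # A plain lookup table: a code plus a single description.
    TableInfo("currency", ["description"]),
    # A surrogate for a product at a hypothetical health insurer.
    TableInfo(
        "diagnosis_code",
        ["description", "benefit_limit", "copay_rate",
         "coverage_class_code", "plan_tier_code"],
        lookup_foreign_keys=["coverage_class_code", "plan_tier_code"],
        business_maintained=True,
    ),
]
print(find_product_candidates(tables))  # prints ['diagnosis_code']
```

The score is deliberately coarse. Its only job is to shorten the list of tables that someone who knows the business then has to examine.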
Why is Product Data Important?
In a traditional manufacturing organization there is no need to ask this question, but in enterprises that do not recognize their product data as something truly distinct, problems can arise.

Figure 1: A Simple Association Between Product and Customer
Figure 1 is a simple data model fragment that appears over and over again. What it really shows is that the enterprise interacts with outside persons and organizations by delivering its products and somehow is compensated for this. The reality is that Figure 1 can be much more complex for many enterprises, but the principle is the same – an enterprise interacts with the outside world by the products it delivers and the input it absorbs to enable it to do so.
What this means is that enterprises that do not correctly recognize their product data are unlikely to be able to get their information systems to accurately reflect their activities. One common problem is the definition of product data in multiple applications, leading to redundant and inconsistent data. This can happen in organizations that understand what their products are, but it is much more likely to happen in organizations that really do not understand and think that product data is just some extra lookup tables.
Recognizing that product data is important is not enough. Product data also requires special management. Products of all kinds have a life cycle. This may be simple or it may be complex, but it has to be factored into the design of product data – even if it is just an inception date and expiration date for the product. Another feature of products is that there are persons or organizational units within the enterprise that have special responsibilities for the product. For instance, there may be a unit that is legally responsible for the product or a management authority that must provide clearance for shipping the product outside of the country. The roles that people and organizational units play in respect to products must be captured and can be quite complex.
One other feature of products is that they must be categorized and classified. This is why there are always a number of lookup tables that are connected to product tables. These categorizations and classifications are used to drive business rules that affect the delivery of the products, e.g., a certain type of product may only be delivered in the U.S.A. The lookup tables may also form important reporting structures for products – perhaps a large part of how the enterprise views its activities. However, as new products are introduced, the codes in these tables may not be sufficient to support the new business rules and new reporting schemes that are needed. Unfortunately, business users are usually not in a position to know how to add new codes to these tables to achieve what is needed. Even more unfortunately, many data administration personnel fail to see the bigger picture and treat these requests simply as one-off events to update some lookup tables.
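To make these management requirements concrete, a product record might carry life-cycle dates, role assignments, and classification codes, with a business rule keyed on one of those classifications. This is a sketch under stated assumptions: the `Product` fields, the `DOMESTIC_ONLY` type code, and the rule table are hypothetical names chosen for illustration, and in a real database the classifications and roles would be separate tables joined by foreign keys.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Product:
    """Hypothetical product record; field names are illustrative only."""
    code: str
    description: str
    inception_date: date
    expiration_date: Optional[date] = None  # None means no planned end date
    # Classification scheme -> code; each code would live in a lookup table.
    classifications: dict[str, str] = field(default_factory=dict)
    # Role name -> responsible person or organizational unit.
    roles: dict[str, str] = field(default_factory=dict)

    def is_active(self, on: date) -> bool:
        """True if the date falls within the product's life cycle."""
        if on < self.inception_date:
            return False
        return self.expiration_date is None or on <= self.expiration_date


# Hypothetical rule table: product type -> countries where delivery is allowed.
# Types that are absent from the table are treated as unrestricted.
ALLOWED_COUNTRIES_BY_TYPE = {"DOMESTIC_ONLY": {"US"}}


def can_deliver(product: Product, country: str, on: date) -> bool:
    """Apply the life-cycle check, then the classification-driven rule."""
    if not product.is_active(on):
        return False
    product_type = product.classifications.get("product_type")
    allowed = ALLOWED_COUNTRIES_BY_TYPE.get(product_type)
    return allowed is None or country in allowed
```

Because the rule is driven by the classification code and not by the individual product, introducing a new product with new delivery rules may call for a new code in the lookup table, which is why such requests deserve more attention than a one-off table update.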
Building a Corporate Product Catalog
If an enterprise wants to have a true picture of its activities it will need to build a central data store for its product information – a product catalog. This can act as a reference for other applications. However, it is quite difficult to have a single product catalog that acts as a reference for all applications. A large enterprise may have to have product catalogs at the division level, especially if these divisions are relatively autonomous. Furthermore, product seems to be a dimension in the majority of data marts and warehouses, so replication seems inevitable.
However, it is practical to have a single product catalog that acts as an authority for other applications, or at least to have product catalogs for each of the major divisions of an enterprise if these represent non-overlapping lines of business. Applications should be able to use a product catalog directly as reference data or indirectly as a source for building their own product data tables. Any applications replicating the product data will have to also replicate relevant lookup tables that are related to the product tables in the product catalog. Also, no application that references the product table – either directly or via replication – can be allowed to capture and store product data that does not come from the catalog. If this happens, these applications will subvert the enterprise’s understanding of what a product is and the business rules and reporting schemes that apply to it.
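The two constraints above, that replication must include the related lookup tables and that applications must not record products unknown to the catalog, can be illustrated with a small in-memory sketch. The `ProductCatalog` class, `record_transaction` function, and `UnknownProductError` exception are hypothetical names invented for this example; an actual catalog would more likely enforce the same constraints through database foreign keys and controlled replication jobs.

```python
import copy


class UnknownProductError(ValueError):
    """Raised when an application refers to a product the catalog lacks."""


class ProductCatalog:
    """Hypothetical central store: products plus the lookup tables they use."""

    def __init__(self, products: dict, lookups: dict):
        self._products = products
        self._lookups = lookups

    def has_product(self, code: str) -> bool:
        return code in self._products

    def replicate(self) -> dict:
        """Return an independent copy of the products and their lookup tables."""
        return copy.deepcopy(
            {"products": self._products, "lookups": self._lookups}
        )


def record_transaction(catalog: ProductCatalog, transactions: list,
                       customer_id: str, product_code: str) -> None:
    """Store a transaction only if its product comes from the catalog."""
    if not catalog.has_product(product_code):
        raise UnknownProductError(f"{product_code!r} is not in the catalog")
    transactions.append((customer_id, product_code))
```

An application working from a replica would run the same check against its local copy, so the catalog remains the only place where a product can be introduced.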
Actually getting to the point of accepting that a corporate product catalog is required would be significant progress for many organizations. It is much more important, however, to recognize that all enterprises have products, and that these must be clearly identified in the enterprise’s data architecture and managed in a way that reflects the unique nature of product data.
Malcolm Chisholm is an independent consultant focusing on metadata engineering and data management. He is the author of How to Build a Business Rules Engine and Managing Reference Data in Enterprise Databases and frequently writes and speaks on these topics. Chisholm runs two Web sites http://www.bizrulesengine.com and http://www.refdataportal.com. You can contact him at MasterDataConsulting@gmail.com.