Everyone remembers the classic Looney Tunes cartoons featuring Wile E. Coyote and Road Runner. No matter how clever that coyote was, the roadrunner always managed to get away.
Many people think that integrating data is integrating data: the data itself (the type or quality of the data) has little influence on the difficulty level of an integration project. That’s the same kind of thinking that would lead a coyote to think catching roadrunners is as easy as catching any other kind of bird. The truth is that product data can be some of the most challenging data to integrate. Product data has been the source of many sleepless nights for integration specialists around the world. Why do they keep doing it?
As a report from the Automotive Aftermarket Industry Association pointed out, assuming $100 billion in transactions between suppliers and direct customers in the aftermarket each year, the shared savings potential from eliminating product data errors in the supply chain tops $1.7 billion annually. That’s just potential savings in one industry, in one year.
Integrating this data carries tremendous value that more than offsets the difficulty. But the fact remains that cleanly integrating that data across multiple applications, data stores, countries and businesses can be as elusive a goal as catching that Road Runner.
What is Product Data and Why Should You Care?
There’s a lot of data out there in the corporate world (documents, images, log files, movies, etc.) but the majority of integration projects center on two main types of data: contact data and product data. Contact data is data about people. Employees, customers, prospects, suppliers, business partners and the guy that delivers sandwiches are all contact data. Actually, their names, addresses, phone numbers, email addresses and sandwich preferences are contact data, also called party data.
A lot of integration best practices, expertise, references, content libraries and resources of all types are available for tackling the tricky business of integrating contact data. Product data is the other half of the picture. If you want to track what items those people recently purchased in your online store, what parts your auto repair shop should order to keep their cars running or how much bread, lunchmeat and tomatoes to stock at the local grocery store to keep people in sandwiches, that’s product data.
Integrating product data is often thought of as a supply chain or manufacturing industry problem, but anyone who sells products, particularly in multiple locations or from multiple suppliers, is going to need to integrate their product data. You’ve got to know if the “7 1/4 28 stain” is a 7 1/4 inch, 28 tooth stainless steel circular saw blade or if it’s a 7 1/4 centimeter, 28 thread stainless steel screw. You’ve got to make sure that you keep track of how many suppliers have sent you saw blades or screws. If one describes their saw blades as “7 1/4 28 stain” and another describes them as “circ 28 rip 7 1/4” and you put in orders to both suppliers for 5,000, did you just unintentionally buy 10,000 saw blades of the same type? Whoops, it’s time for a sale on saw blades. As a data integrator, you could find yourself looking for a new job at lunchtime instead of eating a sandwich.
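To make the saw blade mix-up concrete, here is a minimal Python sketch of one way to flag possible duplicates before the orders go out. It assumes that the numbers in a description (sizes, tooth counts) are more trustworthy than the free text around them, so it compares only those. The function names and the third order line are made up for illustration, and a match is only a candidate for human review: the same numbers could just as easily describe a screw.

```python
import re
from collections import defaultdict
from fractions import Fraction

# Matches a mixed number ("7 1/4"), a bare fraction ("1/4") or an integer ("28").
# Denominators must start with 1-9 so a stray "1/0" is never treated as a fraction.
NUMBER = re.compile(r"(\d+)\s+(\d+)/([1-9]\d*)|(\d+)/([1-9]\d*)|(\d+)")


def numeric_signature(description):
    """Return the sorted numeric values found in a free text description."""
    values = []
    for whole, num, den, bare_num, bare_den, integer in NUMBER.findall(description):
        if whole:
            values.append(Fraction(int(whole)) + Fraction(int(num), int(den)))
        elif bare_num:
            values.append(Fraction(int(bare_num), int(bare_den)))
        else:
            values.append(Fraction(int(integer)))
    return sorted(values)


def possible_duplicates(order_lines):
    """Group (supplier, description) pairs that share the same numbers."""
    groups = defaultdict(list)
    for supplier, description in order_lines:
        key = tuple(numeric_signature(description))
        if key:  # descriptions without any numbers are never grouped
            groups[key].append((supplier, description))
    return [lines for lines in groups.values() if len(lines) > 1]


# Hypothetical order lines; the first two descriptions come from the text above.
orders = [
    ("Supplier A", "7 1/4 28 stain"),
    ("Supplier B", "circ 28 rip 7 1/4"),
    ("Supplier C", "10 60 crosscut"),
]
flagged = possible_duplicates(orders)
```

With the sample data, `flagged` holds a single group containing the Supplier A and Supplier B lines, while the Supplier C line is left alone.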
What Makes Product Data So Tricky?
- Description fields. Description fields are the essence and the bane of product data integration. Nearly every product data set seems to have a free text description field into which everyone freely puts text. The unrestricted, wildly varying entry formats, abbreviations and misspellings that get crammed into those fields are enough to make the wiliest data manager’s ears droop. Anyone who has ever had the dubious joy of trying to tame unstructured data knows that it can take a lot of creative pattern searches, multiple passes and even a natural language processor to rip the sense out of those text descriptions and put that sensible data into structured property fields.
- Standards that don’t get used. Standards are wonderful, a boon to integration everywhere. Product data has national standards like National Stock Number (NSN), international standards like Export Control Classification Number (ECCN) and United Nations Standard Products and Services Code (UNSPSC), as well as industry standards like eCl@ss (energy) and Product Information Exchange Standard (automotive). The trouble is that not nearly enough companies actually use them. International standard product codes in particular could be as handy as a ready-made roadrunner trap, if they were consistently used. Many manufacturers and suppliers don’t see the point in adhering to standards in the way they store product data. Somebody apparently thought ACME jet-propelled roller skates were a great idea, too.
- Lack of consistency. There is no consistent way to describe or store product data, and due to the wide variety of product data, it’s unlikely that there will ever be a consistent set of product data properties or formats. In contact data, all phone numbers in the U.S. have 10 digits, with the first three as the area code. Addresses generally have street, house number and apartment or suite number on one line or two lines, and city, state and ZIP on another line, often separated by commas, if not placed in separate fields. There is no corresponding consistency in product data. The properties of a bolt of cloth, for instance, might include thread count and color, which would be a very strange way to describe a computer chip, a diesel engine or an anvil for dropping on roadrunners.
- Language barriers. We live in a global economy. Parts are manufactured and components assembled all over the world. Suppose you’re a data manager in the airplane manufacturing industry, and the airplane engines your company uses are preassembled in a plant in Taiwan. What would happen if your American requisition system wasn’t properly integrated with their Chinese product descriptions? It could mean ordering the wrong model of a $50,000 engine and your company having to eat that cost. And how much revenue will be lost while the production line halts, waiting for the right engine to arrive? Ouch.
- Missing information. Much of the information that would make sorting, searching and understanding product data far more efficient is not included with the data. How about a NAICS code or DUNS number for the company that manufactured or supplies that product? How about a classification for that product, so you can tell at a glance how it’s different from, or similar to, other products in your inventory? It would be nice to know if this product has a warranty, and when it expires, too.
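Several of the problems above show up in even the simplest parsing attempt. The Python sketch below makes two passes over a free text description, moves what it recognizes into structured fields and reports what is still missing. The abbreviation and unit tables, the field names and the function names are all hypothetical; a real project would build them from its own data and would likely need many more passes.

```python
import re

# Hypothetical lookup tables, for illustration only.
ABBREVIATIONS = {
    "stain": "stainless steel",
    "circ": "circular",
    "rip": "rip cut",
}
UNITS = {"in": "inch", "inch": "inch", '"': "inch", "cm": "centimeter"}

# A mixed-number size such as "7 1/4".
SIZE = re.compile(r"\d+\s+\d+/\d+")


def parse_description(text):
    """Split a free text description into structured property fields."""
    record = {"size": None, "unit": None, "count": None, "terms": [], "unparsed": []}

    # Pass 1: pull out the size and remove it from the remaining text.
    match = SIZE.search(text)
    if match:
        record["size"] = match.group(0)
        text = text[:match.start()] + " " + text[match.end():]

    # Pass 2: classify each remaining token.
    for token in text.split():
        word = token.lower().strip(".,;")
        if word in UNITS and record["unit"] is None:
            record["unit"] = UNITS[word]
        elif word.isdigit() and record["count"] is None:
            record["count"] = int(word)
        elif word in ABBREVIATIONS:
            record["terms"].append(ABBREVIATIONS[word])
        else:
            record["unparsed"].append(token)  # left for a later pass or a person
    return record


def missing_fields(record):
    """List the structured fields the description did not supply."""
    return [name for name in ("size", "unit", "count") if record[name] is None]


blade = parse_description("7 1/4 28 stain")
gaps = missing_fields(blade)
```

For “7 1/4 28 stain”, the sketch recovers the size, the count and “stainless steel”, but `missing_fields` reports that the unit is absent, which is exactly the inch-or-centimeter ambiguity that separates a saw blade from a screw.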