The common master data entities - product, customer, account, financial instrument - are familiar to most data professionals. We call them entities, but are they really?
Technically, the term "entity" is reserved for an instance of a type. An individual person is an entity. However, in a data model, what we call entities are more properly termed "entity types."
Unfortunately, the terminology has become confused, and "entity" is now routinely used to mean "entity type." Data modelers rarely think about instances. After all, a data model has no way to express them, so they tend to be forgotten. So before we begin, let us agree that when we say "entity," we mean entity type, and we will use "entity instance" for what was formerly called an entity.
This is important because the question is whether a master data entity is really an entity or something else.
What is an Entity?
Individual things in the universe tend to fall into types, rather than each one differing without limit from the next in the attributes it possesses. Why this is true is a mystery that has exercised the minds of philosophers. Whatever the origin of the phenomenon, our intellects are so attuned to it that we can abstract the attributes from one individual thing and another individual thing - such as two apples - and realize that they have the same nature - "apple." This nature exists only in the conceptual order - in our minds. In data modeling, we call these natures "entities." Each is a set of attributes that defines a nature, and individual instances of that nature exist in the real world.
Science loves these natures because they provide a foundation of uniformity. What is found to be true of one instance will likely be true of the next if both share the same nature. Laws and rules are uncovered that tell us how things that share natures behave. Even the natures can be grouped together into hierarchies and more general rules and laws discovered for them.
What is a Collection?
Now that we have described what an entity is, what is a collection? A good example would be the 10 things you or I would choose to save if our houses caught fire. For me - other than family members and the dog - it would include photographs, my laptop, my wife's jewelry, a bottle of expensive wine I will probably never open and so on. These things are not instances of the same nature - of the same entity. They have very few attributes, if any, in common, and the ones they have in common are only accidentally so. Yet I hold these items as a whole concept in my mind. They are an assemblage, or a collection, not instances of an entity.
Master Data Entity Subtypes
If we consider the master data entity Financial Instrument, it is immediately apparent that there are several different kinds of Financial Instrument. There are equities, bonds, options, exchange traded funds, indexes, credit default swaps, currency, precious metals and so on. Inevitably, there will be a code table called something like Financial Instrument Type that has one record for every type of Financial Instrument. In the Financial Instrument table, there will be a column for Financial Instrument Type Code which is a foreign key from Financial Instrument Type.
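The arrangement just described can be sketched with an in-memory SQLite database. This is only an illustration: the table names, column names and type codes below are assumptions, since the pattern is described here only in general terms.

```python
import sqlite3

# Hypothetical schema: all table, column and code names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE financial_instrument_type (
    type_code TEXT PRIMARY KEY,
    type_name TEXT NOT NULL
);
CREATE TABLE financial_instrument (
    instrument_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    type_code     TEXT NOT NULL
        REFERENCES financial_instrument_type (type_code)
);
""")

# One record in the code table for every kind of instrument.
conn.executemany(
    "INSERT INTO financial_instrument_type VALUES (?, ?)",
    [("EQ", "Equity"), ("BD", "Bond"), ("PM", "Precious Metal")],
)

# Each instrument row points at its type through the foreign key column.
conn.executemany(
    "INSERT INTO financial_instrument (name, type_code) VALUES (?, ?)",
    [("IBM common stock", "EQ"), ("Gold, one ounce", "PM")],
)

# A row whose type code is not in the code table is rejected.
try:
    conn.execute(
        "INSERT INTO financial_instrument (name, type_code) VALUES (?, ?)",
        ("Hamburger", "FOOD"),
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

rows = conn.execute("""
    SELECT fi.name, fit.type_name
    FROM financial_instrument AS fi
    JOIN financial_instrument_type AS fit ON fit.type_code = fi.type_code
    ORDER BY fi.instrument_id
""").fetchall()
```

Note that the code table only enumerates the kinds of instrument. Nothing in the schema says what makes something a financial instrument in the first place.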
But what is the definition of Financial Instrument Type? Many would assume it to be something like "type of financial instrument." That is not a definition. When we ask what purpose this table serves, we are told that it identifies all the subtypes of Financial Instrument. Really? If that were the case, then we could have a proper definition of Financial Instrument Type, or indeed the supertype Financial Instrument. Remember that enumeration of a set of examples is not a definition.
We are confronted with the fact that Financial Instrument is a collection. It is not an entity. What attributes could an ounce of gold and a share of IBM common stock have in common? If it is asserted that they both have a value, then so does a hamburger, but that is not a financial instrument. We are back to the same situation as the things we would save in a fire. A financial instrument is, in fact, anything that the enterprise chooses to trade. There will no doubt be new kinds of financial instruments in the future, and instances of them will be added to the Financial Instrument table.
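One rough way to test whether a supposed entity is really a collection is to list the attributes of each subtype and look for the ones they all share. The sketch below does this with invented attribute lists. They are assumptions made for illustration, not a real model of any of these instruments.

```python
# Hypothetical attribute lists for three kinds of financial instrument.
attributes = {
    "equity": {"ticker", "issuer", "share_class", "exchange"},
    "bond": {"issuer", "coupon_rate", "maturity_date", "face_value"},
    "precious_metal": {"metal", "weight_troy_oz", "purity"},
}

# Attributes common to every subtype: the candidate "nature" of the supertype.
shared_by_all = set.intersection(*attributes.values())

# Two subtypes may still overlap, but only accidentally.
shared_by_equity_and_bond = attributes["equity"] & attributes["bond"]
```

With these lists, equities and bonds happen to share an issuer, but nothing is common to all three subtypes. An empty intersection is what we would expect of a collection.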
Where Does This Leave Us?
If master data entities can be collections rather than entities, why is this so? The answer, I think, is that we group things together because we have to manage them as a whole, or we create attributes (like value or artificial groupings) for these things. Yet they are still collections because they are composed of what we call "subtypes" but which are closer to entities (entity types, actually) in their own right. If this is so, then it is vitally important to figure out what the subtypes are in master data "entities." We must figure out what attributes belong to each subtype and how these subtypes behave.
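In code, one way to respect this distinction is to model each subtype as an entity type in its own right and to treat the master data "entity" as nothing more than a union of them. The following is a minimal sketch under that assumption. The class names, attributes and the describe function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Union


# Each subtype is its own entity type, with its own attributes.
@dataclass(frozen=True)
class Equity:
    ticker: str
    issuer: str


@dataclass(frozen=True)
class Bond:
    issuer: str
    coupon_rate: float
    maturity_year: int


@dataclass(frozen=True)
class PreciousMetal:
    metal: str
    weight_troy_oz: float


# The "master data entity" is only a grouping: no shared base class,
# no shared attributes, just membership in the collection.
FinancialInstrument = Union[Equity, Bond, PreciousMetal]


def describe(item: FinancialInstrument) -> str:
    """Behavior is defined per subtype, because that is where the attributes live."""
    if isinstance(item, Equity):
        return f"equity {item.ticker} issued by {item.issuer}"
    if isinstance(item, Bond):
        return f"bond from {item.issuer} maturing in {item.maturity_year}"
    if isinstance(item, PreciousMetal):
        return f"{item.weight_troy_oz} troy oz of {item.metal}"
    raise TypeError(f"not a member of the collection: {item!r}")


holdings = [Equity("IBM", "IBM"), PreciousMetal("gold", 1.0)]
descriptions = [describe(h) for h in holdings]
```

Adding a new kind of instrument means adding a new type to the union and deciding how it behaves, which mirrors the work of figuring out each subtype described above.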
As yet, we do not have the theory and methodology to deal with anything other than entities in master data management. That should not stop us from recognizing that we are more often than not really dealing with collections.