Batman versus the Evil Definitions
Batman once said, “It’s not who I am underneath, but what I do that defines me.” Can we apply this same reasoning when defining key terms of our data models?
It can be a daunting task to define a simple term, such as “Customer,” but it could be a lot easier (and maybe just as effective) to instead define what these terms do. So can we define something in terms of what that something does, instead of what something “is”?
For example, someone recently shared with me that because his project team could not agree on a definition for Customer, he defined a Customer as any organization that opens a contract with his company. So any organization with a date in the Contract Open Date field is a customer. (I am simplifying this example, but you get the idea.)
Does defining the actions something performs solve our definition issues? Or are we instead adding complexities, for example, assigning more than one meaning to the same data element (e.g., Contract Open Date is used both for the actual contract date and to distinguish customers from other organization roles)? Have you ever defined a term by what that term does instead of what that term “is”? If yes, were you satisfied with the outcome?
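To make the Contract Open Date example concrete, here is a minimal Python sketch of the "does" definition. The class name, the field name and the sample organizations are my own assumptions for illustration; they do not come from the project described above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Organization:
    name: str
    # Hypothetical field standing in for Contract Open Date.
    contract_open_date: Optional[date] = None


def is_customer(org: Organization) -> bool:
    # The "does" definition: a customer is any organization that has
    # opened a contract, i.e. whose Contract Open Date is populated.
    return org.contract_open_date is not None


acme = Organization("Acme", date(2020, 1, 15))  # has opened a contract
globex = Organization("Globex")                 # has not
```

Notice that contract_open_date now answers two questions at once: when the contract opened and whether the organization is a customer. That is exactly the double meaning the questions above are asking about.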
The responses from the Design Challengers can be grouped into three categories: those that recommend defining something by its actions, those that recommend defining something by its actions as part of the solution, and those that do not recommend defining something by its actions. Below are two responses from each of these three categories, followed by a summary of what I learned from this challenge.
Defining Something by its Actions is an Effective Technique
Madhu Sumkarpalli, business intelligence consultant, says, “I think defining the term based on its action is a better idea. That way we can be specific about it or at least close to specific, rather than being generic and abstract. I think the ‘thing’ is what it does. Of course, one can define ‘bird’ based on just its general characteristics, which would put it in the animal kingdom. However, based on its action, we can arrive at a specific definition that would define it appropriately and paint the proper picture, even if others haven’t seen it.”
Vikas S. Rajput, database specialist, says, “Yes, there are certain times when we have to take an approach where not the ‘defined role’ or definition, but the action of the actor, defines the term in our data model. I will give you an example. At the front desk at a hospital when a patient is admitted, somebody needs to log in the patient details. That person could be anyone when you have kiosks all over the place (as happened with one of our clients). Here you don’t need to define the role, per se, you only need to capture the ‘first attendant’ details, which could be an employee ID.”
Defining Something by its Actions is Part of the Solution
Amarjeet Virdi, data architect, says, “Aren’t data entities meant to represent real life objects? And don’t real life objects perform functions? The questions then are: Is the entity in question defined entirely by the ‘function’ it performs? Without the function, does the entity cease to exist? In this case, the object is a customer, which is an organization or person. If the action of signing a contract makes them a customer, what happens when the contract ends? What does that entity mean to your business? Does it have no existence outside the function it performs? Will it pass into a new lifecycle stage? Does it change names? When the function is completed, will the entity cease to exist or have no value to the business anymore? So, going back to Batman, when the bat suit is off, does Batman cease to exist for Gotham City? Batman and Bruce Wayne are entirely different entities, but will we never need to see an integrated view? Don’t Bruce’s life and actions help us understand the structure and function of Batman?”
Raymond McGirt says, “I'm going with the good old standby: it depends.
Today, I went into a local hardware store. I consider myself a customer, but today, I did not buy anything. I returned something. There are different business rules to be fulfilled when I buy something, as opposed to when I return something. But, either way, am I not a customer? A different role is being performed when I return something, but I'm still the same person. Either way, complexity increases. Either a single Customer performs two or more unique activities or there is a different term for each type of activity a non-employee could perform in the store.”
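The first of Raymond's two options, a single Customer performing more than one activity, might be sketched like this in Python. The activity names and the wording of the business rules are assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Activity(Enum):
    PURCHASE = "purchase"
    RETURN = "return"


# Hypothetical business rules, one per activity; the wording is illustrative only.
RULES = {
    Activity.PURCHASE: "collect payment",
    Activity.RETURN: "check receipt and issue refund",
}


@dataclass
class Customer:
    name: str
    activities: List[Activity] = field(default_factory=list)

    def perform(self, activity: Activity) -> str:
        # Record the activity and return the rule that applies to it.
        # The person stays a Customer whichever activity is performed.
        self.activities.append(activity)
        return RULES[activity]


shopper = Customer("Raymond")
rule_applied = shopper.perform(Activity.RETURN)
```

The complexity Raymond mentions shows up in the RULES lookup: each new activity adds another rule to maintain, even though the Customer itself does not change.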
Defining Something by its Actions is not Recommended
Wade Baskin, senior database architect, says, “I've always taken the approach that mixing process with data is a dangerous practice. Data should have one, and only one, definition, regardless of the process. If the data changes as it matures or the process moves along, then the change is reflected as either a status code or a different data element; never change the current definition of an element based on process or location. An even more dangerous practice is to change the definition of an element based on the presence or absence of another data element. For the purist, this breaks the laws of normality, where the element is no longer relying on the key and nothing but the key for its presence/definition.”
David P Reynevich, chief architect, says, “The ‘is’ part is (relatively) permanent; the ‘do’ part, if tied in with ‘how we do it,’ changes regularly. Allowing fields with multiple meanings is dangerous and should be avoided in principle. Whatever time savings that may look good now will almost certainly be lost many times over as future designers and systems builders stumble over the hidden meanings in the data, to say nothing of the additional complexity that this will bring to straightforward data queries.”
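One way to read Wade's advice about status codes is sketched below in Python: the role is recorded in its own status element, so Contract Open Date keeps a single meaning. The status values, class name and function names are assumptions for illustration, not a design either architect proposed.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class PartyStatus(Enum):
    PROSPECT = "prospect"
    CUSTOMER = "customer"
    FORMER_CUSTOMER = "former customer"


@dataclass
class Party:
    name: str
    status: PartyStatus = PartyStatus.PROSPECT
    # Means only one thing: the date the contract was opened.
    contract_open_date: Optional[date] = None


def open_contract(party: Party, opened_on: date) -> None:
    party.contract_open_date = opened_on
    party.status = PartyStatus.CUSTOMER


def end_contract(party: Party) -> None:
    # The change is reflected in the status code; the date is left alone.
    party.status = PartyStatus.FORMER_CUSTOMER


initech = Party("Initech")
open_contract(initech, date(2020, 1, 15))
end_contract(initech)
```

After the contract ends, the party still exists with a clear status, and nobody has to guess what a populated or empty date is supposed to imply.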
Defining a term by what it does is effective, at least as a starting point, because most business professionals define things by the roles they play (e.g., a person playing the role of a customer). However, taking such an approach may eventually lead to data integration issues (such as whether a Customer and Prospect can ever be the same Person), hidden business logic (e.g., Contract Open Date has multiple definitions) and uncertainty about what happens to the thing when the activity it is performing ceases. Great thoughts!
Note: If you’re interested in chiming in on the discussion by becoming a Design Challenger, sign up at stevehoberman.com.