A great way to sharpen our analysis and modeling skills is to continuously address real-world scenarios. A modeling scenario along with a solution appears each month in this Design Challenge column. The scenario is first emailed to more than 1,000 "Design Challengers," and then all responses, including my own, are consolidated and the best appear here. If you would like to become a Design Challenger and have the opportunity to submit responses, please add your email address at www.stevehoberman.com/designchallenge.htm. If you have a challenge you would like our group to tackle, please email me a description of the scenario at firstname.lastname@example.org.
A university needs a unique identifier for an individual. There is a Person Identifier assigned by the university to all individuals on campus, but Person ID by itself does not guarantee uniqueness. Five percent of the time, an individual has two distinct Person IDs. Although business rules exist that help enforce uniqueness, errors happen. For example, Bob might be assigned a Person Identifier for his role as a student and a different identifier for his role as an instructor. Social Security number is a possible choice as a unique ID, yet we know it has both legal and data quality issues (e.g., foreign students have a temporary Social Security number until a permanent one is assigned). What would you choose as the unique identifier for an individual? Would you find a way to fix the data quality problems inherent in Person Identifier, or are there other elements that can be used instead?
Let's assume that Bob having a different Person ID for each role, such as student and instructor, is the scenario that causes these uniqueness issues. The challenge then becomes distinguishing the person from his or her roles. The unique identifier for an individual can remain Person ID if the business process and data structures enforce and maintain uniqueness.
The process needs to ensure that new Person IDs are created uniquely and to continuously monitor all existing person records for any signs of redundancy. Monica Oliver summarizes this well: "The issue is not that there is no unique identifier; rather, it's that there are process/application integration failures that allow the same person to be identified more than once. The only way to fix that problem is to fix the process and/or application integration problems causing it."
Carol Lehn, senior database specialist, proposed criteria for a person alternate key and recommended that the university "tighten up the business rules and implement a manual exception process. Even George Foreman's sons (George Jr., George III, George IV, George V, and George VI) would be identified as unique individuals with these criteria, and if he had twins, they would be routed to exception processing to determine if one or two Person IDs were appropriate."
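To make the monitoring and exception idea concrete, here is a minimal Python sketch. It is not Carol's actual criteria, which are not spelled out here. The matching rule (normalized last name, first name and date of birth), the `PersonRecord` fields and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PersonRecord:
    person_id: str      # identifier assigned by the university
    first_name: str
    last_name: str
    birth_date: date


def match_key(rec: PersonRecord) -> tuple:
    """Hypothetical matching criteria: normalized name plus date of birth."""
    return (rec.last_name.strip().lower(),
            rec.first_name.strip().lower(),
            rec.birth_date)


def find_exceptions(records):
    """Return every match key that is shared by more than one Person ID."""
    groups = defaultdict(set)
    for rec in records:
        groups[match_key(rec)].add(rec.person_id)
    # A shared key needs a human decision: is this one person holding
    # duplicate IDs, or two different people who happen to match?
    return {key: sorted(ids) for key, ids in groups.items() if len(ids) > 1}


# Made-up sample data: Bob was registered twice, once per role.
records = [
    PersonRecord("P001", "Bob", "Jones", date(1980, 5, 17)),   # as student
    PersonRecord("P002", "bob", "Jones ", date(1980, 5, 17)),  # as instructor
    PersonRecord("P003", "Ann", "Lee", date(1975, 1, 2)),
]
exceptions = find_exceptions(records)
```

In this sketch the two Bob records land in the exception list for manual review, while Ann's record passes through untouched. A real process would run the same check both when a new Person ID is requested and periodically across all existing records.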
In terms of a person data structure, the integrated subject data model might look like Figure 1.
Figure 1: Integrated Subject Data Model
A person can play many roles; a role can be played by many people. A role can be a student, instructor or any other function of importance to the university. We can easily create a set of code values for each role (e.g., 01 or "S" equals student). We can continue to use Person ID as the person surrogate key if we also have a good alternate key based upon real business elements. One or more alternate keys will assist business users in accessing person records as well as assist the development and support teams in maintaining uniqueness. Here are some of the good suggestions for a person alternate key:
- A combination of last name and first name and one or more of the following: date of birth, place of birth, mother's maiden name, gender
- Hashing the Social Security number
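The structure and the two alternate key suggestions above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reproduction of Figure 1: the class and field names, the "I" role code, the choice of date of birth as the extra key element and the use of a keyed hash (HMAC-SHA-256) are all mine. A keyed hash is chosen because a nine-digit number has few enough possible values that a plain, unkeyed hash could be reversed by trying them all.

```python
import hashlib
import hmac
from dataclasses import dataclass
from datetime import date

# Role code values; "S" comes from the text, "I" is an assumed example.
ROLE_CODES = {"S": "Student", "I": "Instructor"}


@dataclass(frozen=True)
class Person:
    person_id: int          # surrogate key
    first_name: str
    last_name: str
    birth_date: date

    def alternate_key(self) -> tuple:
        """Business-element key: normalized name plus date of birth."""
        return (self.last_name.strip().casefold(),
                self.first_name.strip().casefold(),
                self.birth_date)


@dataclass(frozen=True)
class PersonRole:
    """Associative entity resolving the many-to-many person/role relationship."""
    person_id: int
    role_code: str


def hash_ssn(ssn: str, secret: bytes) -> str:
    """Return a keyed hash of the SSN so the raw number need not be stored."""
    digits = "".join(ch for ch in ssn if ch in "0123456789")
    if len(digits) != 9:
        raise ValueError("expected a nine-digit Social Security number")
    return hmac.new(secret, digits.encode("ascii"), hashlib.sha256).hexdigest()


# Usage with made-up data: one person, one Person ID, two roles.
bob = Person(1001, "Bob", "Jones", date(1980, 5, 17))
bob_roles = {PersonRole(bob.person_id, "S"), PersonRole(bob.person_id, "I")}

secret = b"demo-only-secret"  # placeholder; a real key would be managed securely
h1 = hash_ssn("123-45-6789", secret)
h2 = hash_ssn("123456789", secret)  # same number, different formatting
```

Because the SSN is normalized before hashing, both formats produce the same value, so the hash can serve as a lookup key. It does not remove the data quality issue raised in the challenge: a temporary Social Security number would hash to a different value than the permanent one that replaces it.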
As analysts and modelers, we continuously find ourselves asking the business, "Why?" For example, "Why do you need this report?" or "Why do you enter this information in three places?" Emma Fortnum, application architect, rightly asked why for this challenge as well. "What I would do is question if the university really needs a unique Person ID across all the roles. Do they really need to know that Bob the student is the same as Bob the instructor?" A holistic view of a person should only be created if there is significant business value. An organization-wide program such as a BI initiative or an enterprise resource planning implementation is usually the driver of the need for a holistic view.