Figure 1: Logical Course Structure
Briefly walking through this model, we learn that a Course such as Data Modeling 101 can be a Lecture or a Workshop. Data Modeling 101 can be taught at one or more Locations. A Learning Track, such as the Data Track, must consist of one or more Lectures, yet only one Workshop. A Workshop, such as the Advanced Design Workshop, can require certain Courses as prerequisites, such as Data Modeling 101 and Data Modeling 101 Workshop.
This is a logical data model (with data elements hidden to keep this example manageable). On the physical data model, we can replace this subtype symbol with one of three options:
- Rolling down: Remove the supertype entity and copy all of the data elements and relationships from the supertype to each of the subtypes.
- Rolling up: Remove the subtypes and copy all of the data elements and relationships from each subtype to the supertype. Also add a type code to distinguish the subtypes.
- Identity: Convert the subtype symbol into a series of one-to-one relationships, connecting the supertype to each of the subtypes.
The Challenge
Using this Course structure as example, what are the pros and cons of each of these three options?
The Response
Figure 2 shows the rolling down technique applied to the Course structure, Figure 3 the rolling up technique and Figure 4 the identity technique. Challengers identified six factors to consider in selecting a technique. Figure 5 indicates the pros and cons for each factor for each technique. A discussion for each factor follows.

Figure 2: Rolling Down

Figure 3: Rolling Up
Referential integrity (database enforcement of rules) was mentioned frequently in Challenger comments. The logical business rules are compromised to a degree in each technique, with identity most closely mimicking the business rules. However, the one-to-one relationships on the identity model capture that a Course can be a Lecture, a Workshop or both. Therefore, as long as the logical model contains an overlapping subtype, the translation from logical to physical preserves the rules. The rolling down technique can enforce only those rules present in the subtypes. Monica Oliver, consultant, says rolling down and identity minimize null columns. If Workshop Difficulty Code is a mandatory data element in Workshop, we can enforce that it be populated. Rolling up, on the other hand, retains the supertype-level business rules. This structure retains the business rules from Course to Location. However, as Doug Jones, IT manager, states, "The rolled up relationship from Learning Track loses much business rule information regarding the required mix of Lectures and Workshops."

Figure 4: Identity
Balance refers to the quantity of data elements and relationships that are present in each of the subtypes and the supertype. The rolling down technique is an effective choice when the subtypes are weighted down with their own unique data elements and relationships. That is, the subtypes are more different than they are similar. The rolling up technique on the opposite end is a good choice when the subtypes share most of their properties. The identity technique is effective regardless of the balance of elements and relationships.
Usability means the model is easy to understand. An analyst generally finds it easier to interpret or navigate a data model when the model contains concrete business terms. Both the rolling down and identity techniques preserve the subtypes Lecture and Workshop. The rolling up technique, on the other hand, only contains the generic term Course and, therefore, could impact the communication ability of the model.
Performance in retrieving data is a major concern on the physical design. Doug Jones commented that the rolling down structure reduces joins across tables. "Workshop and Lecture both stand on their own. It saves the work of building and maintaining a third table." However, if there is a need to create a Course concept, joining across these tables could be complex and error prone. Data modeler Michelle Viljoen says, "There is no single table (Course) where a thin slice of all common course data is stored." The rolling up technique would have a similar issue if most queries were performed on Lecture and Workshop. The identity approach can lead to a performance hit, as you have to join across tables to create a single Lecture or Workshop view.

Figure 5: Factors to Consider for Each Technique
Scalability refers to how easily the data model can accommodate changing requirements. Abstract supertype structures, such as Course, are fairly scalable in the roll up and identity approaches, as new course types can leverage the existing course structure. Georgia Prothero, data modeler, points out, "The same application code can be used to process both workshops and lectures." The rolling down technique isn't poised for future growth. Abel Mendez, business intelligence specialist, comments, "Similar attributes and/or relationships added later may be implemented differently, further distancing the subtypes from their original logical model."











Be the first to comment on this post using the section below.