
What Exactly Is A Data Model? Part 3

Published
  • April 01 2003, 1:00am EST

The two previous articles in this series described data models used to represent the business owners' views and the architect's view. This final article is concerned with the logical models used by both relational and object-oriented designers.

ROW FOUR: DATABASE OBJECTS SEEN BY RELATIONAL DESIGNERS (DATABASE DESIGN)

Designers are not looking at things of significance to the business. They want the boxes of a data model to represent something meaningful to a computer application. In the case of relational database designers, their boxes represent database tables; and where the architect had attributes, they put columns. Relationships are implemented with foreign keys, and unique identifiers become primary keys. (Object-oriented designers, on the other hand, are concerned with classes and associations, as will be described.)
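The mapping just described can be sketched in a few lines. This is only an illustration, assuming Python's built-in sqlite3 module as the relational engine; the person and purchase_order tables and their columns are hypothetical names, not taken from any model in this series.

```python
import sqlite3

# Hypothetical two-table design; all names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,   -- unique identifier becomes the primary key
    name      TEXT NOT NULL          -- attribute becomes a column
);
CREATE TABLE purchase_order (
    order_id   INTEGER PRIMARY KEY,
    order_date TEXT NOT NULL,
    -- the relationship "order is placed by person" becomes a foreign key
    person_id  INTEGER NOT NULL REFERENCES person(person_id)
);
""")
conn.execute("INSERT INTO person (person_id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO purchase_order VALUES (10, '2003-04-01', 1)")

# A row pointing at a person that does not exist is rejected by the foreign key.
try:
    conn.execute("INSERT INTO purchase_order VALUES (11, '2003-04-02', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Note that the "person" here is a set of rows, which is exactly the distinction the next paragraph draws.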

With these analogues, the same graphics that portray a conceptual data model can be used to portray the logical data model that is a relational database design. Unfortunately, not everyone realizes that the meaning of boxes in the two kinds of diagrams is different. A person in the real world is very different from a "Person" table.

While Dr. E.F. Codd originally set the standards for relational database design with his rules for normalization, these rules actually were intended to provide guidance to the conceptual model. In fact, designers, even in a relational environment, often denormalize database designs by replicating columns in various places. They do this to improve the performance of specific functions; although as it happens, by improving the performance of some functions, the designer at the same time decreases the performance of other functions. For this reason, it is important to recognize that the designer's job is fundamentally one of making trade-offs. These trade-offs are what make the relational logical model different from the conceptual entity/relationship data model.
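A small sketch may make the trade-off concrete. It assumes Python's sqlite3 module and a hypothetical customer/invoice design in which the customer name is replicated onto each invoice; the function rename_customer is likewise invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE invoice (
    invoice_id    INTEGER PRIMARY KEY,
    customer_id   INTEGER NOT NULL REFERENCES customer(customer_id),
    customer_name TEXT NOT NULL   -- replicated column (the denormalization)
);
INSERT INTO customer VALUES (1, 'Acme');
INSERT INTO invoice VALUES (100, 1, 'Acme'), (101, 1, 'Acme');
""")

def rename_customer(conn, customer_id, new_name):
    """The write path now has to touch every copy of the name."""
    with conn:  # one transaction, so the copies cannot drift apart
        conn.execute("UPDATE customer SET name = ? WHERE customer_id = ?",
                     (new_name, customer_id))
        conn.execute("UPDATE invoice SET customer_name = ? WHERE customer_id = ?",
                     (new_name, customer_id))

rename_customer(conn, 1, "Acme Ltd")

# The read path gets simpler: no join is needed to list invoices with names.
names = [row[0] for row in conn.execute(
    "SELECT customer_name FROM invoice ORDER BY invoice_id")]
```

Listing invoices becomes cheaper, while renaming a customer becomes more expensive and easier to get wrong.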

A relational database management system will constrain the design to have only binary, one-to-many relationships. Moreover, by definition, the design model must include all columns on all tables. A graphic model cannot show the program code that is written to implement constraints, although the optionality of attributes and foreign keys can be shown. (Actually, there are other modeling notations available to describe the program code itself. These include action diagrams and decision tables, for example [1].)
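One familiar consequence of the one-to-many constraint, not spelled out above, is that a many-to-many relationship has to be split into two one-to-many relationships through an intersecting table. The sketch below assumes Python's sqlite3 module; the student, course, and enrollment tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);

-- Each foreign key below is a binary, one-to-many relationship.
-- NOT NULL is how the optionality of a column or foreign key is expressed.
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id  INTEGER NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);

INSERT INTO student VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO course  VALUES (10, 'Modeling'), (20, 'SQL');
INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
""")

# Navigating the many-to-many relationship means joining through the middle table.
courses_for_ann = [row[0] for row in conn.execute("""
    SELECT c.title
    FROM course c
    JOIN enrollment e ON e.course_id = c.course_id
    WHERE e.student_id = 1
    ORDER BY c.title
""")]
```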

Because these models will only be seen by designers and programmers, aesthetics is not as important as it is when the models are presented to the public. Even so, it is important for models to be as readable as possible by those who will use them as the basis for a system design.

The model for a relational design should include:

Tables

  • Name

Columns

  • Name
  • Data type (format)
  • Optionality

Primary keys

  • Component columns

Foreign keys

  • Component columns
  • Column that a key column refers to
  • (Cardinality is implicit in foreign key structure.)
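
The elements listed above are the same ones a database catalog records, so they can be read back from a finished design. The following sketch assumes Python's sqlite3 module and SQLite's table_info and foreign_key_list pragmas; the department and employee tables and the describe_table helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    phone   TEXT,
    dept_id INTEGER NOT NULL REFERENCES department(dept_id)
);
""")

def describe_table(conn, table):
    """Return columns, primary key, and foreign keys for one table.

    `table` is interpolated into the PRAGMA text, so pass trusted names only.
    """
    columns = []
    primary_key = []
    for _cid, name, ctype, notnull, _default, pk in conn.execute(
            f"PRAGMA table_info({table})"):
        columns.append({"name": name, "type": ctype,
                        "optional": not notnull and not pk})
        if pk:
            primary_key.append(name)
    foreign_keys = [
        {"column": src, "ref_table": ref_table, "ref_column": dst}
        for _id, _seq, ref_table, src, dst, *_rest in conn.execute(
            f"PRAGMA foreign_key_list({table})")
    ]
    return {"table": table, "columns": columns,
            "primary_key": primary_key, "foreign_keys": foreign_keys}

model = describe_table(conn, "employee")
```

Cardinality does not appear as a separate item in the result; as the list notes, it is implicit in the foreign key.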

As with the other modeling techniques, the CASE tool supporting a relational design model must be able, behind the scenes, to capture:

  • Table and column definitions
  • Column derivation logic, where applicable
  • Business constraints: at this point, the code for implementing these constraints is being designed.

IDEF1X, widely used by the U.S. Federal Government, is firmly entrenched as the notation of choice for modeling relational database designs. Also, Oracle has included a database diagrammer in its Designer CASE tool. This tool uses a simpler syntax than IDEF1X.

ROW FOUR: CLASSES SEEN BY OBJECT-ORIENTED DESIGNERS (OBJECT MODEL)

A different kind of designer is the one creating object-oriented applications. The object-oriented designer sees the world in terms of classes, where a class is a category of computerized objects. Classes are related via associations that are analogous to, but very different from, the relationships described for the conceptual model. A class in an object-oriented program is a piece of code that describes the data to be operated on by one or more methods (program processes). An object is a specific occurrence of data that describes an instance of a class.
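
As a minimal sketch of the class/object distinction, assuming Python and a hypothetical Account class:

```python
class Account:
    """A class: code describing data and the methods that operate on it."""

    def __init__(self, owner, balance=0):
        self.owner = owner        # data held by each object
        self.balance = balance

    def deposit(self, amount):    # a method (program process)
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount

# Two objects: separate occurrences of data, both instances of one class.
ann = Account("Ann")
bob = Account("Bob", 50)
ann.deposit(25)
```

Depositing into one object leaves the other untouched, although both share the same code.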

The object-oriented designer brings into play some concepts that are not used by either the conceptual modeler or the relational modeler. First, an association is not an assertion of structure so much as an assertion that one can navigate from one class to another. Program code is required to navigate in each direction. For this reason, it is useful to be able to show when navigation in only one direction is necessary. UML allows the addition of an arrowhead to the relationship line to show this.
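
A one-way association might look like the following sketch, which assumes Python dataclasses and hypothetical Product and LineItem classes. The helper line_items_for is also invented; it stands for the extra code that navigation in the other direction would require.

```python
from dataclasses import dataclass

@dataclass
class Product:
    name: str            # holds no reference back to its line items

@dataclass
class LineItem:
    product: Product     # navigable direction: LineItem to Product
    quantity: int

def line_items_for(product, all_items):
    """Reverse navigation is not built in; it costs a search over all items."""
    return [item for item in all_items if item.product is product]

widget = Product("widget")
bolt = Product("bolt")
items = [LineItem(widget, 2), LineItem(bolt, 10), LineItem(widget, 1)]

forward = items[0].product.name                      # one attribute access
backward = [i.quantity for i in line_items_for(widget, items)]
```

In UML terms, the association line would carry a single arrowhead pointing at Product.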

Also, both methods and attributes are only visible (can be seen or acted upon) to a certain degree. Each may be visible to all or may be visible only to objects that are in a class that triggers the method or acts upon the attribute. Visibility of an attribute may be explicitly shown on a UML drawing.
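
Visibility can be illustrated with a short sketch. It assumes Python, where visibility is enforced only by naming convention and name mangling rather than by the compiler, and a hypothetical Card class. The "+" and "-" in the comments are the UML markers for public and private.

```python
class Card:
    def __init__(self, holder, pin):
        self.holder = holder    # UML "+": visible to all
        self.__pin = pin        # UML "-": private; Python mangles the name

    def check_pin(self, attempt):   # public operation acting on private data
        return attempt == self.__pin

card = Card("Ann", "1234")

# Code outside the class cannot reach the attribute under its declared name.
try:
    card.__pin
    hidden = False
except AttributeError:
    hidden = True
```

Languages such as Java or C++ enforce the same distinction at compile time; here the attribute is still reachable under its mangled name, so the protection is a convention rather than a guarantee.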

While object-oriented designers prefer to use surrogate keys to identify objects in classes, there are circumstances where, from the point of view of an associated class, it is useful to be able to identify occurrences in another class in terms of something else it contains. For example, in the relationship between an ORDER and a LINE ITEM, it may be appropriate to specify that for each ORDER there can be only one LINE ITEM for each PRODUCT. An addition to the UML notation allows this qualified association to be specified.
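
In code, a qualified association is often realized as a lookup keyed by the qualifier. The sketch below assumes Python and hypothetical Order and LineItem classes, with a product code standing in for PRODUCT.

```python
class LineItem:
    def __init__(self, product_code, quantity):
        self.product_code = product_code
        self.quantity = quantity

class Order:
    def __init__(self, number):
        self.number = number
        self._items = {}   # qualifier (product code) mapped to one LineItem

    def add_item(self, product_code, quantity):
        # The qualifier guarantees at most one line item per product.
        if product_code in self._items:
            raise ValueError(f"order already has a line for {product_code}")
        self._items[product_code] = LineItem(product_code, quantity)

    def item_for(self, product_code):
        return self._items[product_code]

order = Order(1001)
order.add_item("P-1", 3)
order.add_item("P-2", 1)

try:
    order.add_item("P-1", 5)      # a second line for the same product
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```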

Because object-oriented classes are in principle derived from a conceptual model, just as relational tables are, it is reasonable to expect that an object-oriented program can make its classes persistent by storing their data in relational tables. The problem is that, like the relational designers, object-oriented designers often depart from the conceptual model. Object models and relational models may, in fact, depart from the conceptual model in completely inconsistent ways. They don't always talk to each other happily.
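
A hand-written mapping shows what making a class persistent involves. This is a sketch only, assuming Python's sqlite3 module; the Order class, the orders and order_line tables, and the save and load functions are all hypothetical.

```python
import sqlite3
from dataclasses import dataclass, field

@dataclass
class Order:
    order_id: int
    customer: str
    lines: list = field(default_factory=list)   # (product, quantity) pairs

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT NOT NULL);
CREATE TABLE order_line (
    order_id INTEGER NOT NULL REFERENCES orders(order_id),
    product  TEXT NOT NULL,
    quantity INTEGER NOT NULL
);
""")

def save(conn, order):
    """One object is spread over rows in two tables."""
    with conn:
        conn.execute("INSERT INTO orders VALUES (?, ?)",
                     (order.order_id, order.customer))
        conn.executemany(
            "INSERT INTO order_line (order_id, product, quantity) VALUES (?, ?, ?)",
            [(order.order_id, product, qty) for product, qty in order.lines])

def load(conn, order_id):
    """The object is reassembled from those rows."""
    row = conn.execute(
        "SELECT order_id, customer FROM orders WHERE order_id = ?",
        (order_id,)).fetchone()
    lines = [tuple(r) for r in conn.execute(
        "SELECT product, quantity FROM order_line WHERE order_id = ? ORDER BY rowid",
        (order_id,))]
    return Order(row[0], row[1], lines)

original = Order(1, "Acme", [("widget", 2), ("bolt", 10)])
save(conn, original)
restored = load(conn, 1)
```

The round trip works here because the class and the tables were designed together. Once the two designs depart from the conceptual model in different ways, save and load are where the mismatch has to be absorbed.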

The object-oriented design model, then, should show:

Classes, with names

Attributes

  • Name
  • Format/data type
  • Visibility
  • Cardinality

Operations

  • Name of the program carrying out the operation
  • Visibility

Associations (OO-speak for "relationships")

  • Association name
  • Optional role names
  • Cardinality
  • Optionality
  • Navigation Direction

Behind the scenes, any CASE tool portraying an object-oriented design model must capture:

  • Definitions of all classes and attributes.
  • All business constraints, as implemented in program code.
  • The program code implementing methods.

As with the relational design model described previously, aesthetics is less important here than it is for models presented to the public. Programmers are also consumers of the model, however, and in the interest of their working effectively, the drawings should be laid out as clearly as possible.

The notation that was invented to support object-oriented design is the UML. In addition to what is required for conceptual data modeling, as we have seen, it has a number of features specifically to support object-oriented design.

In short, a data model can:

  • Represent the language of an organization.
  • Represent the fundamental structure of an organization.
  • Represent data structures as manipulated by a particular technology.

These are very different uses for the same collection of boxes and lines, which we fail to see at our peril. Figure 1 shows the different views of data modeling and the characteristics of each.

Figure 1: Comparing the Different Views

References:
1. For a description of these, see Hay, David C. Requirements Analysis: From Business Views to Architecture. Upper Saddle River: Prentice Hall, 2003. pp. 187-192.
2. Zachman, John. "A Framework for Information Architecture." IBM Systems Journal, Vol. 26, No. 3. (IBM Publication G321-5298). See also http://www.essentialstrategies.com/publications/methodology/zachman.htm.
3. Tsichritzis, D.a.D. and A.C. Klug. "The ANSI/X3/SPARC DBMS Framework Report of the Study Group on Database Management Systems." Information Systems. 3(3). 1978. pp. 176-191.
4. Zachman and Hay mean basically the same thing but use different terms to identify some of the rows. For a discussion of these differences, see Hay, David C. Requirements Analysis: From Business Views to Architecture. Upper Saddle River: Prentice Hall, 2003. pp. 5-6.
