Why Data Modeling?

By Michael McKinlay and David Warren
Published March 1, 1998, 1:00am EST

DM Review is pleased to present a series of articles on the importance of data modeling. In the next few months, our readers will learn about the fundamental aspects of data modeling, how it relates to diagrams and tools, and what its connection is to such things as the Internet, data warehouses, data mining, objects, client/server systems and business rules. In the kick-off article in this series, Michael McKinlay and David Warren use humor and a touch of irreverence to make the point that data modeling is extremely basic and not something that followed from the invention of the microchip.

Consider the following statement of methodology: First, identify by conceptual analysis the simple elements to which all more complex objects may be reduced. Second, synthesize an understanding of the whole by perceiving the necessary relationships in which these elements must stand to one another.

The above method is from:

  1. René Descartes (1596-1650)
  2. Clive Finkelstein (the "father" of Information Engineering)
  3. The PowerDesigner DataArchitect User's Manual
  4. Peter Chen (who first described the Entity-Relationship Model in 1976)

The Answer Is . . .

René Descartes, the father of analytic geometry and considered by some to be the founder of modern philosophy, described that methodology in 1637 in his Discourse on the Method of Rightly Conducting the Reason and Seeking for Truth in the Sciences. Whew, what a title! If Descartes were alive today, he would simply call it "The Method," copyright it and spend his time promoting a world tour.

Surprised? Discussions about the origins of data modeling usually begin with Peter Chen's 1976 paper, "The Entity-Relationship Model: Toward a Unified View of Data," which first appeared in the ACM Transactions on Database Systems. But the ideas that make up data modeling precede our current computer technology. They even precede Descartes by at least two millennia.

A Story

Once upon a time, a group of men came to Siddhartha the Buddha (c. 563-c. 483 B.C.) and said, "We have been having a big argument. We want you to decide which one of us is right and which is wrong."

One man said, "I know my answers are right." But the others said, "Tell us, Buddha. What are the true answers?"

You might be wondering if some bizarre computer error has printed text intended for some religious journal into this article. But, no, this story is indeed part of our discussion. It is true that these men did not have computers; however, they were trying to construct an information system. And, really, weren't they behaving like many project teams you and the rest of us have been a part of? And while the Buddha never thought to arrogate to himself the title of CIO, he was being appealed to much like any other boss would.

To Continue

Buddha replied, "I will answer by telling you a story. Once a king said to his servant, 'Find six men who were born blind and bring them to me. Then bring me an elephant.' The servant did this. The king said to the blind men, 'Here stands before you what we call an elephant. Each one of you may touch the elephant; when you have done so, I want you to tell me exactly what an elephant is like.'

"The first blind man touched the side of the elephant. He said, 'Your majesty, the elephant is like a wall!'

"The second blind man felt the elephant's leg. He exclaimed, 'No, an elephant is really like the trunk of a tree!'

"The third blind man felt the elephant's trunk. He said, 'You are both wrong. An elephant is like a snake!'

"The fourth man felt the elephant's ear. He said, 'It is plain that an elephant is like a fan!'

"The fifth blind man felt the elephant's tusk. He said, 'You are all such fools! An elephant is really like a spear!'

"The sixth blind man felt the elephant's tail. He said, 'I have the real, absolute, final truth about the elephant. It is exactly like a rope!'

"The men argued loudly about who was right. Finally the king shouted, 'I command you to be quiet!'"

(How many projects have ended with nothing to show, or have fallen so far behind each revision of the schedule that they never went into production, because, although each partial view of the project may have been "elegant," the pieces did not fit together and the team could not agree on an "enterprise" view? And how many such projects have resulted in management--knowing nothing else to do--just firing everyone?)

Here the Buddha ended the story. Then he turned to the men who had come to see him and said, "We are all like blind men in this world. Let us not quarrel over what we cannot be sure of."


Figure 1: An Elephant?

So What?

For those of us building "information systems" today, Buddha's parable has a particular relevance. Without a model of what we are building, we are like these blind men: we may be partly right, but we are probably mostly wrong.

Recently, while this story was being told to a group of third graders, each child took on a role as one of the blind men and drew one view of the elephant. Figure 1 shows the result of their work.

This model of an elephant does not look much like an elephant. Sometimes, the "information systems" that get built don't work much like information systems. When we have a poor model (or no model at all) of the information system we have to build, we do not have an objective view of the system. Instead, we have many subjective views which may be partly right but are certainly mostly wrong.

This is why, in building software systems, it all comes down to a data model--a model that reconciles the many partial views of the system. We do not have the luxury of agreeing to disagree.

Why Buddha?

Those of us who develop information systems like to believe that because we have such advanced technology at our disposal, we are able to think so much more profoundly than the so-called great minds of history. If Euclid could have bought a cheap calculator, then he would have been remembered for, say, calculus instead of geometry. If Newton could have worked with a super collider rather than an apple, we'd still be living in the world of Newtonian physics. If Babbage could have ordered microchips, then he wouldn't have merely designed a mechanical computer but maybe actually created some snappy databases.

Our technology has brought us through several paradigm shifts, we tell ourselves, and as a result we have to rely on new thinking if we're going to achieve so much more than those technologically deprived people who have come before us--right?

The purpose of this article is to say as loudly as possible, "WRONG!" Our technology, which is the product of our brains, can only be used effectively if we think clearly. And when we're talking about information systems technology, we're talking about thinking clearly about information--that is, the data with which we and/or our clients work.

Don't let that word "data" fool you into thinking we mean something peculiar to computers. Buddha's story demonstrates that processing data in order to understand the enterprise of existence has been a fundamental concern of human beings.

We could have chosen a more likely figure to point to as an example of pre-computer-age data modeling and the implications of that fact. You can always count on there being someone in the reading audience who says, "Well, of course, Descartes." "But Buddha?" you are screaming.

Buddha is probably a startling choice for an article about data modeling. We like starting with Buddha because anyone who advocates not arguing like six, or even twelve, blind men is giving good project management advice.

Buddha is also stating a kind of first principle of data modeling. Data modeling is not about that on which we cannot agree; rather, it is about what is. Data modeling provides the rules (embodied in tools) by which we can document what our enterprises have done, are doing and need to do. We are not qualified to assert that Buddha would find developing information systems the wisest activity by which to collect a paycheck. But Buddha's story presents a key idea embodied in data modeling: you cannot understand (and thus represent) something unless you comprehend it completely--enterprise-wide.

The purpose of this article is to suggest to information systems professionals that data modeling is so fundamental that to develop systems without first understanding the structure of the information is as foolish as building a house without blueprints and as dangerous as crossing a busy Interstate with closed eyes.

Take Another Quiz

Consider the following rules:

1. Data must be certain and auditable.
2. Data must represent that which is real as contrasted with that which is an appearance only.

The above rules should be most properly labeled:

  1. Plato (427-347 B.C.) on knowledge
  2. Peter Coad on Object-Oriented Programming
  3. DataAtlas Mobile Edition Developer's Guide on design issues
  4. What is meant by "Life Cycle of the Entity"

Hear Another Story

Some one hundred years after Buddha lived Plato. Maybe more than Buddha, Plato was convinced that knowledge is attainable. However, Plato knew that the attainment of knowledge--of being able to think about the data clearly (which is the essence of data modeling)--is not easy.

In one of his most famous stories, the myth of the cave, Plato described a world of human beings chained to the wall of a dark cave with only the ability to look ahead at shadows of objects, animals and people. (You don't need a Ph.D. in literature here to see that Plato uses a metaphor to express his lack of faith in the ability of human beings to perceive and think clearly.) One day, one person breaks free and escapes from the cave into the light of day. With the aid of the sun, that individual sees the real world for the first time and returns to the cave with the message that the only things they have seen before are shadows and appearances and that the real world awaits them if they are willing to struggle free of their bonds.

Plato provides a terrific metaphor for developing data modeling competency. Learning to think about data clearly will involve some struggle with our bonds of bad habits and ignorance. But getting to that plain-as-day understanding of the structure of information will be worth the struggle. For example, many data professionals have found it literally startling that "purchase order," "acknowledgment," "receipt," "invoice," and "payment" are not separate entities but rather one--"negotiation"--which goes through stages (a life cycle). (And that's one good answer to the quiz, but we were looking for Plato.) Life cycle of an entity is, of course, only one of the things data modeling teaches us.

If, instead, we accept our first pass that these things "appear" to be different entities and model them as such, then the implementation of the information system will be more complex, vastly inefficient and subject to maintenance and enhancement nightmares. Too often, we model this way because each state of what is really a single entity is "owned" by a different corporate department that believes its view of the truth.

If we get beyond appearance and understand the life cycle of this entity and the life cycles of other important entities of our enterprise, we simplify our model. When we simplify our model, we implement a simpler and more efficient information system, easier to maintain and enhance, which makes it more powerful.
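The life-cycle idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a prescribed design: the names `Stage`, `Negotiation`, `negotiation_id` and `advance` are hypothetical, and the strictly linear order of stages is a simplification the article does not itself specify. The point is that the five "documents" become five states of one entity rather than five separate entities.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    """Stages named in the article; the strict ordering is an assumption."""
    PURCHASE_ORDER = 1
    ACKNOWLEDGMENT = 2
    RECEIPT = 3
    INVOICE = 4
    PAYMENT = 5


@dataclass
class Negotiation:
    """One entity that moves through stages (hypothetical model)."""
    negotiation_id: str
    stage: Stage = Stage.PURCHASE_ORDER
    history: list = field(default_factory=list)

    def __post_init__(self):
        # Record the starting stage so the history is complete.
        self.history.append(self.stage)

    def advance(self):
        """Move to the next stage; refuse to go past PAYMENT."""
        if self.stage is Stage.PAYMENT:
            raise ValueError("negotiation is already complete")
        self.stage = Stage(self.stage.value + 1)
        self.history.append(self.stage)
        return self.stage


# Usage: one record, followed from purchase order to receipt.
deal = Negotiation("N-1001")
deal.advance()
deal.advance()
print(deal.stage.name)  # RECEIPT
```

With a single entity, each department reads the same record at a different stage instead of maintaining its own copy, which is where the simplification described above comes from.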

It is literally worth a lot of money to our enterprises for systems developers to get above the shadows of the cave into the daylight of truth.

Conclusion

Both Clive Finkelstein and Peter Chen would agree with Descartes' fundamentals, which are the same principles that have shaped tools such as, for example, SILVERRUN. Peter Coad would have no argument with Plato; nor should anyone who wants to take full advantage of SQL. The reason for this unanimity is simple: data modeling is about how we think--how our brains work. Let's face it, how the human brain works hasn't changed since Buddha, and it's not going to change any time soon. And we cannot build an information system that doesn't "think" the way we do.

However, computers aren't anywhere near as smart as real human brains. Bob Schmidt has aptly noted this fact in his phrase "the stupid-as-a-stick computer." Data modeling takes how we think and makes that process very rigid so those dumb boxes can get it.

Descartes' call for the use of analysis to reduce complex objects to their simple elements becomes an even greater imperative when we're talking about computers. The processor clock speeds are getting faster, but that doesn't change the need Descartes articulated to synthesize an understanding of the whole by understanding the necessary relationships among the elements.

It is because we have to see for the computers that we must take Buddha's parable to heart and achieve the enterprise view. As human beings we can understand Plato's metaphor of struggle, while the computer knows nothing of making an effort.

These are the reasons for data modeling.
