It is amazing to me that there is such fervent disagreement on the purpose and nature of the widespread activity of data modeling. Based upon the "80%" rule in his research, Dr. Simsion characterizes data modeling as "the specification of the initial conceptual schema to meet agreed upon business requirements." In order to produce effective results, I believe it is important to have clarity on this point, which is why Dr. Graeme Simsion's book is such a valuable contribution to the field of data modeling. This advanced book takes a deep dive into the nature of data modeling by focusing on the fundamental question, "Is data modeling better characterized as: a) a descriptive activity, the objective of which is to document some aspect of the real world, or, b) a design activity, the objective of which is to create data structures to meet a set of requirements?" The analysis of this question challenges some widely accepted views and reveals many useful insights into understanding and practicing data modeling.
With a focus on this question, Simsion takes us on a journey through both the academic literature and his own research into practitioner perspectives and experiences, thus providing us with a view into data modeling from many angles. The book synthesizes information from 489 publications, summarizes the results of interviews with 17 of the top thought leaders in data modeling, shares the results of three carefully designed surveys, and shows us the results of three data modeling research projects where practitioners are asked to develop data models using different scenarios. Instead of focusing on his own views of data modeling from his vast experience in academia and as an expert practitioner, he takes a step back and focuses on understanding and clearly articulating the views and experiences of others. He has succeeded in developing and publishing a four-year intensive study that includes over 450 participants with widely diverse data modeling backgrounds, ranging from non-modelers and novices to experts and thought leaders. The research activities were designed to produce statistically reliable findings by using a variety of survey types, asking a broad mix of survey questions to diverse audiences, geographically dispersing the research across four countries, and carefully analyzing the results using objective statistical methods. He then summarizes the findings of the research to form insightful conclusions.
There are four parts to this book. The first part introduces the questions and the approach. The second part reviews the theory of data modeling and focuses on academic and thought-leader perspectives. The third part presents the results of an examination of data modeling practice and practitioners via surveys and data modeling research exercises. The fourth part synthesizes the results and comments on the implications of this research.
In the first part, Dr. Simsion frames the question of whether data modeling is more about description or design and states the strong conflicting views that exist. He shows that the literature more often refers to data modeling as a process of describing information requirements, which is a basic tenet that he courageously challenges throughout the book. He explains the importance of the question and how the understanding of this question is paramount to performing data modeling well. He also explains how he organizes the book using several frameworks such as Lawson's Characteristics of Design, a framework that describes questions to distinguish design from description, as well as the 4Ps framework (Environment (Press), Process, Product, Person). I found that the organization of the book facilitated the exploration of the question from a wide variety of viewpoints.
In the second part on theory, Dr. Simsion provides extensive references to various definitions of data modeling, reviews each stage of the data modeling process in relation to the description/design question, relates Lawson's characteristics of design to data modeling, and shares the results of interviews with data modeling thought leaders about their views on the nature of data modeling. He shares research on several interesting questions, such as whether the data model is describing "what is" (descriptive) versus what "might be, could be or should be" (prescriptive), whether there is a correct or "gold" standard that most correctly models the information requirements, and how much of data modeling is about creativity. In one section, he provides a very useful breakdown of the main reasons why data models may vary for the same scenario, namely, construct choices (attribute versus entity), levels of generalization, and different categorizations at the same level of generalization.
In the third part on data modeling practice, the book shares the process and results of three surveys and three data modeling research activities where data modeling practitioners are asked to create a data model for a given scenario (each research activity uses a different scenario). An impressive aspect of this study is the comprehensiveness of each research component and the fact that the question is asked from such different angles. For example, one survey focused on the scope and stages of data modeling to answer the questions, "Is this task data modeling?" and "Who is responsible for the task?" This survey raises questions about how the role of the data modeler is perceived, in particular whether the modeler captures requirements or has information requirements presented to them. Other surveys explored perceptions regarding the definition of data modeling as well as the characteristics of data modeling.









