Where Does XML Fit?
I have recently been working on the second edition of my book Data Modeling Made Simple, and the last chapter addresses several new challenges. Here is a subset of a recipe XML document to consider for this month's challenge:
So here's the challenge: Would you consider an XML document to be a logical data model, a physical data model or neither, and why?
A Matter of Interpretation
Eleven percent of our challengers believe an XML document is logical, an equal number believe it is physical, 54 percent believe it is neither logical nor physical, and 24 percent believe it is both logical and physical. The results are fascinating because the variety of responses stems from different interpretations of the terms XML document, logical data model and physical data model (much the same way that different interpretations of the term Customer lead to different business decisions about customers).
Is this XML document accompanied by a schema? A schema, such as a Document Type Definition (DTD) or XML Schema Definition (XSD), specifies the rules for the data in an XML document, much as a data model specifies the rules for the data in a database structure. Without the schema, therefore, the XML document represents only an instance of data, like an entity instance in a data model. Most of the 54 percent who voted neither mentioned that the schema is really the model. David Armour, architect, says, "The document and schema should be differentiated much like a model is differentiated from the records in the database." Database developer Philip Kelley says, "Your example is a recipe instance, perhaps how to make bread. This is great for describing how to make bread, but it's a sample - it's not a template or a design model, in that it doesn't describe all the options, restrictions and other criteria on how to properly build an appropriate document." Senior data architect Jeff Lawyer adds, "In the recipe example, one could infer entities Recipe and Ingredient, a one-to-many relationship between the entities Recipe and Ingredient ... but these are only inferences based upon a single, limited view of the Recipe business scenario."
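To make the instance-versus-schema distinction concrete, here is a minimal Python sketch. The recipe document in it is a hypothetical stand-in, not the sample from the original challenge, and the function name infer_structure is my own. It reads a single document and reports only what that one instance happens to show.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical recipe instance (an assumption for illustration;
# NOT the sample document from the original challenge).
RECIPE_XML = """
<recipe name="Bread">
  <ingredient amount="3" unit="cups">Flour</ingredient>
  <ingredient amount="1" unit="teaspoon">Salt</ingredient>
  <ingredient amount="1.5" unit="cups">Water</ingredient>
</recipe>
"""

def infer_structure(xml_text):
    """Return {(parent_tag, child_tag): most children seen under one parent}."""
    root = ET.fromstring(xml_text)
    observed = {}
    for parent in root.iter():
        # Count how many children of each tag sit directly under this parent.
        counts = Counter(child.tag for child in parent)
        for child_tag, n in counts.items():
            key = (parent.tag, child_tag)
            observed[key] = max(observed.get(key, 0), n)
    return observed

structure = infer_structure(RECIPE_XML)
print(structure)  # {('recipe', 'ingredient'): 3}
```

The result says that this recipe has three ingredients. It does not say whether a recipe must have at least one ingredient, or whether other child elements are allowed. A DTD declaration such as <!ELEMENT recipe (ingredient+)> would state that rule outright, which is why so many respondents called the schema, rather than the document, the model.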
Let's assume that the XML document in this example is accompanied by a schema. Would you then consider the XML document to be logical or physical? The answer depends on your definition of logical and physical data model. Alec Sharp, consultant, says, "As always, the first step is to understand the definitions for logical and physical being used, because the terms are interpreted very differently by equally skilled professionals. Briefly, my definition of a logical data model is that it contains all of the information needed by the physical designer to generate an untuned, first-cut implementation but is independent of any specific implementation platform. A physical data model is the transformation of the logical model into a specific implementation platform."
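One way to picture the distinction Alec draws is the following Python sketch. Everything in it is an assumption made for illustration: the LOGICAL_MODEL dictionary, the to_sqlite_ddl function and the choice of SQLite as the target platform are mine, not his. The dictionary plays the role of a platform-independent logical model, and the function produces an untuned, first-cut implementation for one specific platform.

```python
import sqlite3

# Hypothetical logical model: entities, attributes and one relationship,
# with no platform-specific data types or storage decisions.
LOGICAL_MODEL = {
    "Recipe": {"attributes": ["name"], "parent": None},
    "Ingredient": {"attributes": ["name", "amount"], "parent": "Recipe"},
}

def to_sqlite_ddl(model):
    """Generate a first-cut SQLite schema (the physical side) from the model."""
    statements = []
    for entity, spec in model.items():
        table = entity.lower()
        # Surrogate key and TEXT columns are SQLite-specific choices.
        columns = [f"{table}_id INTEGER PRIMARY KEY"]
        columns += [f"{attr} TEXT" for attr in spec["attributes"]]
        if spec["parent"]:
            parent = spec["parent"].lower()
            columns.append(
                f"{parent}_id INTEGER NOT NULL REFERENCES {parent}({parent}_id)"
            )
        statements.append(f"CREATE TABLE {table} ({', '.join(columns)});")
    return statements

ddl = to_sqlite_ddl(LOGICAL_MODEL)

# Prove the generated statements are valid for the chosen platform.
conn = sqlite3.connect(":memory:")
for stmt in ddl:
    conn.execute(stmt)
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
    )
]
print(tables)  # ['ingredient', 'recipe']
```

The same dictionary could just as easily be transformed into a DTD or an XSD, which is one reason the logical-versus-physical answer depends so heavily on the definitions in use.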
A number of responses, including the one from Wally Zaremba, solution architect, state that an XML document is logical. Speaking of logical data models, Wally says, "At a minimum, they should include entities, attributes, the nouns, and relationships, the verbs. In the example given, I can see the nouns as recipe and ingredient. I can also infer, although it's not that obvious, a hierarchical relationship, the verbs between recipe and ingredient. We need to remember that a logical model is not always relational." Norman Daoust, business analysis consultant and trainer, says, "An XML schema can generally be translated into a logical data model. However, XML documents frequently only indicate the cardinality of relationships on one end of the relationship, not both ends. The example XML document indicates that a recipe is associated with many ingredients but doesn't include any indication of whether an ingredient can be associated with more than one recipe."
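Norman's point about one-sided cardinality can be sketched in a few lines of Python. The cookbook document below is hypothetical, as are the variable names. The nesting gives the recipe-to-ingredient direction directly, while the reverse direction can be recovered only by assuming that equal text means the same ingredient.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical document holding two recipes (an assumption for illustration).
COOKBOOK_XML = """
<cookbook>
  <recipe name="Bread">
    <ingredient>Flour</ingredient>
    <ingredient>Water</ingredient>
  </recipe>
  <recipe name="Pancakes">
    <ingredient>Flour</ingredient>
    <ingredient>Milk</ingredient>
  </recipe>
</cookbook>
"""

root = ET.fromstring(COOKBOOK_XML)

# Recipe -> ingredients: stated explicitly by the nesting of elements.
ingredients_by_recipe = {
    recipe.get("name"): [i.text for i in recipe.findall("ingredient")]
    for recipe in root.findall("recipe")
}

# Ingredient -> recipes: nothing in the document states this direction.
# We ASSUME that two elements with the same text are the same ingredient.
recipes_by_ingredient = defaultdict(list)
for recipe_name, ingredients in ingredients_by_recipe.items():
    for ingredient in ingredients:
        recipes_by_ingredient[ingredient].append(recipe_name)

print(ingredients_by_recipe["Bread"])  # ['Flour', 'Water']
print(recipes_by_ingredient["Flour"])  # ['Bread', 'Pancakes']
```

The second mapping rests entirely on the string-matching assumption. The hierarchy repeats Flour under each recipe and never declares it a shared entity, so the "many" on the ingredient side is an inference by the reader rather than a rule in the document.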
It is also interesting to note that a significant number of the 54 percent who voted neither explained that XML is not a data structure but a message processing and transport structure. Michael Smilg, systems consultant, neatly summarizes this: "I think of an XML document to be data in motion whereas persisted data stores are data at rest." Steve Turnock, database engineer, agrees. "It can provide a transport mechanism between services, but how would you represent a many-to-many relation? There are a lot of concepts that I need represented in a model for a relational database that XML does not provide." Senior system engineer Lee LeClair says, "It is a method of communication between applications ... While well-written XML can certainly provide insight into the data that needs to be stored, it should not be confused with a data model."
I initially considered an XML document a physical data model because it forces a technology-specific structure (i.e., hierarchy) on the business rules. After this challenge, however, I would recommend first distinguishing the XML document as an instance of a schema, and then making your decision based on the insightful responses in this column, such as the observation that an XML document represents only half of the cardinality captured in a logical data model.