Canonical Data Model

Published July 23, 2008, 2:40pm EDT

There are a bunch of new buzzwords popping up in our field. One term I hear quite frequently these days is “canonical data model.” I recently explained this term to a fellow Design Challenger, but I was interested in other explanations as well.

The Challenge

Please define “canonical data model” and give an example.

The Response

My definition of canonical data model, expanded with input from our Design Challengers, is as follows:

The canonical data model is the definition of a standard organization view of a particular subject, plus the mapping back to each application's view of this same subject. The standard organization view is traditionally built using simple yet useful structures. Employee and Contractor, for example, might be represented as Person Role; Order and Credit as Event; Warehouse and Distribution Point as Site. The canonical data model is frequently implemented as an XML hierarchy. Specific uses include delivering enterprise-wide business intelligence (BI), defining a common view within a service-oriented architecture (SOA) and streamlining software interfaces.
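As a minimal sketch of the standard view plus its mappings, the Python below folds an Employee record and a Contractor record into a single Person Role structure. The class name, the field names and both source record layouts are hypothetical, invented for illustration only; they are not taken from any particular application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonRole:
    """Hypothetical canonical view of a person acting in some role."""
    person_name: str
    role_type: str       # e.g. "Employee" or "Contractor"
    source_system: str   # which application the record came from


def from_hr_employee(record: dict) -> PersonRole:
    # Assumed HR application layout: {"emp_name": ..., "emp_id": ...}
    return PersonRole(record["emp_name"], "Employee", "HR")


def from_vendor_contractor(record: dict) -> PersonRole:
    # Assumed vendor system layout: {"contractor": {"full_name": ...}}
    return PersonRole(record["contractor"]["full_name"], "Contractor", "Vendor")


if __name__ == "__main__":
    print(from_hr_employee({"emp_name": "Ana Ruiz", "emp_id": 7}))
    print(from_vendor_contractor({"contractor": {"full_name": "Bo Chen"}}))
```

Each application keeps its own layout; only the two small mapping functions need to know about it.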

Figure 1 is an oversimplified example of the use of a canonical data model. In the “before” view, each point-to-point interface needs to be aware of how its target system sees its world. In the “after” view, on the other hand, the canonical data model knows how each system sees its world, and therefore can translate between any two systems.

I’d like to explore four terms from this definition in more detail: standard, mapping, simple yet useful, and hierarchy.

Standard. Bob Schork, metadata architect, states, “Canonical means the accepted and only acceptable standard of a system. Likewise, a canonical data model would be the accepted structure for an application system. It promotes reusability.” Claire Frankel, EDM manager, equates canonical data model with the reference or ruling data model and states: “When referring to an enterprise, the canonical data model is the basic or fundamental logical model of the firm’s business. When referring to data modeling itself, a canonical data model is one of the known, industry-standard models for that industry or business.” Ralph Nijpels, business analyst, mentions that canonical models typically have a company-wide scope and describe terms, their definitions and their relations in the language of the business. Nandi Iyer, solutions architect, agrees: “The canonical data model unifies information fragments at an enterprise level to facilitate consistent data usage for enterprise integration.”

Mapping. Instead of writing a translator between every pair of applications, it is sufficient to write a translator between each application's format and the canonical format. Craig Jordan, advisor, offers this analogy: “Some nations are comprised of people who speak many different tribal languages. In these cases, a national language can sometimes provide a means for communication between tribes that is not prejudiced toward any particular group. In the realm of information systems, the data or information models that are specific to a particular application are tribal, and one that is independent from them all is canonical.”
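The mapping idea can be sketched in a few lines of Python. Each application registers one translator into the canonical format and one out of it, and any-to-any translation is just the two composed. The application names ("crm", "erp", "billing") and all field names are made up for this example, and the translator counts assume one-way translators, one per ordered pair of applications in the point-to-point case.

```python
from typing import Callable, Dict

Record = Dict[str, str]

# One translator INTO the canonical format per application (hypothetical layouts).
TO_CANONICAL: Dict[str, Callable[[Record], Record]] = {
    "crm": lambda r: {"name": r["customer_name"], "site": r["location"]},
    "erp": lambda r: {"name": r["NAME"], "site": r["SITE_CD"]},
    "billing": lambda r: {"name": r["payer"], "site": r["office"]},
}

# One translator OUT OF the canonical format per application.
FROM_CANONICAL: Dict[str, Callable[[Record], Record]] = {
    "crm": lambda c: {"customer_name": c["name"], "location": c["site"]},
    "erp": lambda c: {"NAME": c["name"], "SITE_CD": c["site"]},
    "billing": lambda c: {"payer": c["name"], "office": c["site"]},
}


def translate(record: Record, source: str, target: str) -> Record:
    """Translate between any two applications by way of the canonical format."""
    return FROM_CANONICAL[target](TO_CANONICAL[source](record))


def point_to_point_translators(n: int) -> int:
    # One one-way translator for every ordered pair of applications.
    return n * (n - 1)


def canonical_translators(n: int) -> int:
    # One translator in and one out for each application.
    return 2 * n


if __name__ == "__main__":
    print(translate({"customer_name": "Acme", "location": "Boston"}, "crm", "erp"))
    print(point_to_point_translators(10), canonical_translators(10))
```

Under these assumptions, ten applications need 90 point-to-point translators but only 20 canonical ones. With three applications the two counts are equal at six, so the saving only appears as the number of applications grows.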

Simple yet useful. Sathsh Parameshwara, BI architect, says that the canonical data model is a generic data model that can be plugged into any platform without any dependency on applications used. Lee LeClair, senior system engineer, states, “The term means a data model that conforms to acceptable practices and is in its simplest form.” Steve White, information architect, adds, “A canonical data model is one that’s abstracted, that is to say not linked to a specific application.” Jeff Lawyer, senior data architect, adds, “A canonical data model is an overall, basic and generally indisputable data model for an enterprise, sufficiently high-level enough to be boundary, organization and application independent.”

Hierarchy. Jeff Pekrul, data architect, says that a canonical schema can be a physical model that is typically an XML schema (i.e., hierarchical) and intended for use in data integration applications. He states, “Much of the confusion about the term ‘canonical’ relates to the distinction between canonical schemas - typically XSDs - and logical data models from which these may or may not be derived.”
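To make the hierarchical form concrete, here is a small sketch that writes a Person Role as a nested XML document and reads it back, using Python's standard xml.etree.ElementTree module. The element and attribute names (PersonRole, Person, Name, Role, type) are assumptions chosen for illustration; a real canonical schema would normally be defined in an XSD, which this sketch does not attempt.

```python
import xml.etree.ElementTree as ET


def person_role_to_xml(name: str, role_type: str) -> str:
    """Serialize a person and role as a small XML hierarchy (hypothetical layout)."""
    root = ET.Element("PersonRole")
    person = ET.SubElement(root, "Person")
    ET.SubElement(person, "Name").text = name
    ET.SubElement(root, "Role", attrib={"type": role_type})
    return ET.tostring(root, encoding="unicode")


def xml_to_person_role(document: str) -> dict:
    """Read the same hierarchy back into a flat dictionary."""
    root = ET.fromstring(document)
    return {
        "name": root.findtext("Person/Name"),
        "role_type": root.find("Role").get("type"),
    }


if __name__ == "__main__":
    doc = person_role_to_xml("Ana Ruiz", "Contractor")
    print(doc)
    print(xml_to_person_role(doc))
```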

If you would like to become a Design Challenger and have the opportunity to submit modeling solutions, please add your email address at http://www.stevehoberman.com/. There is also an overview on how to read a data model at my Web site.