The occasional bout of insomnia inevitably means watching late-night television. As time ticks by, the advertisements get longer and are for a more eclectic range of products, seemingly designed for an audience that might not be able to assimilate more sophisticated messages. One of my favorites is the pitch for Black Flag's "Roach Motel" product, a bait trap for cockroaches that is marketed using the slogan "The roaches check in, but they never check out."
This advertisement has a strange resonance with some of the ways we tend to do things in data administration. Are we guilty of building processes, services and infrastructures that serve to trap knowledge that can never get out again?
The Data Model
In some ways, a data model has an uncanny resemblance to the roach motel. A data model is a very large batch container. It is usually thought of as a thing in itself that is managed through defined processes. Indeed, projects are often predicated on waiting for an entire data model to be finished, rather than expecting a steady stream of completed entities, attributes and relationships.
Many processes and standards reinforce the reality of the data model as a batch container. One is to begin a data modeling project with the expectation of having a formal review and sign-off process after the model is completed. At first glance, this sounds eminently reasonable and maps to how we do many other things on projects. Frankly, it is easier to think of a data model as a single thing, and it is definitely easier to manage a single thing than the diverse set of metadata concepts and instances that exist within a data model.
This approach is by no means universal, and many projects do work with intermediate deliverables from models. However, it is not uncommon, and when it happens it forces everyone to wait until the data model is "complete." This may take months, and the formal review process usually adds more time because there is a lot to digest in a data model. In the intervening period, the things that are of use within the data model are not used by those who may need them. The data model becomes a knowledge trap.
Of course, not everybody runs a data analysis project like this, but in the data world, it is fair to say that there is considerable acceptance of data modeling as a core competency. This mode of thought decouples the process of data modeling from the goal of continuously sharing knowledge about data within an organization. To put it another way, the problem arises by looking inward and valuing a data model only in data modeling terms, rather than looking outward to the rest of IT and the business beyond, and trying to pump out knowledge about the enterprise's information resources.
Another unhelpful facet of data models is that they are typically created in tools whose licensing costs restrict them to a very small number of individuals. Furthermore, these tools utilize notations that are not intuitive. This means that anything that goes into a data model is going to be inaccessible to anyone other than a data modeler.
The reality is that knowledge gleaned about data is going to be put first into a data model and nowhere else. If the data modelers are asked why they do this, the reply is often a puzzled look and a retort that this is what data models are for and there is no other approach.
The value of a logical data model is that it represents the data as the business truly sees it. Acceptance of this viewpoint means that a logical data model is intended for knowledge sharing. The analysis to get the normalization, cardinalities, optionalities, definitions, etc. requires a considerable effort, and focusing on the concepts and tools required to produce these artifacts is perfectly reasonable. What is not reasonable is to lose sight of why all this is being done. Producing the logical data model and doing nothing more than tossing it to the individuals who will make it physical is an enormous waste.
The content of logical data models must be shared, not just after they are complete, but as they are being produced. Unfortunately, data models are utterly useless for knowledge sharing except among a tiny group of specialists. If data administration is to be successful in the future, it urgently needs to tackle these issues.
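To make the incremental-sharing argument concrete, here is a minimal sketch of what "a steady stream of completed entities, attributes and relationships" could look like in practice: each finished entity is rendered as plain English for non-modelers instead of being held back until the whole model is signed off. All names and structures here are illustrative inventions, not the API of any particular modeling tool.

```python
from dataclasses import dataclass, field

@dataclass
class Attribute:
    name: str
    definition: str
    optional: bool = False  # optionality, as captured in the logical model

@dataclass
class Entity:
    name: str
    definition: str
    attributes: list = field(default_factory=list)

@dataclass
class Relationship:
    parent: str
    child: str
    cardinality: str        # e.g. "1:N"
    optional: bool = False  # may the parent exist with no children?

def describe(entity, relationships):
    """Render one completed entity as plain English for non-modelers."""
    lines = [f"{entity.name}: {entity.definition}"]
    for a in entity.attributes:
        opt = " (optional)" if a.optional else ""
        lines.append(f"  - {a.name}{opt}: {a.definition}")
    for r in relationships:
        if r.parent == entity.name:
            count = "zero or more" if r.optional else "one or more"
            lines.append(
                f"  Each {r.parent} has {count} {r.child} records ({r.cardinality})."
            )
    return "\n".join(lines)

# Publish one entity as soon as it is complete -- no waiting for the full model.
customer = Entity(
    "Customer", "A party that purchases goods or services",
    [Attribute("Customer Name", "The legal name of the party"),
     Attribute("Phone Number", "Primary contact number", optional=True)],
)
rels = [Relationship("Customer", "Order", "1:N", optional=True)]
print(describe(customer, rels))
```

The point of the sketch is not the code itself but the direction of flow: the modeling tool's content is pushed outward, in business language, while the model is still in progress.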
There are other approaches, and we all know about the metadata repository. In theory, data models should contribute their metadata to an enterprise-wide repository. However, herein lie more issues. For one, data models have the useful property of being project-level artifacts. An enterprise-wide repository is a different beast. It needs to exist at a higher organizational level and be part of a sustained program. Not only does it have to deliver value across the years, it has to deliver it widely across the enterprise.
In practice, these requirements are often simply ignored. IT staff tend to set up repositories according to what "best practices" or some higher form of data modeling should dictate. They rarely go out to the business and build use cases for atomic pieces of functionality that deliver (or manage) knowledge about the enterprise's information assets. This leads to a high probability that repositories will not be used. It is perhaps fear of this that leads to access to repositories being restricted to IT staff.