Searching for the Right Data Modeling Tool

Published June 1, 1998

With so many products available that support data modeling, the challenge is in selecting the right one. The following topics serve as the evaluation criteria that I use in selecting data modeling software.

Quality Graphic Presentations

The diagrams exist for one reason: to communicate with other people. They are essential for business people validating that their business rules have been properly represented, and for developers learning the requirements of the systems they design. If a data modeling tool rates low on the quality of its diagrams, I immediately eliminate it from further consideration.

Given the importance of diagrams, a data modeling tool should provide access to the entire range of graphics capabilities, such as color, fonts and object sizing. These features can be used to enhance the diagram's ability to communicate important aspects of the model. For example, the entities and relationships on a data model can be color coded to identify the business area's subject areas; the primary entity in a view can be easily focused upon by increasing its object size and using a thicker pen; the attributes which represent the foreign keys propagated from related entities can be highlighted through color and font size.

Another aspect of graphic presentations is the ability to present alternate views of the same diagram. A data model can be cut into subset views. The user can control what information is displayed on the diagram. For example, in some situations, it is best to include the entity's description in its box. Under other circumstances, we want to see the attributes listed. The ability to easily switch between these various perspectives should be provided.

Ease of Use

If I judge the tool's diagrams to be adequate, I then assess how easy it is to use. For me, a tool is easy to use if it can adapt to my preferred style of interaction. Some modelers prefer the graphic approach in which the model is initially specified by drawing the diagram. However, I prefer to enter the model textually with the diagrams generated by the tool, as much as possible. Once the tool has generated the initial diagram, I can then refine the diagram layout manually. I look for tools which can support both interface modes.

A tool should be very forgiving of mistakes. The ability to move attributes from one entity to another, reroute relationship lines easily and change object names with no loss of information is essential. Unfortunately, I've encountered too many tools which use the delete/reinsert approach for correcting mistakes. This approach is often accompanied by the loss of textual documentation associated with the deleted object. Any product which requires the re-entry of data to correct mistakes does not qualify as a productivity tool.

The tool should provide meaningful error messages with options for correcting the error, when possible. A robust help facility, both on-line and in hard-copy format, adds greatly to a product's ease of use.

I pay particular attention to the facilities provided to support relationship lines. I find that the most time-consuming aspect of creating a quality diagram involves the layout of the model's symbols. Configuring the diagram so that the relationship lines are easy to follow with minimal line crossovers can be a challenging activity. Drawing the model is simple. Creating a diagram that is pleasing to the eye is an entirely different matter. I look for tools that give me the maximum control over the relationship lines. I should be able to connect the relationship line to any spot on the entity's box. As entities are moved about the diagram, the tool should be able to automatically route lines through a path that avoids other relationship-line crossovers and travels around entities, rather than through them.
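
To make "minimal crossovers" concrete, here is a minimal sketch, in Python with hypothetical names, of how a layout engine might score a diagram by counting pairs of straight relationship lines that cross; an auto-router would search for a layout that drives this score toward zero.

```python
# Sketch: count crossings between straight relationship lines.
# Segments are ((x1, y1), (x2, y2)) tuples; all names are hypothetical.
from itertools import combinations

def orientation(p, q, r):
    """Sign of the cross product (q-p) x (r-p): >0 left turn, <0 right, 0 collinear."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def segments_cross(a, b):
    """True if segments a and b properly intersect (shared endpoints ignored for brevity)."""
    p1, p2 = a
    p3, p4 = b
    d1 = orientation(p3, p4, p1)
    d2 = orientation(p3, p4, p2)
    d3 = orientation(p1, p2, p3)
    d4 = orientation(p1, p2, p4)
    return (d1 * d2 < 0) and (d3 * d4 < 0)

def crossover_count(lines):
    """Number of crossing pairs; a layout engine would try to minimize this score."""
    return sum(1 for a, b in combinations(lines, 2) if segments_cross(a, b))

lines = [((0, 0), (10, 10)), ((0, 10), (10, 0)), ((0, 5), (4, 5))]
print(crossover_count(lines))  # the two diagonals cross: 1
```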

Meta Data Management Facilities

While the diagrams are the primary communication vehicle during the modeling sessions, the model is not complete without its textual documentation. An essential feature of any data modeling tool is the robustness of its meta data management facilities. The meta data capture and maintenance facilities should be seamlessly integrated with the tool's drawing environment. Typically, double clicking on a diagram's symbol will bring up the documentation window for the associated meta object. Many tools only provide the ability to enter large blocks of text as the supporting meta data. I prefer products which provide a rich set of properties that allow me to capture information meaningful to each individual meta-object class. The accompanying report-writer facility should allow me to specify which properties are incorporated into each report.

While corporations tend to select one business analysis and application development methodology as their standard, most find it necessary to augment the methodology, primarily in the area of documentation standards and naming standards. Documentation standards include the properties which are captured about the individual objects included in the model and the format of analysis and development deliverables. Naming standards deal with the rules for constructing names for the analysis and development objects which attempt to ensure uniqueness and, occasionally, ownership. Documentation and naming standards often reflect the personality and culture of the corporation. Therefore, the enforcement of corporate standards is often the area which requires the greatest degree of customization within the supporting toolset.

Most methodologies consist of a rich set of rules for constructing proper diagrams in terms of the types of objects which can appear on the diagram, the meaning of each symbol of the notation and how the diagram objects can interrelate. Unfortunately, similar care has not always been taken in what information should be documented about those objects, beyond name and description. While most CASE vendors provide the ability to capture additional properties about a specific methodology object, I find that none have anticipated all the information that I want to capture about those same objects. Therefore, the ability to extend the set of properties which can be captured about the meta objects in the methodology is a very desirable feature. Otherwise, other means for capturing this information must be used, such as the corporate repository or a word processor. Each time a different tool must be used to fully document a single methodology meta object, more time is required to re-integrate this information when the documentation which supports the diagrams is generated.
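
As an illustration of what an extensible property set could look like, here is a minimal sketch, with hypothetical class and property names, in which a site registers its own documentation properties for a meta-object class:

```python
# Sketch: a meta object whose property set can be extended per meta-object class.
# All class and property names are hypothetical.

class MetaObject:
    # Properties every class captures, plus per-class extensions registered at runtime.
    base_properties = {"name", "description"}
    extended_properties = {}  # meta-object class -> set of extra property names

    def __init__(self, meta_class, name, **props):
        allowed = self.base_properties | self.extended_properties.get(meta_class, set())
        unknown = set(props) - allowed
        if unknown:
            raise ValueError(f"{meta_class} does not define: {sorted(unknown)}")
        self.meta_class = meta_class
        self.name = name
        self.props = props

    @classmethod
    def extend(cls, meta_class, *new_properties):
        """Let a site add its own documentation properties to a meta-object class."""
        cls.extended_properties.setdefault(meta_class, set()).update(new_properties)

# The corporate standard adds stewardship information to entities.
MetaObject.extend("Entity", "steward", "source_system")
customer = MetaObject("Entity", "Customer",
                      description="A party that buys goods or services",
                      steward="Sales Operations")
```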

Similarly, naming standards are an important aspect in data administration's efforts to control the proliferation of alias names for the same business fact. Since it is less time-consuming to reuse an existing field definition instead of defining the field each time the business concept is encountered, naming standards are an important productivity tool in application development.

Naming standards normally include the use of a standard abbreviation list for constructing data element or column names. These standards exist because many languages limit the length of the names that can be assigned. Although the process of developing the standard field name from an attribute name is quite simple, few CASE tools incorporate this feature. Yet abbreviating names is one of the more time-consuming and boring tasks that a developer faces. I found that an average of one minute was required to manually construct a valid column name using a paper-based standard abbreviation list. Once I automated the process, less than three seconds was spent abbreviating each name. An added bonus was the automatic generation of a list of terms for which abbreviations needed to be developed. Through the automation of naming standards, I saw a 2000 percent increase in productivity. Yet none of the CASE tools that I've encountered have included the ability to construct standard names.
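
The automation itself is simple. Here is a minimal sketch, assuming a hypothetical standard abbreviation list, that builds a column name word by word and flags any term that still needs a standard abbreviation:

```python
# Sketch: build a standard column name from an attribute name using a
# corporate abbreviation list, and collect terms that lack abbreviations.
# The abbreviation list shown here is hypothetical.
STANDARD_ABBREVIATIONS = {
    "customer": "CUST",
    "account": "ACCT",
    "number": "NBR",
    "description": "DESC",
}

missing_terms = set()  # terms encountered with no standard abbreviation yet

def column_name(attribute_name, max_length=18):
    """Abbreviate each word, e.g. 'Customer Account Number' -> 'CUST_ACCT_NBR'."""
    parts = []
    for word in attribute_name.lower().split():
        if word in STANDARD_ABBREVIATIONS:
            parts.append(STANDARD_ABBREVIATIONS[word])
        else:
            missing_terms.add(word)      # flag for the data administrator
            parts.append(word.upper())   # fall back to the full word for now
    return "_".join(parts)[:max_length]

print(column_name("Customer Account Number"))  # CUST_ACCT_NBR
print(column_name("Customer Loyalty Tier"))    # CUST_LOYALTY_TIER
print(sorted(missing_terms))                   # ['loyalty', 'tier']
```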

The tool should provide the ability to easily navigate through the linkages between model components. For example, double clicking on an attribute in the entity's attribute list should bring up that attribute's meta data. Ideally, the selected symbol on the displayed diagram should change to the meta object whose meta data window is currently displayed.

The ability to produce customized reports in a variety of formats is essential. A report writer should be provided which can be used to create reports "your way." You should be able to specify which meta-object types and properties are included in the report, as well as the formatting of the layout, including fonts and colors. The ability to generate matrices and indented lists which show the relationships between model components is also a feature to look for. An important feature of any report generator is the ability to select which model components should be included on a specific report. A variety of selection criteria should be supported, such as an entity's neighborhood, a specific set of subject areas, attribute usage or the model components that were changed in the last modeling session. You should be able to save your custom report specifications and selection criteria. Some tools provide the ability to specify complete deliverable packages which can be used to generate documentation consistently for any project. Tools which do not provide their own report writer should provide access to their databases through a third-party report writer product.
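
One way a saved report specification might be represented is shown in this minimal sketch, with hypothetical component structures: a selection predicate paired with the list of properties to print.

```python
# Sketch: a saved report specification = selection predicate + property list.
# Model components are plain dicts here; all names are hypothetical.
components = [
    {"type": "Entity", "name": "Customer", "subject_area": "Party", "changed": True},
    {"type": "Entity", "name": "Invoice", "subject_area": "Billing", "changed": False},
    {"type": "Attribute", "name": "Customer Number", "subject_area": "Party", "changed": True},
]

report_specs = {
    # Components touched in the last modeling session.
    "recent_changes": (lambda c: c["changed"], ["type", "name"]),
    # Everything in a chosen subject area.
    "party_subject_area": (lambda c: c["subject_area"] == "Party",
                           ["type", "name", "subject_area"]),
}

def run_report(spec_name):
    predicate, properties = report_specs[spec_name]
    for component in filter(predicate, components):
        print("  ".join(str(component[p]) for p in properties))

run_report("recent_changes")
```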

The tool should support an open architecture. Minimally, an import/export facility should be provided to allow the movement of data between the tool and other application development productivity products. The format of the import/export file layout must be published. Ideally, the tool will support industry standards, such as CDIF and the Meta Data Coalition Interface Specification.

Model on the Intranet

Today, the intranet is one of the primary technologies in use for delivering information to an organization's users. Consequently, the ability to publish a model's diagrams and meta data to the Web has become an essential feature. I look for all the capabilities provided through the CASE tool's native interface: clickable diagrams, which bring up the relevant meta data window for the selected symbol; and hyperlinks, which allow me to navigate between the model components. In addition, I want to exploit the Web's ability to support threaded feedback and discussions on the model. The Internet provides a vehicle which can allow users across the organization, who would never have been exposed to the models, to view and critique them. I want to encourage this type of discussion around the models' content. Therefore, facilities should be available to collect, assimilate and respond to model-related messages. A fully functional Web-based client which can be used to develop and maintain models is also a desirable feature.
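
The publishing step itself need not be elaborate. Here is a minimal sketch, assuming a hypothetical model structure and file naming scheme, that writes one cross-linked HTML page per model component:

```python
# Sketch: publish model components as cross-linked HTML pages.
# The model contents and the file naming scheme are hypothetical.
model = {
    "Customer": {"description": "A party that buys goods or services",
                 "related": ["Order"]},
    "Order":    {"description": "A request by a customer for goods",
                 "related": ["Customer"]},
}

def page_for(name, meta):
    # Hyperlinks are generated from the model's own linkages.
    links = ", ".join(f'<a href="{r}.html">{r}</a>' for r in meta["related"])
    return (f"<html><body><h1>{name}</h1>"
            f"<p>{meta['description']}</p>"
            f"<p>Related: {links}</p></body></html>")

for name, meta in model.items():
    with open(f"{name}.html", "w") as f:
        f.write(page_for(name, meta))
```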

Application Component Generation

A CASE tool is just a documentation tool if it does not have the ability to generate application components as their specifications are recorded in the CASE tool. One problem that has always plagued developers is the fact that applications tend to get out of synch with their documentation. All too often, changes are made to the application's components which never get reflected back into the documentation supporting that component. As one of my associates stated, "To ensure that the documentation matches the code, you must generate the code from the documentation."

Any data modeling tool should be able to generate database components for your target technology, such as tables, columns, views, triggers, stored procedures and table spaces. Since many companies develop for multiple platforms, the ability to generate database components for different technologies from a single specification is required.
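
Here is a minimal sketch of multi-platform generation, with an illustrative (not authoritative) type mapping, that emits CREATE TABLE statements for two target DBMSs from one specification:

```python
# Sketch: generate CREATE TABLE statements for more than one target DBMS
# from a single specification. The type mappings are illustrative only.
TYPE_MAP = {
    "oracle": {"string": "VARCHAR2", "integer": "NUMBER(10)", "date": "DATE"},
    "db2":    {"string": "VARCHAR",  "integer": "INTEGER",    "date": "DATE"},
}

def create_table_ddl(table, dialect):
    types = TYPE_MAP[dialect]
    cols = []
    for name, logical_type, length in table["columns"]:
        sql_type = types[logical_type]
        if logical_type == "string":
            sql_type += f"({length})"   # character types carry a length
        cols.append(f"  {name} {sql_type}")
    cols.append(f"  PRIMARY KEY ({', '.join(table['primary_key'])})")
    return f"CREATE TABLE {table['name']} (\n" + ",\n".join(cols) + "\n);"

customer = {
    "name": "CUSTOMER",
    "columns": [("CUST_NBR", "integer", None), ("CUST_NAME", "string", 60)],
    "primary_key": ["CUST_NBR"],
}
print(create_table_ddl(customer, "oracle"))
print(create_table_ddl(customer, "db2"))
```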

Full Life Cycle Support

I look for data modeling products which are part of an integrated tool suite that provides support for the entire Zachman Information Systems Architecture by providing automatic links across the columns and automated assistance as one row is transformed into the next perspective.

For example, I would like to be able to automatically cut a view from the data model, based on the attributes which were linked into the data flows defined in a process's data-flow diagram. I would like to have a list automatically generated of the attributes defined in data flows which have not yet been assigned to some entity in the data model. I want the tool to provide alerts of inconsistencies across a row, as well as within a specific cell, of Zachman's matrix. Furthermore, when the time comes to transform the models of one row into the next row, I would like as much automated support as possible. DBAs could use a product which recommends alternative database designs based on information captured about access paths, volumes and performance statistics. Likewise, a tool with the ability to maintain the transformation links from the business data model to the conceptual data model to the denormalized database design would be highly favored within the data management community.
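
The cross-check for unassigned attributes reduces to set arithmetic over the repository. A minimal sketch, with hypothetical model contents:

```python
# Sketch: flag attributes used in data flows that no entity yet owns.
# The model contents are hypothetical.
entity_attributes = {
    "Customer": {"Customer Number", "Customer Name"},
    "Invoice":  {"Invoice Number", "Invoice Date"},
}
data_flow_attributes = {
    "New Order": {"Customer Number", "Order Quantity"},
    "Billing":   {"Invoice Number", "Invoice Date"},
}

assigned = set().union(*entity_attributes.values())
used = set().union(*data_flow_attributes.values())
unassigned = used - assigned
print(sorted(unassigned))  # ['Order Quantity'] -- needs a home in the data model
```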

Enforcement of Methodologies

If a product advertises that it supports a specific analysis and design methodology, then it should enforce the rules defined by that methodology. Enforcement should be implemented in a manner that aids, not punishes, the user. For example, suppose I am using a tool that supports IDEF1X for data modeling. I place two Independent Entities in the model and define an identifying relationship between them. The tool should ensure that the child entity in this relationship becomes a Dependent Entity.
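
In code, the correction is trivial, which is exactly why the tool, not the modeler, should make it. A minimal sketch with hypothetical structures:

```python
# Sketch: enforce the IDEF1X rule that the child of an identifying
# relationship must be a dependent entity. All names are hypothetical.
def apply_identifying_relationship(parent, child):
    """Adding an identifying relationship forces the child to become dependent."""
    if child["kind"] == "independent":
        child["kind"] = "dependent"   # the tool corrects this, not the modeler
    child["identifying_parents"].append(parent["name"])

customer = {"name": "Customer", "kind": "independent", "identifying_parents": []}
order = {"name": "Order", "kind": "independent", "identifying_parents": []}

apply_identifying_relationship(customer, order)
print(order["kind"])  # dependent
```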

A tool is a valuable assistant if it catches the methodology errors in real time and provides the options for rectifying the problem. Upon selection, the tool does all the processing required to implement the choice, thus ensuring that the rules of the methodology are adhered to. Since it can perform these corrections faster than I can, my productivity is increased and the quality of the model is preserved.

If a tool supports multiple methodologies, then it must ensure that it fully enforces the rules for each methodology. Also, the algorithm used by the tool to convert diagrams from one methodology to another must be documented, complete and, ideally, customizable.

If a tool claims to support relational modeling, then the propagation of foreign keys should be automatic. Nothing annoys me more than spending time manually linking the same attribute to all its related entities when the tool has all the information needed to perform this propagation. Attributes which represent the foreign keys propagated from another entity should be noted as such in the model.
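
The propagation is mechanical, which is what makes doing it by hand so irritating. A minimal sketch, with hypothetical entity structures:

```python
# Sketch: propagate a parent's primary key into a related child entity
# as a foreign key, noting its origin. Entity structures are hypothetical.
def propagate_foreign_key(parent, child):
    for key_attr in parent["primary_key"]:
        if key_attr not in child["attributes"]:
            child["attributes"].append(key_attr)
        child["foreign_keys"].setdefault(key_attr, parent["name"])  # noted as propagated

customer = {"name": "Customer", "primary_key": ["Customer Number"],
            "attributes": ["Customer Number", "Customer Name"], "foreign_keys": {}}
order = {"name": "Order", "primary_key": ["Order Number"],
         "attributes": ["Order Number", "Order Date"], "foreign_keys": {}}

propagate_foreign_key(customer, order)
print(order["attributes"])    # ['Order Number', 'Order Date', 'Customer Number']
print(order["foreign_keys"])  # {'Customer Number': 'Customer'}
```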

While I believe that strong enforcement of the methodology rules is important, there are occasions when this requirement can be relaxed. For presentations, I occasionally develop simple drawings whose purpose is to convey a few important, high-level concepts, where the precision of the methodology notation can get in the way of the message. For these types of diagrams, I consider the ability to turn off rule enforcement essential.

Model Management Capabilities

A robust data modeling product must support the need to reuse model components across multiple business analysis projects. Also, the ability to partition models into chunks which can be worked on by different analysts for different business areas must be supported. Likewise, a data modeling tool should be able to compare different versions of models and identify how the model components diverge. Finally, the ability to merge models must exist. If the data modeling tool does not support these types of model management facilities, then it should be able to integrate into a separate model management facility which provides these types of capabilities.
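
At its core, version comparison is a diff over named components. A minimal sketch, with hypothetical model contents:

```python
# Sketch: compare two model versions and report how they diverge.
# Model contents are hypothetical.
def compare_models(old, new):
    added = set(new) - set(old)
    removed = set(old) - set(new)
    changed = {name for name in set(old) & set(new) if old[name] != new[name]}
    return added, removed, changed

v1 = {"Customer": {"attrs": 5}, "Invoice": {"attrs": 7}}
v2 = {"Customer": {"attrs": 6}, "Invoice": {"attrs": 7}, "Payment": {"attrs": 4}}

added, removed, changed = compare_models(v1, v2)
print(added)    # {'Payment'}
print(removed)  # set()
print(changed)  # {'Customer'}
```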

Data Reverse Engineering Capabilities

Any tool ranks high in my book when it eliminates the need to rekey information that already exists electronically. Therefore, an important productivity feature for any data modeling tool is the ability to reverse engineer data structures from existing applications. Companies rarely develop entirely new, stand-alone applications. In most cases, they are replacing an existing application which must co-exist with the other applications in the portfolio. Therefore, the ability to bring those existing data structures into the forward-engineering development environment as the basis for developing interface and conversion specifications is required.

My data modeling tool evaluation criteria represent, to a degree, my wish list for the perfect product. At this point, no single tool satisfies all my requirements. But when I reflect on the functionality provided by those initial products that were released a decade ago, our vendors have shown the ability to constantly improve their CASE tools. Each new release reflects a product that is closer to my version of nirvana.
