Thanks to recent advances in storage technologies, businesses today are gathering and storing more data than ever before, and this growth has been exponential in the last few years. Trends indicate that data volumes are only going to increase, with growth rates showing no sign of slowing. The phrase "single version of the truth" has been used by analysts, industry experts, vendors and consultants numerous times, as if it were a panacea for all data warehousing and business intelligence (BI)-related issues. We all tend to forget that there is no silver bullet that can deliver a single version of the truth with 100 percent success.
We have reached the stage where enterprises understand the urgent need to establish a single version of the truth, and they have just begun to add new architectural concepts such as master data management (MDM), customer data integration (CDI) and enterprise information integration (EII). However, they are still drowning in an ever-growing population of data silos at the physical level and information silos at the logical level. Information silos are created when different business units in an organization develop independent understandings of the same data assets.
Even the newest tools and technologies have a much higher probability of failure if organizations do not develop a long-term vision for enterprise data management (EDM), which should include people and processes along with the data assets. The most critical process of EDM is data governance, which is the key to continued consistency and accuracy of data and information in an enterprise. EDM is too daunting a task, especially in large organizations, to cover the whole enterprise at once. An evolutionary approach is recommended: start small and build incrementally. Enterprise standards for naming logical and physical data assets are one of the foundation components of EDM, and they facilitate and support data and information governance efforts in any organization. As with most EDM processes, it is best to start small and follow a build-as-you-go approach to enterprise naming conventions for data and information assets.
Why Naming Standards?
The core objectives of building and implementing naming standards in an enterprise are:
- Business as well as technical users should be able to describe any data entity or data element just by looking at its name. Users can be internal as well as external (vendors) to the organization.
- Different professionals should arrive at the same name for an entity or data element when they are given the same business and technical descriptions of the data asset.
Describing and naming data correctly is critical. If it is done right, it can help an enterprise:
- Minimize misunderstandings among business functions, which can reduce the amount of total effort needed in a BI/DW project.
- Facilitate operational efficiency and strategic use of the data.
- Reduce the time needed to bring new products to market.
- Set, describe and achieve common business goals.
- Improve customer satisfaction.
Key Components of Data Naming Standards
Class words: Class words help classify entities and attributes into broad categories. There are two types of class words:
Entity class: An entity in a logical model corresponds to a table in a physical model. Each entity is assigned to one entity class based on its primary business intent, e.g., asset (AT), document (DO), event (EV), location (Location), party (PA), rule (RU), structure (ST), transaction (TR), etc. Generally, the full name of an entity class should be used in a logical model, and it should be abbreviated to a two- to three-character code for naming the corresponding table in a physical model.
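An entity class library of this kind can be kept as a simple lookup table. The following is a minimal sketch in Python, using the two-letter codes listed above. The code LO for location and the function name entity_class_code are illustrative assumptions, not part of any published standard.

```python
# Entity class library: full name (logical model) -> code (physical model).
# Codes follow the examples in the text; "LO" for location is an assumption.
ENTITY_CLASSES = {
    "asset": "AT",
    "document": "DO",
    "event": "EV",
    "location": "LO",
    "party": "PA",
    "rule": "RU",
    "structure": "ST",
    "transaction": "TR",
}


def entity_class_code(entity_class: str) -> str:
    """Return the physical-model code for a logical entity class name."""
    key = entity_class.strip().lower()
    if key not in ENTITY_CLASSES:
        raise KeyError(f"unknown entity class: {entity_class!r}")
    code = ENTITY_CLASSES[key]
    # The standard calls for two- to three-character codes.
    if not 2 <= len(code) <= 3:
        raise ValueError(f"code {code!r} must be 2 or 3 characters long")
    return code


print(entity_class_code("Party"))  # PA
```

Rejecting unknown class names, rather than guessing a code, keeps the library the single place where new entity classes are introduced.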
Attribute class: Attributes in a logical model correspond to columns in a physical model. Each attribute is assigned to one attribute class based on the business function the attribute supports. Attribute classes are closely and directly linked to column domains such as name, address, quantity and code in a logical model, which in turn define the data type, format and kind of values that may be stored in the associated column in the physical model. Attribute classes can be built to any level of detail, depending on the organization's requirements. Organizations should have a library of attribute class words under at least the following major categories:
- Chronology represents a point in or span of time.
- Measurement represents capacity, quantity or count.
- Identification identifies a person, place or thing.
- Text identifies free-form or narrative data.
For example, an attribute class could be as broad as quantity (QY) or could be refined to the level of units, volume, weight, etc.
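One way to picture such a multi-level library is as a set of class words, each tagged with its major category and, optionally, the broader class it refines. The sketch below is only an illustration: quantity (QY) comes from the text, while every other class word, code and the helper name lineage are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AttributeClass:
    name: str
    code: str
    category: str                 # chronology, measurement, identification or text
    parent: Optional[str] = None  # name of the broader class word, if any


# Only quantity (QY) comes from the text; the other entries are hypothetical.
LIBRARY = {
    c.name: c
    for c in [
        AttributeClass("quantity", "QY", "measurement"),
        AttributeClass("volume", "VOL", "measurement", parent="quantity"),
        AttributeClass("weight", "WT", "measurement", parent="quantity"),
        AttributeClass("date", "DT", "chronology"),
        AttributeClass("name", "NM", "identification"),
        AttributeClass("description", "DESC", "text"),
    ]
}


def lineage(name: str) -> List[str]:
    """Return the chain from a class word up to its broadest parent."""
    chain = []
    current: Optional[str] = name
    while current is not None:
        chain.append(current)
        current = LIBRARY[current].parent
    return chain


print(lineage("weight"))  # ['weight', 'quantity']
```

An organization that only needs the top level can leave parent unset everywhere and add finer class words later without renaming existing ones.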
Prime word or base noun: It identifies the application and subject area, major data category or model name, depending on the data object being defined. It may consist of a single word or a phrase, e.g., account (ACCT), budget (BDGT), organization (ORG), vendor (VNDR), transaction (TRANS), etc.
Prime word assignment, if done correctly, can also help in establishing the first level of data stewardship.
Modifier or qualifier: It defines and distinguishes prime and class words. It further describes the data object (entity or table) and its attributes (columns) beyond their classes and prime words, e.g., Employee-Name versus Employee-First-Name, where First is the modifying word.
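To make the roles of the three components concrete, the sketch below assembles a logical attribute name from a prime word, optional modifiers and a class word. The ordering (prime word, then modifiers, then class word) and the hyphen separator are inferred from the Employee example above; the function name build_attribute_name is hypothetical.

```python
def build_attribute_name(prime: str, class_word: str, modifiers=()) -> str:
    """Join prime word, modifiers and class word into a logical name.

    Assumed order: prime word, then modifiers, then class word.
    """
    parts = [prime, *modifiers, class_word]
    if any(not p.strip() for p in parts):
        raise ValueError("name parts must be non-empty")
    return "-".join(p.strip().capitalize() for p in parts)


print(build_attribute_name("employee", "name"))             # Employee-Name
print(build_attribute_name("employee", "name", ["first"]))  # Employee-First-Name
```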
Constructing Names in Data Models
You should build a library of entity classes, attribute classes, prime words and modifiers, along with their abbreviations, before you actually start building names in a data model. It is generally not possible to build a comprehensive and fully mature library at the beginning of enterprise modeling efforts, but the starting library should be mature enough not to require frequent changes later. The library of class words and prime words should be built first and should be the most mature of all the libraries.
In a typical organization, logical modeling is done first, and the physical data model is then derived from the logical model, depending on the physical database infrastructure the organization is targeting.
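As a rough illustration of deriving a physical name from a logical one, the sketch below replaces each word of a logical name with its library abbreviation. The abbreviations are the examples given earlier in this article; the underscore separator, the 30-character limit and the function name to_physical_name are assumptions that would in practice depend on the target database.

```python
# Abbreviation library; entries are the examples given in the article.
ABBREVIATIONS = {
    "account": "ACCT",
    "budget": "BDGT",
    "organization": "ORG",
    "vendor": "VNDR",
    "transaction": "TRANS",
    "quantity": "QY",
}


def to_physical_name(logical_name: str, max_length: int = 30) -> str:
    """Derive a physical name by abbreviating each word of a logical name.

    max_length is an assumed identifier limit for the target database.
    """
    words = logical_name.replace("-", " ").split()
    missing = [w for w in words if w.lower() not in ABBREVIATIONS]
    if missing:
        # Fail loudly so that new words are added to the library first.
        raise KeyError(f"no abbreviation registered for: {missing}")
    physical = "_".join(ABBREVIATIONS[w.lower()] for w in words)
    if len(physical) > max_length:
        raise ValueError(f"{physical!r} exceeds {max_length} characters")
    return physical


print(to_physical_name("Vendor Account Quantity"))  # VNDR_ACCT_QY
```

Raising an error on an unregistered word, instead of truncating it on the fly, forces every abbreviation through the shared library, which is what keeps names consistent across models.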
Names in logical and physical data models should:
- Be meaningful.