Logical Data Modeling: A Key to Successful Enterprise Data Warehouse Implementations
Information Management Magazine, September 2006
Increasingly, large enterprises are recognizing the value of an enterprise data warehouse (EDW) in their information and knowledge strategies. The potential benefits include cost-effective consolidation of data for a single view of the business and creation of a powerful platform for everything from predictive analysis to near real-time strategic and tactical decision support throughout the organization.
Too often, a company's struggles with designing and implementing a logical data model (LDM) are the source of the problem. The LDM is of vital importance, largely because other key components of enterprise data management rely on it. It is especially disturbing that the frustrations around LDM implementations are often preventable, having more to do with people, process and organizational concerns than with the technology itself. If companies can better understand what is tripping them up when they create and implement LDMs, they will realize the full potential of their data warehousing projects much more quickly.
Why the LDM Matters
An LDM is a representation of business concepts laid out in a visual format that clearly shows these concepts and their various relationships. It is independent of the underlying database implementation. Designing an LDM that fits the needs of the business is crucial, not just because it reflects the commitment to treat data as a true enterprise asset but also because it enables efficient and effective storage of data that businesses can readily access to create various information and knowledge products. Conversely, a poorly designed LDM negatively affects many EDW components, making rework quite expensive. The LDM affects:
Advertisement
- The design of data movement techniques and standards, including extract, transform and load (ETL) and enterprise application integration (EAI), degree of normalization, use of surrogate keys and cardinality;
- Reference data;
- Business and technical metadata and the metadata repository; and
- Packaged applications, such as enterprise resource planning (ERP) or HR systems, which have their own LDMs that, in some instances, must work together with the overall enterprise model.
LDMs also serve as a foundation for data quality. Models that don't follow first normal form or have the wrong relationships often store duplicate data, resulting in loss of data quality. Proper modeling of items such as domains and data types helps validate data quality checks. LDMs also must comply with data governance guidelines and any overall data standards in the enterprise. The amount of history stored in a warehouse depends on the design of LDMs and physical data models (PDMs), thus influencing the storage strategy.
The Data Modeling Process
Building an EDW environment is not unlike building a house, which often includes a three-step modeling process that begins with a sketch showing the various rooms. This initial plan drawing does not include detailed construction-specific information such as the type of materials needed, but does include details such as room dimensions, plumbing details and electrical details. Think of this first sketch as the LDM.
When building a house, the architect hands the sketch over to a construction engineer, who creates detailed plans to construct the house. The construction engineer must apply his knowledge of materials and other factors to narrow down the design choices. The end product is an engineer's blueprint that specifies the various materials. For example, if he is using copper tubing for hot water plumbing, this drawing would specify the gauge and diameter of the pipe, among other things.
Finally, there may be many people living in the house with their own expectations about features and functionalities. Also, the same underlying feature may have different applications across the house. For example, even though water is used throughout the house, in the kitchen you need a standard sink and faucet with hot and cold water, in the bathroom you need a shower and sink, and in the swimming pool you only need cold water with an additional filtration/chlorination system. Just as you don't build a plumbing system to satisfy one room or person, you don't build a data infrastructure to satisfy the needs of just one business user. The semantic layer acts as the interface to the specific uses and needs of a business and uses the language/semantics that are understood by specific business areas.
Actual Modeling Processes
Logical data models: In an ideal scenario, the development of an LDM begins with a broad set of data requirements for the EDW. Using the requirements along with interviews with subject matter experts (SMEs) creates a conceptual model showing the relationships between key entities.
The next stage is detailed attribution, where you assign primary keys and verify the cardinality, optionality and identifiability of the various relationships. Detailed attribution is created based on the data available in the enterprise for specific applications. This involves mapping data elements in the source environment to the LDM. Further, with the help of SMEs, you also develop metadata (including domains, data types, definitions, comments and notes). Following this, peers and SMEs typically review the model. After review and, if necessary, revision, the model becomes available for the next step in the modeling continuum: the PDM.
Physical data models: For PDMs, data architects create index structures, perform selected denormalizations, create aggregations/summary tables and conduct performance tuning. They also handle staging area tables as part of physical data modeling. Because architects optimize PDMs for the underlying database architecture, PDMs are not easily interchangeable across database platforms.
Semantic data models (SDMs): Designed for specific end-user applications, SDMs give business users a view into the database that reflects specific business area semantics or terminology. SDMs are a layer on top of the PDM that uses database views as well as additional summary or other minor tables. If companies use reporting or OLAP tools, the semantic layer sometimes reflects those tools' characteristics.
Five Key Challenges to Designing and Implementing an LDM
Page 1 of 3.







