
Is Data Modeling Still Relevant?

InfoManagement Direct, May 28, 2009

Bert Scalzo

Some people believe data modeling has become passé. The reasoning is that because data modeling theory is more than 30 years old, and because some data modeling tools have been around for 10 to 20 years, data modeling is somehow no longer relevant. Nothing could be further from the truth. In fact, data modeling may now be more necessary than ever.

While there are other modeling techniques and notations, such as business process modeling and Unified Modeling Language, the need to accurately capture business data requirements and transform them into a reliable database structural design is as paramount as ever. The key differentiator is that data modeling is the only technique and notation that focuses on the “data at rest.” All the others tend to focus more on “data in motion.” Put another way, data modeling concentrates on issues that lead to a solid database design, while other approaches tend to focus on issues that result in better application design or on things useful to programmers, such as data structures, objects, classes, methods and application code generation.

Case in point: I’ve personally served as an expert witness in several court trials where plaintiffs sued defendants for substantial financial damages when custom database applications had performance and/or data accuracy problems. In every case, there was a failure to data model the business requirements, so data effectiveness suffered. Moreover, ad hoc database design, or database design using more programming-oriented techniques and tools, often resulted in an inefficient database design. No amount of coding could overcome the resulting bad database design. So, in every case, the plaintiff won.

Another reason data modeling has seen a measurable resurgence is the data warehousing phenomenon. With cheap storage these days, most companies can afford, and benefit from, retaining historical aggregate and/or summary data for making significant strategic decisions. With numerous legacy online transaction processing (OLTP) source systems having accumulated, there are two key ways to approach populating a data warehouse: directly from source to warehouse (as shown in Figure 1) or through an intermediary database often referred to as an operational data store (as shown in Figure 2).
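The two loading paths can be illustrated with a minimal Python sketch. Everything in it is assumed for illustration: the source system names, the row layout and the functions `summarize`, `load_direct` and `load_via_ods` are hypothetical and do not come from any particular tool. It shows only that both routes can arrive at the same summary data, while the operational data store route keeps an intermediate, detail-level copy tagged with its origin.

```python
# Hypothetical legacy OLTP sources; names and rows are invented for illustration.
SOURCES = {
    "billing": [{"cust": "A", "amount": 10.0}, {"cust": "B", "amount": 5.0}],
    "web_shop": [{"cust": "A", "amount": 2.5}],
}


def summarize(rows):
    """Aggregate detail rows into per-customer totals (the warehouse summary)."""
    totals = {}
    for row in rows:
        totals[row["cust"]] = totals.get(row["cust"], 0.0) + row["amount"]
    return totals


def load_direct(sources):
    """Approach 1: read every source and load the warehouse directly."""
    rows = [row for system in sources.values() for row in system]
    return summarize(rows)


def load_via_ods(sources):
    """Approach 2: stage detail rows in an operational data store first."""
    ods = []
    for name, system in sources.items():
        for row in system:
            # Keep the detail row and record which system it came from.
            ods.append({**row, "source": name})
    return ods, summarize(ods)


WAREHOUSE_DIRECT = load_direct(SOURCES)
ODS, WAREHOUSE_VIA_ODS = load_via_ods(SOURCES)
```

In this toy case the two warehouses are identical; the difference is that the second path leaves behind a consolidated detail store that other consumers could query.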



Considerable debate exists as to which approach is superior, but I won’t address that here. Regardless of which approach is selected, the database design (i.e., the data at rest) is paramount because, in a data warehouse, the data itself - and the business information it contains - is the most relevant and valuable asset. Typical data warehouse queries and reports issued via business intelligence (BI) tools process that asset to yield strategic decision-making results.

Another key area where data modeling often supports the whole data warehousing and BI effort is the mapping of legacy data fields to their data warehouse (DW) and BI counterparts. This metadata about how frontline business data maps to the data warehouse helps with the design of queries and reports, as well as with extract, transform and load (ETL) programming efforts. Without such a mapping, there would be no automatic tie to the dependent data warehousing information as OLTP legacy systems evolve. One would then have to re-engineer almost everything, rather than simply follow the ramifications of an OLTP source data change as they ripple downstream to the DW and BI endpoints.
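As a rough sketch of what such a mapping makes possible, the Python below stores source-to-target field pairs and walks them to list everything downstream of a changed OLTP field. All table, column and report names, and the helpers `build_graph` and `downstream`, are hypothetical; real modeling tools keep this metadata in their own repositories and formats.

```python
from collections import defaultdict

# Hypothetical field-level mapping: (source item, item derived from it).
# Every name here is invented for illustration.
FIELD_MAP = [
    ("orders.cust_id", "dw.fact_sales.customer_key"),
    ("orders.amount", "dw.fact_sales.sale_amount"),
    ("dw.fact_sales.sale_amount", "bi.monthly_revenue_report"),
    ("dw.fact_sales.customer_key", "bi.customer_churn_report"),
    ("customers.region", "dw.dim_customer.region"),
]


def build_graph(pairs):
    """Index the mapping so each item points at the items derived from it."""
    graph = defaultdict(set)
    for source, target in pairs:
        graph[source].add(target)
    return graph


def downstream(graph, field):
    """Return, sorted, every DW or BI item reachable from the given field."""
    seen = set()
    stack = [field]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return sorted(seen)


GRAPH = build_graph(FIELD_MAP)
# What is affected if the OLTP field orders.amount changes?
IMPACT = downstream(GRAPH, "orders.amount")
```

Here `IMPACT` lists the warehouse column fed by `orders.amount` and the report built on that column, which is the kind of downstream ripple the mapping lets a team follow instead of rediscovering by hand.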

For those not involved with data warehousing projects - perhaps those performing more traditional OLTP-type systems development - data modeling is still important. Often, however, people get so caught up in novel paradigms such as extreme programming, agile software development or scrum that they compromise data modeling, or even skip it entirely. The problem is that these new approaches don’t always spell out exactly how data modeling should be incorporated, so people often forgo it.

My belief is that no matter what latest and greatest approach you use, data modeling should be integrated into your development process wherever it makes sense. Figure 3 shows how both conceptual and physical data modeling should fit into an overall database design process - whether it’s for a totally new system or for one that’s being updated or re-engineered.


There is one final reason why data modeling has been getting more attention these days. In many cases, organizations finally are requiring data models as a sign-off deliverable of the development process. I attribute this to their attempt to adhere to the Software Engineering Institute’s Capability Maturity Model and Capability Maturity Model Integration concepts. The idea here is quite simple: to mature your development process regardless of technique, you need to develop in terms of both the processes and tools used to achieve the desired better end result. Both processes and tools can lead to maturity, helping to reinvigorate many people’s interest in data modeling.
