The Roots of Denormalization
For many years the principles of good database design have been well understood. Getting databases into third normal form, for instance, is a goal for practically every data modeler. Achieving good database design in practice is a different matter. First, data analysis can be difficult, and a true understanding of a business subject area may only come after a database goes live. The result may be design changes that include short-cuts trading off good design against keeping an application up and running.

Database designs can also get bent out of shape when programmers apply pressure to change a good logical design into something that is easier for them to deal with. Every data modeler I have spoken with has had battles of this kind with programmers. Programmers often invoke the specter of poor performance to support their views, although their real reasons may be a little more obscure. In any event, the rush to implement an application means that "do it quick" usually beats "do it right." The result is that most database designs are denormalized: they contain repeating groups in tables, multiple redundant implementations of the same column, columns that hold more than one piece of information, relationships that exist for some records in a table but are meaningless for others, and so on.
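To make these patterns concrete, here is a minimal sketch of one of them. The table and column names are hypothetical, but the shape is typical: several phone numbers packed into a single column, and an address "repeating group" implemented as numbered columns. The parsing logic at the end is exactly the kind of code a normalized design would have made unnecessary.

```python
import sqlite3

# Hypothetical denormalized table: a multi-valued 'phones' column and a
# repeating group implemented as numbered address columns.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        id     INTEGER PRIMARY KEY,
        name   TEXT,
        phones TEXT,   -- several values packed into one column, e.g. 'home:555-1111;cell:555-2222'
        addr1  TEXT, addr2 TEXT, addr3 TEXT   -- repeating group as numbered columns
    )
""")
conn.execute(
    "INSERT INTO customer VALUES (1, 'Acme Ltd', 'home:555-1111;cell:555-2222', '1 Main St', NULL, NULL)"
)

def get_phones(customer_id):
    """Unpack the multi-valued column; this parsing lives in the application,
    not the database, purely because of the denormalized design."""
    (packed,) = conn.execute(
        "SELECT phones FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    return dict(part.split(":", 1) for part in packed.split(";"))

print(get_phones(1))  # {'home': '555-1111', 'cell': '555-2222'}
```

With a separate, normalized phone table, `get_phones` would collapse to a single `SELECT`, and the database itself could enforce uniqueness and integrity of each number.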
Why Does It Matter?
The response to all this may be, "So what?" All enterprises have production applications working with databases that lack the purity of a truly normalized design. If programmers asked for a denormalized design, then the programmers have built the application around it. Perhaps they had to do more work, but they evidently preferred that to dealing with a fully normalized database. If the database design was changed after the application went live because of an original design flaw, the fix that was put in may be imperfect, but at least it works.
The problem with doing things in this way - effectively committing to support a denormalized design - is that an application ends up with more logic than is contained in the underlying business subject area. A lot of additional logic is needed just to do processing around the denormalized structure of the database. Because application logic is a black box whose inner workings are not easy to see, it is difficult to appreciate just how much of this kind of logic an application contains. Also, it is difficult to quantify how much time programmers spend on implementing real business rules versus creating program code needed purely to work with a denormalized database.
This means that an application built by hand-crafting program code contains one set of logic for business rules and another set of logic for dealing with denormalization. It may even contain further logic for dealing with such things as spaghetti code and the software used to build and run the application. All of this logic is often lumped together as "business rules," but in reality only a portion of it has anything to do with the business.
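The distinction between the two kinds of logic can be sketched in a few lines. This is a hypothetical example (the rule, the column names, and the threshold are all invented for illustration): the business rule itself is one line, while the surrounding code exists only because the same value is stored redundantly in two columns and older rows populated only one of them.

```python
# The genuine business rule: orders over a limit require approval.
APPROVAL_LIMIT = 10_000

def needs_approval(order_total):
    return order_total > APPROVAL_LIMIT

# Denormalization "plumbing": the total is stored redundantly, because a
# newer column ('order_total_v2') superseded an older one but historical
# rows were never migrated. The application must decide which copy to
# trust before the business rule can even run.
def effective_total(row):
    if row.get("order_total_v2") is not None:
        return row["order_total_v2"]
    return row["order_total"]

# A legacy row that only populated the old column:
row = {"order_total": 12500, "order_total_v2": None}
print(needs_approval(effective_total(row)))  # True
```

Only `needs_approval` reflects the business subject area; `effective_total` is pure navigation around the physical design, yet both would typically be lumped together as "business rules."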
Applying Business Rules Approaches
One of the consequences of denormalized database designs is that documenting the business rules around an application becomes tricky. Even if the business community supplies all the business rules, these will not represent everything that is going on in the application. Similarly, if programmers reverse-engineer the application, it is very difficult to decide which "rules" they extract really come from the business subject area and which are just there to make the application work. The problems that originate with denormalized database designs may mean that projects designed to document business rules end up with results that fall short of what was expected.
Implementing business rules engines can be an even more daunting proposition. Many rules engine products can be attached to existing databases, but the engines have to be "trained" to read from and, if necessary, update these databases. Denormalization inevitably means that additional "non-business rules" have to be defined in these engines. Again, the only practical way to find out what these "rules" are seems to be by reverse-engineering existing applications. After that, the rules have to be defined and tested. It all adds up to a lot of extra work and risk.
What can be done about all of this? Ideally, the possibility of using business rules approaches should be a driver for always creating as good a logical database design as possible. Such a design represents how the business sees its data and needs to change only as the business changes. If such designs are physically implemented, then business rules projects are bound to be made easier. For existing databases there is a need to understand the difference between the logical design that represents how the enterprise sees its data and the physical design that is actually implemented. In addition, understanding the set of application logic that navigates around the denormalized design is required. This may not be an easy answer, but knowing what needs to be done from the start is far better than embarking on a business rules project only to get bogged down later on.
Malcolm Chisholm, Ph.D. has over 25 years of experience in enterprise information management and data management and has worked in a wide range of sectors. He specializes in setting up and developing enterprise information management units, master data management, and business rules. His experience includes the financial, manufacturing, government, and pharmaceutical industries. He is the author of How to Build a Business Rules Engine, Managing Reference Data in Enterprise Databases, and Definitions in Information Management. He writes numerous articles and is a frequent presenter on these topics at industry events. Chisholm runs the websites http://www.bizrulesengine.com, http://www.refdataportal.com and http://www.data-definition.com. Chisholm is the winner of the 2011 DAMA International Achievement Award.