Building the Legacy Systems of Tomorrow
"Do you know one of the great things about XML?" a lead programmer almost whispered to me a year or so back. "Tell me," I said. "Well," he continued, "you can embed XML in existing character columns in databases, so you can carry a lot more information in the database without ever having to add any new columns. Think of all the time and effort you can save doing that!" The enthusiastic, almost wild, gleam in his eye told me that he was not going to listen to any sermons on normalization. Perhaps DBAs and IT departments could do a better job of allowing changes to database structures, but nothing can justify using XML to circumvent these controls. Unfortunately, this is not the only abuse of XML. Indeed, XML seems to put temptation in the way of programmers because it lends itself to two things that many, if not most, programmers love: hierarchies and parsing strings.
The hierarchical nature of XML is strangely reminiscent of the now almost forgotten days of hierarchical databases such as IMS. Program logic tends to work in hierarchies: one routine calls another, and stacks, heaps and call traces follow the same pattern. But just because programmers organize logic in hierarchies does not mean that hierarchies are good for everything. Indeed, business rules are atomic pieces of logic whose dependencies typically resemble networks more than hierarchies. Trying to fit business rules into XML structures, which I have seen attempted, may satisfy the artistic purity of many programmers but is unlikely to achieve anything practical.
The love of parsing strings is another bad habit that XML enables. Rather than pass individual parameters to subroutines or functions, huge payloads of XML are exchanged that then have to be carefully unraveled to extract relevant parameters. Prior to XML, it took discipline to have individual parameters, each with a specific ordinal position (or name) and data type. This has suddenly been replaced by single parameters of ever-growing and changing character strings of XML that can make spaghetti code look appealing by comparison. Trying to figure out what is being passed between routines is now far more difficult. It is true that there are aids that help in doing this, but they are more than offset by the poor design that XML enables.
Included in this is the "need" for serialization and deserialization of XML as it is exchanged. Knowing how to perform this clever trick seems to be presented as something that separates novice programmers from the cognoscenti of the craft. It is rarely discussed as a flaw that reduces performance, adds unnecessary maintenance and provides a breeding ground for bugs.
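To make the contrast concrete, here is a minimal Python sketch of the same trivial rule written twice: once with individual typed parameters, and once behind a single XML string that has to be deserialized, converted and serialized again. The function names, the element names (request, price, rate, result) and the discount rule itself are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal


def apply_discount(price: Decimal, rate: Decimal) -> Decimal:
    """Typed parameters: the contract is visible in the signature."""
    return price * (Decimal(1) - rate)


def apply_discount_xml(payload: str) -> str:
    """Same rule, but the contract is buried inside a character string."""
    root = ET.fromstring(payload)                 # deserialize the payload
    price = Decimal(root.findtext("price"))       # manual data type conversion
    rate = Decimal(root.findtext("rate"))
    result = ET.Element("result")                 # hypothetical response element
    result.text = str(apply_discount(price, rate))
    return ET.tostring(result, encoding="unicode")  # serialize the answer


direct = apply_discount(Decimal("200.00"), Decimal("0.10"))
via_xml = apply_discount_xml(
    "<request><price>200.00</price><rate>0.10</rate></request>"
)
# The caller has to parse yet again to get a usable number back.
round_trip = Decimal(ET.fromstring(via_xml).text)
```

Nothing in the signature of the second function says which elements it expects or what types they carry. A missing or renamed element only shows up at run time, inside the parsing code.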
One of my chief worries with building business rule engine functionality is performance. Users dislike poor performance, even if they are being provided with great functionality. Navigating the hierarchies of XML, parsing XML and doing data type conversion can potentially degrade performance. If I have to use XML, I will, but I want to make it as lean as possible.
Meta Data without Meaning
One of the great promises of XML was that it made data "self-describing" because the meta data that accompanied the data would describe what it meant. Unfortunately, this has not been the case on the many projects I have seen. I constantly run up against XML whose tag names say nothing about the business meaning of the data they enclose.
Basically, the meta data is treated like surrogate key values. It is something cooked up by a programmer to uniquely identify a piece of data, but it is next to impossible to find out what business reality (if any) it corresponds to. The same data can be represented by different meta data in different XML, because the meta data is simply being treated as a way of differentiating pieces of data being passed between two points in an application.
When it comes to business rules, I have found that different classes of rules have different kinds of meta data that apply to them. I prefer to see these different kinds of meta data represented as tables and columns in a repository. Yet I have also seen them represented as XML embedded in rule definitions. This hides the properties of the meta data needed for the rules and cements them in inflexible XML hierarchies. Having the meta data exposed for general use in a relational database (a repository) is a much better approach.
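As a rough illustration of the difference, the following Python sketch stores the same piece of rule meta data twice in an in-memory SQLite database: once embedded as XML in a character column, and once exposed as an ordinary table. The table names, column names and the sample rule are all hypothetical, and a real repository would be considerably richer.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rule (
        rule_id        INTEGER PRIMARY KEY,
        name           TEXT NOT NULL,
        definition_xml TEXT              -- meta data hidden in a string
    );
    CREATE TABLE rule_threshold (        -- the same meta data, exposed
        rule_id   INTEGER REFERENCES rule(rule_id),
        attribute TEXT NOT NULL,
        max_value REAL NOT NULL
    );
""")
conn.execute(
    "INSERT INTO rule VALUES (1, 'credit_limit_check', ?)",
    ("<rule><threshold attribute='balance' max='5000'/></rule>",),
)
conn.execute("INSERT INTO rule_threshold VALUES (1, 'balance', 5000)")

# Exposed meta data: one ordinary query finds every rule that uses 'balance'.
exposed = conn.execute(
    "SELECT r.name FROM rule r "
    "JOIN rule_threshold t ON t.rule_id = r.rule_id "
    "WHERE t.attribute = 'balance'"
).fetchall()

# Embedded meta data: every row must be fetched and parsed by the application.
embedded = [
    name
    for name, xml_text in conn.execute("SELECT name, definition_xml FROM rule")
    if ET.fromstring(xml_text).find("threshold[@attribute='balance']") is not None
]
```

Both approaches find the same rule, but only the first makes the meta data available to any tool that can issue a query, and only the first lets the database enforce data types and keys on it.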
Of course, I have to recognize that a lot of tools and techniques have now been built around XML, and the original arguments in favor of XML are still valid. There is no doubt that XML can be used successfully, and indeed we may be obliged to use XML in certain tools. However, it seems to take greater discipline to achieve success, and some issues with XML, such as bandwidth consumption, may never go away. Perhaps the greatest lesson is not to use XML in business rules projects just for the sake of XML and to understand that it does not eliminate the need for careful analysis and design.
Malcolm Chisholm, Ph.D., has over 25 years of experience in enterprise information management and data management and has worked in a wide range of sectors. He specializes in setting up and developing enterprise information management units, master data management, and business rules. His experience includes the financial, manufacturing, government, and pharmaceutical industries. He is the author of How to Build a Business Rules Engine, Managing Reference Data in Enterprise Databases, and Definitions in Information Management. He writes numerous articles and is a frequent presenter on these topics at industry events. Chisholm runs the websites http://www.bizrulesengine.com, http://www.refdataportal.com and http://www.data-definition.com. Chisholm is the winner of the 2011 DAMA International Achievement Award.