MAR 1, 2011

Data, Truth and Ethics

I encountered an interesting problem a little while ago while working with a group of business users who wanted to be able to account for a trade in a certain way in order to report it.

One way of handling an incorrectly booked trade is called the "cancel/correct" approach. In this method, the original trade record is flagged to indicate that it has been canceled, and certain attributes are updated to provide additional information about who did this and why. After that, a new correct trade record is generated. The users wanted to express this activity in terms of credits and debits, and the cancel/correct approach did not fit their needs. They wanted to keep the original record - let's say it was a buy trade - without marking it as canceled. Then they wanted to create an offsetting sell trade to net everything out to zero. Finally, they wanted to create a second buy trade record that correctly represented the trade.
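The difference between the two bookings is easier to see with records in hand. What follows is only a minimal sketch in Python: the `Trade` fields, the function names `cancel_correct` and `offset_and_rebook`, and the quantities are all hypothetical and are not taken from the users' system.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Trade:
    # Hypothetical, simplified trade record for illustration only.
    trade_id: int
    side: str  # "BUY" or "SELL"
    quantity: int
    canceled: bool = False
    canceled_by: Optional[str] = None
    cancel_reason: Optional[str] = None


def _find(book: List[Trade], trade_id: int) -> Trade:
    for trade in book:
        if trade.trade_id == trade_id:
            return trade
    raise KeyError(f"no trade with id {trade_id}")


def cancel_correct(book: List[Trade], bad_id: int, corrected_quantity: int,
                   user: str, reason: str) -> List[Trade]:
    """Flag the wrong trade as canceled, then add one corrected trade."""
    original = _find(book, bad_id)
    flagged = replace(original, canceled=True,
                      canceled_by=user, cancel_reason=reason)
    next_id = max(t.trade_id for t in book) + 1
    corrected = Trade(next_id, original.side, corrected_quantity)
    return [flagged if t.trade_id == bad_id else t for t in book] + [corrected]


def offset_and_rebook(book: List[Trade], bad_id: int,
                      corrected_quantity: int) -> List[Trade]:
    """Leave the wrong trade untouched, add an offsetting trade, then rebook."""
    original = _find(book, bad_id)
    opposite = "SELL" if original.side == "BUY" else "BUY"
    next_id = max(t.trade_id for t in book) + 1
    offset = Trade(next_id, opposite, original.quantity)
    rebooked = Trade(next_id + 1, original.side, corrected_quantity)
    return list(book) + [offset, rebooked]
```

Starting from a single wrongly booked buy, `cancel_correct` yields two records, one of them explicitly marked as canceled with who and why. `offset_and_rebook` yields three records, none of which says that anything was ever wrong.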

Years ago, I would have taken these "requirements" and cheerfully implemented them. Today, I have serious ethical misgivings about this kind of nonsense, having spent almost 20 years working on securitization technology. Given the consequences, there is an ethical dimension to how we manage data, and its basis lies in how we think about truth.

The Correspondence Theory of Truth

Aristotle defined truth as follows: "To say of that what is that it is not, or of that what is not, that it is, is false, while to say of that what is that it is, and of that what is not that it is not, is true" (Metaphysics, Book IV). Admittedly, this is a mouthful. Suppose I apply Aristotle's definition as I think of another Greek philosopher, Socrates, and I make the statements shown in the table. Two of the statements are true and two are false, and we can easily see how the correspondence theory of truth works.

Aristotle's definition of truth is easily applied to data. If data represents the reality it is supposed to represent, then it is true, but if it does not, it is false. This may sound a bit odd. In data management, we talk about data quality, but we do not talk about truth of data. In fact, truth sounds like a dangerous topic. If the data is false, then maybe we are doing something illegal.

The problem is, I do not see how we can put off thinking about this problem indefinitely. We have had almost 50 years of expanding computer infrastructures that touch on vast areas of our lives. The importance of data, and management of that data, has grown in that time. We may now even be entering a Golden Age of data. This means we are going to have to grow up in the way we do data management.

Let us go back to the user "requirements" for introducing a "fake" sell trade to offset a wrong buy trade in order to easily create credits and debits in the books. Immediate problems arise if we allow ourselves to think of this in terms of truth. First, a true sell trade never actually happened. Second, we are pretending that there were two buy trades. We have left the first one (the one wrongly recorded) as if it were correct. Then we have entered a second buy trade as if it were independent of the first one. Now, the users who want to book the trades with the credits and debits showing nicely are happy. But a report that shows the average number of trades per trader per day will show more trades than actually took place. The answer to that problem might be to filter out the fake trades. But why not tell the truth from the start?
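The reporting problem can be shown with a few lines of Python. This is a sketch under invented assumptions: one real buy of 90 units that was first booked wrongly as 100, records reduced to `(side, quantity, canceled)` tuples, and two hypothetical helper functions.

```python
# Each record is (side, quantity, canceled). All values are invented.
# In reality there was exactly one trade: a buy of 90, first booked as 100.
cancel_correct_book = [
    ("BUY", 100, True),   # wrong booking, flagged as canceled
    ("BUY", 90, False),   # corrected trade
]
offset_book = [
    ("BUY", 100, False),   # wrong booking, left as if it were correct
    ("SELL", 100, False),  # "fake" offsetting sell that never happened
    ("BUY", 90, False),    # second buy, booked as if independent
]


def net_position(book):
    """Sum signed quantities over records not flagged as canceled."""
    return sum(qty if side == "BUY" else -qty
               for side, qty, canceled in book if not canceled)


def reported_trade_count(book):
    """Count the records a naive report would treat as real trades."""
    return sum(1 for _, _, canceled in book if not canceled)


print(net_position(cancel_correct_book), net_position(offset_book))
print(reported_trade_count(cancel_correct_book),
      reported_trade_count(offset_book))
```

Both books net to the same position of 90, which is why the offsetting approach looks harmless from the accounting side. The trade counts differ, though: one under cancel/correct, three under the offsetting approach, against one trade that really happened.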

Once you begin to tell lies, you have to do so consistently, and it always takes more effort than telling the truth. This is a pragmatic consideration, not an ethical one. We should tell the truth because it is right, not because it takes less effort.

However, I am acutely aware that pragmatism is often stated as a reason for bending data out of shape, and that I am likely to be criticized by individuals who will claim there are only requirements and design solutions, and that considerations of truth are irrelevant to data. Nevertheless, I would hope that such individuals would at least think that the debate about whether data should reflect truth is worth having. The outcome of such a debate should provide important guidance in the area of data governance.

Malcolm Chisholm, Ph.D., has over 25 years of experience in enterprise information management and data management and has worked in a wide range of sectors. He specializes in setting up and developing enterprise information management units, master data management, and business rules. His experience includes the financial, manufacturing, government, and pharmaceutical industries. He is the author of How to Build a Business Rules Engine, Managing Reference Data in Enterprise Databases, and Definitions in Information Management. He writes numerous articles and is a frequent presenter on these topics at industry events. Chisholm runs the websites http://www.bizrulesengine.com, http://www.refdataportal.com and http://www.data-definition.com. Chisholm is the winner of the 2011 DAMA International Achievement Award.
