for Information Management Blogs
JUL 13, 2012 11:27am ET

Philosophical About Perfect Data

Last week’s column on Living With Imperfect Data drew good thoughts and responses. It postulated, “Sometimes it’s better to have everyone agreeing on numbers that aren’t entirely accurate than having everyone off doing their own numbers.”

People who commented or wrote to me talked about the areas that reflect their roles and how they feel responsible for data in their career.

Some sought truth in governing data entry in fields for product codes, income brackets or gender and were concerned about the high incremental costs of fixing the root causes of the last 10 or 5 percent of foundational data and whether it was worth it. That is important and nuanced work, as Jim Harris spoke to in his blog not long ago. 

But more readers were looking at the distinction between foundational data and the way it is assembled and interpreted once it has been reliably produced. Some focused on data definitions that led to the numbers chosen to steer decisions based on rules and policies. And others looked at the application of hierarchies and metadata that combine and recombine data reporting and performance measurement. 

A small but clear majority felt we are getting past the “single view” mentality as a prescription for all our data problems. Jon B wrote that there’s seldom one truth, especially in metadata, and that is just the nature of things. “Sales, operations, product, and even the customer look at things in different ways. They may 'agree' on what is in the ERP master, but will use different ways, usually Excel, to look at it the way they need to get things done. Plus things are changing in terms of customers and products so quickly a rigid data governance process may actually slow down progress.”

Wayne K. bought into Chai Lam’s view in last week’s article about stewardship being the heavy lifting of operationalizing data in the course of business, though, unlike Chai, he gave the job to IT. “Our goal should not be ‘a single version of the truth’ but how do we enumerate, rationalize and manage all the different versions of the truth. I see this as a driver for new essential work for IT people to do and that's a good thing.”

Peter P took a temporal view of “truth versioning” where veracity depends on the time frame. “A person may be located at one address today, but another tomorrow. In many business applications, we need to maintain both locations in relation to the time frame when they were ‘true.’”
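
Peter's point about keeping both addresses, each tied to the period when it was true, can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the `AddressVersion` record, the `address_on` helper and the sample addresses are hypothetical and do not come from Peter or from any particular product.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class AddressVersion:
    """One version of the 'truth' about where a person lived (hypothetical model)."""
    address: str
    valid_from: date           # first day this address was true
    valid_to: Optional[date]   # exclusive end date; None means still current


def address_on(history, day):
    """Return the address that was true on `day`, or None if no version covers it."""
    for version in history:
        started = version.valid_from <= day
        not_ended = version.valid_to is None or day < version.valid_to
        if started and not_ended:
            return version.address
    return None


# Made-up sample data: both locations are kept, each with its own time frame.
history = [
    AddressVersion("12 Elm St", date(2010, 1, 1), date(2012, 7, 1)),
    AddressVersion("98 Oak Ave", date(2012, 7, 1), None),
]
```

Asking `address_on(history, date(2011, 5, 5))` returns the old address and asking about a date after the move returns the new one, so neither version has to be overwritten to make the other true.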

He also took the wider view of attributes and entities as contextual states. “‘Customer’ is not the entity. A customer is a role played by a person, organization or group, which is the entity ... ‘Bob’ can be a student, a donor, a supplier, an employee and even a customer all at the same time ... So how can they possibly know the ‘truth’ about Bob across these multiple roles?”
And, we all want a piece of Bob for our own purposes, don’t we?
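
For readers who think in data models, the idea that the person is the entity and "customer" is just one of the roles that person plays might look something like the sketch below. The `Party` class and its method names are my own hypothetical choices for illustration, not a schema Peter proposed.

```python
from dataclasses import dataclass, field


@dataclass
class Party:
    """The entity: a person, organization or group (hypothetical model)."""
    name: str
    roles: set = field(default_factory=set)  # roles this party currently plays

    def add_role(self, role):
        """Record that this party plays `role`; adding a role twice is harmless."""
        self.roles.add(role)

    def plays(self, role):
        """Return True if this party currently plays `role`."""
        return role in self.roles


# One Bob, many roles, all held at the same time.
bob = Party("Bob")
for role in ("student", "donor", "supplier", "employee", "customer"):
    bob.add_role(role)
```

Each department can ask `bob.plays("customer")` or `bob.plays("donor")` for its own purposes while there is still only one record for Bob himself.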

Richard R said the needs and timeframe of the user outweigh the value of creating data suitable for all purposes. “A LOB manager in the FMCG industry can't wait until every minor adjustment has been processed to see their sales figures (timeliness), any more than a mortgage manager would make a loan decision based solely on a current snapshot without considering the customer's history (interpretation).”

By these parameters, Richard said, lower quality data can be more useful than perfect data. “As long as the information is good enough for the recipient to make sound business decisions, taking longer to make the data more accurate actually diminishes its value rather than improving it.”

Readers span a curve of preference between empirical and utilitarian views, perhaps the way a chemist’s view of quality and purpose is different from a butcher’s. And let’s be honest: compromise is part pragmatics and part politics.

That might make a good visual to show that managing data is not a guns-and-butter problem and that accuracy and compromise will continue to coexist across the span of information management.

Comments (3)
Certainly, Sales, Operations, Product, and even the customer look at things in different ways simply because each different way of looking happens within a particular context of a particular process or function. But when multiple versions of the truth pop up within the same context, there is a problem. Additionally, problems arise when truth is asserted without context.

Still, a certain single truth remains: Underlying all the varying enterprise contexts, processes, and functions is but one foundational, holistic, naturally and fully integrated enterprise structure. Mess up that truth to a great extent with disparate "silo" data solutions, then the thinking that "as long as the data are good enough to make sound decisions ..." merely rings as a rationale for inaction to do better to improve data quality and to do it continually.

Posted by Ed J | Monday, July 16 2012 at 1:26PM ET
Jim's article shows quite well that new categories of data occur and that new rules need to be defined.

Imperfect data required: Decisions are already taken when selecting the data we track, when selecting data and data structures for reports and not to forget when defining the decision fields itself. Just repeating the well-known following story: A drunk loses the keys to his house and is looking for them under a lamppost. A policeman comes over and asks what he's doing. "I'm looking for my keys" he says. The policeman states: "But there are definitely no keys here around". The drunk: "But searching with light is so much better". A free translation into Jim's topic: The more decisions extend into the future the less reports on structured 'accurate' data are really helpful. Two examples: To improve business processes it makes sense to check the free text communication of bookers, planners, dispatchers, invoice operators. To get a feeling of consumer markets the collection of customer pain points in blogs etc. is most useful. There are a lot of similar examples to get an idea about the health of suppliers or VIP customers.

Rules for imperfect data required: The use of data has always a 'legal' dimension. As Jim said you cannot always wait for all data covering a decision. A decision has an optimal time window to be taken, too early or in our case too late the additional costs can be tremendous. But if based on existing structured data these data should be accurate. Other rules need to be defined to treat and use unstructured data, Big Data, or other by nature imperfect data.
Posted by Peter K | Monday, July 16 2012 at 2:30PM ET