"It's none of your business." That phrase used to mean something, but increasingly our private concerns seem to be everybody's business. Nowhere is this more apparent than on the Internet, where we are constantly asked for everything from our ages to our salaries to our shoe sizes just to sign on to sites or obtain information from merchants.

Retailers do have a legitimate reason for wanting to know about us. Demographic information enables them to provide goods and services more tailored to our individual needs and to let us know when something comes up that might match our interests. However, for many of us, that's not a compelling enough reason to risk letting personal information get into the wrong hands.

As a result, more and more people are supplying Web sites with false information (age, address, family size, income, etc.) to avoid being identified in any way.

With research showing that approximately three-quarters of Internet users don't trust Web sites to guard their privacy, it's a safe bet that there's a tremendous amount of phony personal information floating around out there. Some place the misinformation figure at more than 50 percent. Ironically, this can work against the consumer's interests when a business plans future products and services based upon a flood of inaccurate data.

The problem, then, is one of balancing the needs of business against the privacy rights of consumers. For instance, how can companies mine information important to their businesses without learning specific things about us as individuals? How can companies build data systems that we can trust never to divulge such information?

Companies and organizations are working toward these goals. The World Wide Web Consortium (W3C), a standards consortium whose members include many information technology (IT) companies, has developed the Platform for Privacy Preferences (P3P), which provides a standard format for encoding a site's data-collection and data-use practices and ultimately will allow users to tailor preferences to their own needs. However, P3P is in its infancy, and it does not specify any mechanisms for ensuring that sites act according to their stated policies.

The solution to enforcing site privacy policies lies in a new approach to data management based upon the principle of "Hippocratic computing." This approach takes its name from part of the Oath of Hippocrates sworn to by doctors the world over: "And about whatever I may see or hear in treatment ... I will remain silent." Technology is becoming available that will enable businesses to show the same respect for the privacy of anyone whose data they collect.

As a concrete example, consider Hippocratic data mining. A series of rules is built into data-collection software that automatically turns the information into "lies." For example, if I tell a Web site that I'm 38 years old and earn $60,000 a year, what actually gets entered is a randomized value obtained by adding a random value within a predetermined range to the true value. I have no reason to enter misinformation because the software at the other end is actually doing it for me!
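The randomization step can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the actual software: the function name `randomize`, the choice of uniformly distributed noise, and the two ranges (plus or minus 20 years, plus or minus $25,000) are all hypothetical.

```python
import random

def randomize(true_value, spread, rng):
    """Return true_value plus uniform noise drawn from [-spread, +spread]."""
    return true_value + rng.uniform(-spread, spread)

rng = random.Random(2024)  # fixed seed so the demo is repeatable

# Hypothetical predetermined ranges (assumptions, not values from the article):
# ages are blurred by up to 20 years, incomes by up to $25,000.
AGE_SPREAD = 20
INCOME_SPREAD = 25_000

# The site stores only the blurred values, never the true 38 and 60,000.
stored_age = randomize(38, AGE_SPREAD, rng)
stored_income = randomize(60_000, INCOME_SPREAD, rng)
print(f"stored age: {stored_age:.1f}, stored income: {stored_income:.0f}")
```

A wider range hides each individual value more thoroughly, but it also leaves the miner with a blurrier picture to work from.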

How is this gobbledygook of any value to anyone? Using a series of mathematical guesses based partly on how the initial data was randomized, the mining program gradually reconstructs a realistic distribution of the true values, such as how many people were between the ages of 20 and 25 or 40 and 45. Demographic information such as this might be of great interest to a company in quest of 25-year-olds to buy its sports cars or computer games. The mining models will have some inaccuracy, but at a level the miners can accept. Meanwhile, the privacy of those who provided personal information remains intact.
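One way to picture this reconstruction is the rough Python sketch below. It assumes the noise was uniform within a known range and that true values are spread evenly inside each age bracket, and it refines a guess with a repeated Bayes-style update. The function names, the bracket boundaries and the sample population are all invented for illustration; the actual mining software may work quite differently.

```python
import random

def overlap(lo1, hi1, lo2, hi2):
    """Length of the intersection of the intervals [lo1, hi1] and [lo2, hi2]."""
    return max(0.0, min(hi1, hi2) - max(lo1, lo2))

def reconstruct_shares(observed, edges, spread, iterations=50):
    """Estimate the share of true values in each bracket from blurred values.

    Assumes each stored value is the true value plus uniform noise in
    [-spread, +spread]. Starts from equal shares and repeatedly reweights
    them by how well each bracket explains every stored value.
    """
    brackets = list(zip(edges, edges[1:]))
    shares = [1.0 / len(brackets)] * len(brackets)
    for _ in range(iterations):
        totals = [0.0] * len(brackets)
        used = 0
        for w in observed:
            # Chance of seeing w if the true value lay in each bracket
            # (up to a constant factor), weighted by the current guess.
            weights = [
                share * overlap(w - spread, w + spread, lo, hi) / (hi - lo)
                for share, (lo, hi) in zip(shares, brackets)
            ]
            norm = sum(weights)
            if norm == 0.0:
                continue  # value cannot be explained by any bracket; skip it
            used += 1
            for j, weight in enumerate(weights):
                totals[j] += weight / norm
        if used == 0:
            break
        shares = [t / used for t in totals]
    return shares

# Invented population: 70 percent aged 20-25 and 30 percent aged 40-45.
rng = random.Random(7)
true_ages = [rng.uniform(20, 25) for _ in range(1400)]
true_ages += [rng.uniform(40, 45) for _ in range(600)]
SPREAD = 10  # hypothetical predetermined noise range, in years
observed = [age + rng.uniform(-SPREAD, SPREAD) for age in true_ages]

edges = list(range(10, 60, 5))  # brackets 10-15, 15-20, ..., 50-55
shares = reconstruct_shares(observed, edges, SPREAD)
for (lo, hi), share in zip(zip(edges, edges[1:]), shares):
    print(f"ages {lo}-{hi}: {share:.2f}")
```

The program sees only the blurred ages in `observed`, never `true_ages`, yet it still produces an estimate of how the population divides across the brackets. The estimate is approximate, which is the inaccuracy the miners accept in exchange for privacy.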

Why have we not been using Hippocratic computing all along? The fact is that in the past, most databases were used to help us with inanimate populations such as inventories, shipping schedules, rates and prices. Only with the advent of the Internet and the limitless opportunities for people to interact with each other have we seen an explosion in the collection of "animate" data, along with the potential to abuse personal information. Essentially, we're still playing catch-up with this phenomenon.

And catch up we must. Privacy is no longer an option; it has become a business imperative. Companies will have to adopt privacy policies that feature ironclad technology for two reasons. First, because consumers are insisting upon privacy, companies that are early to embrace, and publicize, such privacy guarantees as Hippocratic computing are likely to enjoy an advantage over their competitors. Second, if businesses do not respond to market pull and regulate themselves, legislators will be more than happy to do it for them. We're already seeing movement toward privacy legislation in the medical and financial industries, two areas where people are most sensitive about the handling of personal information. In Europe, the OECD privacy guidelines are in place, and there is movement toward privacy legislation in several other countries including Canada, Australia and Japan. Clearly, proactive deployment of privacy-friendly technologies is desirable. Whether P3P is the ultimate answer or not, the W3C is committed to developing privacy standards and solutions.

Working with those standards, Hippocratic computing is an idea whose time has come. It will empower us to share personal information only on our own terms, and it will allow businesses to make decisions in a productive, non-intrusive manner. Thus, we will also be assured that the first tenet of the Hippocratic oath comes to pass: Do no harm.
