Mixing Privacy and Business Intelligence

  • September 01 2001, 1:00am EDT

Privacy is an issue that many IT professionals would like to ignore. They usually accept that privacy is a real issue, but it is one that can complicate their professional lives in unexpected ways. Whether you are part of corporate IT or an IT vendor, privacy is a huge issue for you, not to be ignored.

At a seminar in Denver covering the Impacts of Privacy on Business, Christine Varney and her colleagues at Hogan & Hartson L.L.P. painted the changing legal landscape of privacy in vivid (and sometimes chilling) colors. Recent legislation has imposed database restrictions on financial service firms,1 health-related firms,2 and commercial Web sites dealing with children.3 Add the legal activities of state governments and private class action suits, and privacy is currently a lively area.

What is privacy? The term is loaded with emotion and blurred by culture; however, the concept is simple – control over personal information.

The majority of people in the United States are pragmatic about their privacy. They are concerned about it, with many fearing the theft of their identity; however, they are willing to share personal data with a business if they trust the firm and perceive a reasonable benefit in return.

In Europe, people are more conservative, believing that their privacy is a basic right not to be violated by any company or government. The European Union (EU) has restrictions on the collection and use of personal data within Europe and on its transfer to other countries. The countries receiving personal data from Europe must adequately protect that data. Unfortunately, the EU has judged the United States as having insufficient protection; therefore, business data transferred from Europe requires special arrangements.

Five Elements of a Privacy Policy

  1. Notice regarding how your personal information will be used.
  2. Choice to opt in (permit use) and to opt out (deny further use).
  3. Security to protect personal data from unauthorized use.
  4. Access to one's personal data to review and ensure accuracy.
  5. Enforcement of these policies for handling personal data.

What are the implications for business intelligence? Imagine that you have just implemented a new CRM application within the corporate portal. Now everyone is empowered to slice and dice the customer database. Is all well in IT land?

If certain data can uniquely identify a person, it must be specially treated as personally identifiable data (PID). Information containing a Social Security number obviously constitutes PID. However, it is not clear which combinations of data items will result in PID. For instance, knowing the ZIP code and date of birth will often uniquely identify a person in today's huge marketing databases. There are no precise criteria for PID; the legal determination rests on a statistician certifying that there is or is not sufficient probability that a person can be uniquely identified.
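The ZIP code and date of birth example can be made concrete with a rough uniqueness check. The sketch below is only an illustration: the function name `unique_fraction`, the field names and the sample rows are all hypothetical, and a count like this is a screening aid, not the statistical certification described above.

```python
from collections import Counter

def unique_fraction(records, keys):
    """Return the fraction of records whose combination of `keys` occurs only once."""
    if not records:
        return 0.0
    # Count how many records share each combination of the chosen fields.
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)

# Hypothetical sample data, invented for illustration only.
customers = [
    {"zip": "80202", "dob": "1960-03-14", "segment": "A"},
    {"zip": "80202", "dob": "1960-03-14", "segment": "B"},
    {"zip": "80202", "dob": "1971-11-02", "segment": "A"},
    {"zip": "80301", "dob": "1960-03-14", "segment": "C"},
]

print(unique_fraction(customers, ["zip"]))         # share unique on ZIP alone
print(unique_fraction(customers, ["zip", "dob"]))  # share unique on ZIP + date of birth
```

In this toy data, adding date of birth to ZIP code raises the share of uniquely identifiable rows, which is the effect the paragraph above describes.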

For BI, the complementary situation is equally important. The procedures for "de-identifying" data are critical in warehouse environments where aggregations are the basis for various embedded analytics. Reliable procedures for de-identifying data are not obvious. Further, the issue of data destruction takes on unusual complexity in the warehouse where historical context must be maintained while the underlying PID should be destroyed for legal reasons.
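One common family of techniques, offered here as a sketch and not as a legally sufficient procedure, replaces direct identifiers with a keyed token and coarsens the remaining fields. The field names, the `de_identify` helper and the choice of HMAC-SHA256 are assumptions made for illustration. The token keeps historical rows for the same customer linkable; destroying the secret key later removes the practical means of tying tokens back to source identifiers.

```python
import hashlib
import hmac

def pseudonym(customer_id: str, secret_key: bytes) -> str:
    """Derive a stable token from an identifier using a keyed hash (HMAC-SHA256)."""
    return hmac.new(secret_key, customer_id.encode("utf-8"), hashlib.sha256).hexdigest()

def de_identify(row: dict, secret_key: bytes) -> dict:
    """Return a copy of a (hypothetical) transaction row with PID removed or coarsened."""
    return {
        "customer_token": pseudonym(row["customer_id"], secret_key),
        "zip3": row["zip"][:3],        # keep only the first three ZIP digits
        "birth_year": row["dob"][:4],  # keep only the year from a YYYY-MM-DD date
        "amount": row["amount"],       # the measure needed for aggregation
    }

# Hypothetical row and key, invented for illustration only.
key = b"example-key-kept-outside-the-warehouse"
row = {"customer_id": "C-1001", "name": "Pat Doe", "zip": "80202",
       "dob": "1960-03-14", "amount": 42.50}

print(de_identify(row, key))
```

Whether coarsened fields like these are still identifying in a given database is exactly the kind of question that needs statistical and legal review.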

In the early days of data warehousing, there was no privacy issue since the data was highly aggregated, stripping it of personal identification. With the trend toward transaction-level data in real time, most warehouses now contain some complex mixture of PID.

For instance, one large retail firm boasted that its warehouse is loaded with the sales transaction by the time its customer leaves the parking lot. With many e-commerce systems, the privacy situation is more tenuous. Using Internet marketing companies such as DoubleClick, Web site visit data can be correlated with past visits and with visits to hundreds of other Web sites. If a person provided PID at any of these sites, that person can now be uniquely identified. Further, content providers can enhance customer data with demographics and segmentation that could easily result in PID.

BI tools have become powerful and universally available. From a privacy perspective, drill-down capabilities now have an expanded meaning. What limitations do your BI tools place on PID aggregations where sample size becomes small? What about data mining? Pattern detection and clustering may not violate privacy restrictions. However, reporting at the atomic level based on those patterns or clusters may result in interesting PID.
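The small-sample question can be illustrated with a minimum cell size rule, where an aggregate is withheld if too few records sit behind it. This is a minimal sketch: the threshold of 5, the function name and the sample data are assumptions, not a standard drawn from any statute or product.

```python
from collections import defaultdict

MIN_CELL_SIZE = 5  # assumed threshold; the right value is a policy decision

def aggregate_with_suppression(rows, group_key, value_key, min_size=MIN_CELL_SIZE):
    """Average `value_key` per group, returning None for groups below `min_size`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    report = {}
    for group, values in groups.items():
        if len(values) < min_size:
            report[group] = None  # suppressed: too few records to report safely
        else:
            report[group] = sum(values) / len(values)
    return report

# Hypothetical sales rows, invented for illustration only.
sales = [{"city": "Denver", "amount": a} for a in (10, 20, 30, 40, 50)]
sales.append({"city": "Boulder", "amount": 99})

print(aggregate_with_suppression(sales, "city", "amount"))
```

A drill-down that reaches a suppressed cell would show no value there, so the single Boulder transaction is never exposed through the average.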

Unfortunately, there is little meta data support for PID restrictions in most generic BI products. The vendors should assume leadership in understanding and tracking privacy restrictions. Then, they should engineer their products to support those trends, facilitating compliance for IT groups.
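As a rough picture of what such meta data support might look like, the sketch below tags each column as PID or not and lets that tag decide what a given role may see. The `ColumnMeta` class, the catalog entries and the role names are all hypothetical and do not describe any vendor's product.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ColumnMeta:
    """Hypothetical meta data record describing one warehouse column."""
    name: str
    is_pid: bool
    allowed_roles: frozenset = frozenset()  # roles with a need to know this PID

# Hypothetical catalog, invented for illustration only.
CATALOG = [
    ColumnMeta("ssn", is_pid=True, allowed_roles=frozenset({"compliance"})),
    ColumnMeta("dob", is_pid=True, allowed_roles=frozenset({"compliance", "service"})),
    ColumnMeta("region", is_pid=False),
    ColumnMeta("amount", is_pid=False),
]

def visible_columns(catalog, role):
    """List the columns a role may see: all non-PID columns plus permitted PID."""
    return [c.name for c in catalog if not c.is_pid or role in c.allowed_roles]

def project(row, catalog, role):
    """Drop every field the role may not see; fields missing from the catalog are dropped too."""
    allowed = set(visible_columns(catalog, role))
    return {k: v for k, v in row.items() if k in allowed}

row = {"ssn": "000-00-0000", "dob": "1960-03-14", "region": "West", "amount": 42.5}
print(project(row, CATALOG, "analyst"))
print(project(row, CATALOG, "compliance"))
```

Because the restriction lives in the meta data and not in each report, the same rule could in principle be applied wherever the column appears in the architecture.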

Do you feel comfortable with your BI systems' handling of data that identifies specific individuals, whether they be customers, employees or other persons?

BI Bottom Line

  1. Your company (and possibly you personally) may be liable for its procedures in handling PID. Seek legal advice about the precise areas of liability. Then, create a privacy policy that is widely posted for employees, customers and business partners. Who has a need to know for PID and for what purpose? When is data legally considered PID? What are the valid methods for de-identifying and destroying PID within your systems?
  2. Are your BI systems compliant with the Gramm-Leach-Bliley Act? By July 1, 2001, your firm should have been compliant with this act if it provides financial services, broadly defined. If you are not sure whether your firm is compliant, now is the time to make sure. Seek legal advice even if your company is not normally considered a financial services firm.
  3. IT vendors should seize this opportunity to make their products "privacy-aware" so that their customers can better manage PID and its special restrictions. Adequate meta data support that drives privacy restrictions throughout the architecture is required.


1. Gramm-Leach-Bliley Financial Services Modernization Act of 1999 (GLB). For further information see or

2. Health Insurance Portability and Accountability Act of 1996 (HIPAA). For further information see:

3. Children's Online Privacy Protection Act of 1998 (COPPA). For further information see: http:// or

4. Taken from slides by Christine Varney of Hogan & Hartson L.L.P., June 2001.
