Data warehousing systems present special security issues including:

  • The degree of security appropriate to summaries and aggregates of data,
  • The security appropriate for the exploration data warehouse, specifically designed for browsing and ad hoc queries, and
  • The uses and abuses of data encryption as a method of enhancing privacy.

Many data structures in the data warehouse are completely devoid of sensitive individual identities by design and, therefore, do not require protection appropriate for the most private and sensitive data. For example, when data has been aggregated into summaries by brand or region, as is often the case with data warehousing, the data no longer presents the risk of compromising the private identities of individuals. However, the data can still have value as competitive intelligence of market trends, and thus requires careful handling to keep it out of the hands of rival firms. Relaxed security does not mean a lack of commitment to security. The point is that differing levels of security requirements ought to remind us that one-size-fits-all solutions are likely to create trouble.
Another special security problem presented by data warehousing is precisely the reason why such systems exist. Data warehouses are frequently used for browsing and exploring vast reams of data ­– undirected exploration and knowledge discovery is provided by an entire class of data mining tools. The point is to find novel combinations of products and issues. Whether authentic or mythical, the example of market basket analysis whereby diapers are frequently purchased with beer is now a classic case. The father going to the convenience store for "emergency" disposable diapers and picking up a six-pack on the way out suggests a novel product placement. The point is that it is hard to say in advance what restrictions would disable such an exploratory data warehouse; therefore, the tendency is to define an unrestricted scope to the exploration. A similar consideration of undirected knowledge discovery applies to simple ad hoc access to the data warehouse. Examples where a business analyst uses end-user self-service tools such as those by Business Objects, Information Builders, Cognos or Oracle to issue queries directly against the data without intermediate application security give the end user access to all the data in the data warehouse. Given privacy and security imperatives, it may be necessary to render the data anonymous prior to unleashing such an exploratory, ad hoc process. That will create complexity where the goal is (sanctioned) cross-selling and up-selling. The identity must be removed in such a way that it can be recovered, as the purpose is often to make an offer to an individual customer.

Encryption of data has its uses, especially if the data must be transmitted over an insecure media such as the Internet. There, Secure Sockets Layer (SSL), which is an implementation of X.509 public-private key cryptography, serves well in transmitting credit card numbers, passwords and other identifying data. However, encryption is a poor method of access control. An employee, his or her manager and the human resources clerk all require access to the employee's record. Therefore, encrypting the data will not distinguish between their access levels. It is misguided to believe that if encrypting some data improves security, encrypting all the data improves security even more. Blanket, global encryption degrades performance, lessens availability and requires complex encryption key administration. Encryption is a computationally intense operation. It may not impact performance noticeably when performed for one or two data elements; but when performed arbitrarily for an entire table, the result may very well be a noticeable performance impact. It might make sense to encrypt all the data on a laptop PC that is being taken off site if the data is extremely sensitive. If the PC is lost or stolen, only encryption will guarantee that the data is not compromised. However, an even better alternative would be selective encryption and organizational steps to make sure the physical site is secure and the media containing the data is handled diligently.

As a general rule, proven security practices and solutions developed to secure networks will be appropriate and extended to protect the data warehouse. In other cases, data warehouses present special challenges and situations because the data is likely to be the target that encourages hackers to try to gain access to the system. These practices extend from organizational practices to high-technology code. The requirement for authentication implies certain behavior – on-site staff should wear their corporate identification badges and be required to sign an agreement never to share a user ID or password with anyone. Based on your enterprise's specific confidentiality rules, make selective use of new areas where technologies are still emerging. It is essential that database administrators work together with their security colleagues to define policies and implement them using the role-based access control provided with the standard relational database data control language (DCL).

Register or login for access to this item and much more

All Information Management content is archived after seven days.

Community members receive:
  • All recent and archived articles
  • Conference offers and updates
  • A full menu of enewsletter options
  • Web seminars, white papers, ebooks

Don't have an account? Register for Free Unlimited Access