Policies are not everyone’s favorite topic. They often conjure up an image of heavy-handed bureaucratic interference that makes getting things done more difficult and adds to the workload of already overstretched staff. Add to this the criticism that policies are devised by people who have little understanding of the operational settings in which the policies must be followed.

But just how fair is this view? I acknowledge that policies done poorly can justifiably be thought of as creating more work for no overall gain. However, I would argue that there is a much greater problem: The average enterprise does not have nearly enough policies for data management, and this famine of policies has an enormous detrimental impact on the way the average enterprise manages its data.

What Is a Policy?

For most of my career I was confused as to what exactly a policy is. For instance, I remember being told by a DBA that assigning me read access to a particular database was in itself a policy. This of course is far too low a level for a policy, but just what are policies? Fortunately, my good friend Susan Garza eventually set me straight. Her definition of a policy (slightly tweaked) is:

Policy: a high-level rule that states a business behavior that is enforceable and enforced.

Essentially, a policy states what to do or what not to do. Furthermore, good policies are built as a collection of individual policy statements, where each statement is an individual high-level rule. Unfortunately, many policies are not actually written like this.

A policy does not state how to do something or what controls to put in place to make sure that something is not done. This is a somewhat controversial assertion, and to understand why it is so in data management we need to take a brief look at the sociology of the enterprise.

Data Policies, IT and Operational Environments

Policies about data management are unlikely to come out of IT. IT sees itself as an engineering undertaking that builds systems according to requirements supplied by users. Why a system would need more than a user guide about how to use it is puzzling to IT. Therefore, we cannot expect IT to develop any data management policies.

The enterprise outside of IT, collectively known as “the business” to IT, is actually divided into a number of different groups. For the purposes of policies, I would like to focus on just one group, Operations, although other groups (such as the front office, Legal and Finance) are also very important.

Operations, also known as the “back office,” has to run the systems that IT has developed or which the enterprise has acquired. In terms of data management, this can be quite difficult. Executive management regards Operations as a cost center that is to be made more efficient. If Operations goes to executive management to ask for help with its challenges and problems, executive management (at least in the USA) is generally unsympathetic and tells Operations to make the challenges and problems go away, at no cost. The result is that Operations is highly tactical and siloed. It has no choice. Further, in most enterprises, Operations is measured on quantity and timeliness, but not quality.

On top of all of this, each system that Operations has to manage is different.

Policy Implications

Because Operations is highly siloed, it is generally not possible for a policy to specify how the policy should be operationalized. Operationalization can only be done by competent personnel within each operational environment. Thus trying to write data management policies that specify how to do something is not only theoretically wrong (policies should not do this) but is practically impossible.

But the nature of Operations also helps us understand why there is a desperate need for policies. The enterprise changes every day, perhaps minimally, but there is still change. These changes impact the way Operations has to drive the systems it is responsible for. Management, as noted above, is deeply unsympathetic to this predicament, and IT cannot provide solutions in an acceptable time frame with acceptable cost (and a reasonable guarantee of not messing up). This leaves each operational environment to solve its own problems. The result is a large number of uncoordinated tactical “fixes” that accumulate over time.

For instance, suppose we have operational environments across the enterprise for onboarding customers. Let us further suppose that each such environment must select which minimum data elements must be populated to create a customer record. Two problems immediately arise.

First, if each operational environment makes its own decision about this, it will likely choose the bare minimum. After all, each environment will likely be judged on the number of customers it onboards, and every additional data element that has to be populated slows down the onboarding rate.

Second, there is no guarantee that what one environment sees as an essential data element is seen as an essential data element by another. A particular data element may be easy to populate in one operational environment, but more difficult to populate in another, so it is not considered essential in the latter. Again, this is driven by the siloed nature of Operations, and the typical ways in which its performance is measured.

Policies as the Solution

Now, suppose that our example includes a customer data policy that specifies what the minimally acceptable data elements are to establish a customer record. Each operational environment no longer decides what data elements it wants to collect to establish a customer record, and every operational environment has to collect the same set. Obviously, this is beneficial as a minimum standard of record completeness is set for the enterprise. Furthermore, systems downstream of where customer records are created get a uniform set of populated data elements, rather than a mosaic of missing data depending on where the records came from.

Of course, each operational area has to figure out how to operationalize the policy. Each may need to adjust the way in which it sources the customer data, how it validates that data, how it normalizes the content and so on. None of this is the concern of the policy itself, so there is still a lot of decision-making to be done at the operational level.
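For readers who think in code, the split between what the policy states and how each environment operationalizes it can be sketched roughly as follows. This is only an illustration under stated assumptions: the element names, the two onboarding environments and every function name are hypothetical and are not taken from any real policy or system.

```python
# Hypothetical policy statement: the minimum data elements needed to
# establish a customer record. It says WHAT is required, not HOW to get it.
REQUIRED_CUSTOMER_ELEMENTS = frozenset(
    {"legal_name", "date_of_birth", "country", "tax_id"}
)


def missing_elements(record: dict) -> set:
    """Return the required elements that are absent or blank in a record."""
    return {
        name
        for name in REQUIRED_CUSTOMER_ELEMENTS
        if record.get(name) is None or str(record[name]).strip() == ""
    }


def complies_with_policy(record: dict) -> bool:
    """A record complies only if every required element is populated."""
    return not missing_elements(record)


# Operationalization is left to each environment (both are assumptions):
# each one sources and maps its data in its own way.
def onboard_branch(form: dict) -> dict:
    """Branch environment: maps fields keyed in from a paper form."""
    return {
        "legal_name": form.get("Name"),
        "date_of_birth": form.get("DOB"),
        "country": form.get("Country"),
        "tax_id": form.get("TaxID"),
    }


def onboard_web(payload: dict) -> dict:
    """Web environment: maps fields from an online sign-up payload."""
    return {
        "legal_name": payload.get("name"),
        "date_of_birth": payload.get("dob"),
        "country": payload.get("country"),
        "tax_id": payload.get("tax_id"),
    }


branch_record = onboard_branch(
    {"Name": "Ada Lovelace", "DOB": "1815-12-10", "Country": "GB", "TaxID": "X123"}
)
web_record = onboard_web(
    {"name": "Alan Turing", "dob": "1912-06-23", "country": "GB"}
)

print(complies_with_policy(branch_record))   # True
print(sorted(missing_elements(web_record)))  # ['tax_id']
```

The same check applies to every environment, while the mapping functions differ. That mirrors the argument above: the policy fixes the required set, and each operational area decides how to meet it.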

This is just one example, but a lack of data management policies typically plagues enterprises. Data governance is a recent phenomenon compared to the decades of informal decision-making in Operations. Hence there is an urgent need for data governance to seize the initiative in the policy area and begin to end the policy famine in data management.