I recently attended a lecture by Professor Jack Balkin of Yale Law School. He was speaking on "Big Data Law and Policy" - one of a small number of topics that might get me into a legal lecture.

I'll begin by saying I enjoyed the lecture; he used Asimov's Robot series of novels to help explain that what matters most about algorithms is the people who implement them and the ways they affect people – everything else is far less significant.

He also pointed out that in the popular culture, there's a pronounced tendency for people to behave as if there are tiny little people inside our machines, pulling levers, and occasionally going rogue and wreaking havoc. This is, of course, nonsensical – but it's nevertheless an accurate observation about the feelings some of us have regarding autonomous, algorithmic processes, whether it's self-driving cars or a market-segmentation process in a retail company.

After the primary lecture, Professor Frank Pasquale of the University of Maryland School of Law offered commentary. He provided some historical context on the current regulatory environment and how it may be applied going forward as the legal landscape around our analytic processing evolves.

A core observation made by Professor Balkin was that an entity that collects data should be considered to have a fiduciary responsibility toward the provider of that data. This responsibility is twofold: a "duty of care" and a "duty of loyalty" (expressed more colloquially as "you can't act like a con man"). He noted that the relationship is typically asymmetric – as a consumer, one has little choice but to comply with requests for information from some quarters, and little to no insight into how that information is handled.

He also postulated that a third party receiving data is not in a fiduciary role, but has an obligation to avoid becoming an "Algorithmic Nuisance" - to manage the information in a way that is consistent with the public good.

The commentary by Professor Pasquale included a suggestion that we may need to develop something of a "Noah's Ark of Algorithms" wherein the evolution of algorithms can be tracked with attribution in an auditable fashion.

I've had a little time to consider the implications of this, and I've come to the conclusion that they will be profound in the coming years.

I will say that the question of attributable algorithms gives me pause, but doesn't seem inconsistent with existing practice. After all, contributions to GitHub are traceable to a given login. Transparency is, in general, a good thing. It's true that the most important question is frequently "how did we get here?".
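
To make the attribution idea concrete, here is a minimal sketch in Python of what one entry in such an auditable registry might look like. The function, field names, and sample source are hypothetical illustrations of mine, not anything proposed in the lecture; the mechanism is the same one Git commits rely on: a content hash ties the entry to an exact version of the code, and the entry records who registered it and when.

```python
import hashlib
import json
from datetime import datetime, timezone

def register_algorithm(source_code: str, author: str) -> dict:
    """Build an auditable registry entry tying a specific version of an
    algorithm to the login that registered it, much like a Git commit."""
    return {
        # Hash of the exact code that will run, so later audits can verify it.
        "content_hash": hashlib.sha256(source_code.encode("utf-8")).hexdigest(),
        "author": author,
        "registered_at": datetime.now(timezone.utc).isoformat(),
    }

# Hypothetical example: register one version of a simple segmentation rule.
algorithm_source = (
    "def segment(customer):\n"
    "    return 'premium' if customer['spend'] > 1000 else 'standard'\n"
)
print(json.dumps(register_algorithm(algorithm_source, "some_login"), indent=2))
```

Any change to the code produces a different hash, so the evolution of an algorithm, and who was responsible for each step, remains traceable.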

However, the point was made that intent isn't necessary to cause harm, and that harm may arise when individuals are segmented based on the predicted risk of traits (rather than the traits themselves), which may frequently result in higher prices or even in being priced out of markets. Being driven to manipulate one's behavior to avoid being placed into a specific category, and thus being treated as "equivalent to" other members of that population, may itself be regarded as harm.

This is the sort of thing that comes under the heading of "algorithmic nuisance": the algorithms themselves may in fact be perfectly correct, but their application may have socially undesirable results. The point was also made that some algorithms carry inherent cultural or social biases, or reflect the biases contained in a data set, in such a way as to produce biased results (for example, a "beauty pageant" algorithm was shown to have a bias toward those of European ethnicities).
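
To illustrate how a "correct" algorithm can still produce socially undesirable results, here is a small, entirely invented simulation. A pricing rule that never looks at a protected trait, only at a proxy feature that happens to correlate with it, still charges the two groups systematically different prices. The population, the proxy, and the rule are all hypothetical.

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical population: a "neutral" proxy feature (think of a postal-code
# risk score) happens to be correlated with membership in a protected group.
people = []
for _ in range(10_000):
    protected = random.random() < 0.3
    proxy_risk = random.gauss(0.6 if protected else 0.4, 0.1)
    people.append({"protected": protected, "proxy_risk": proxy_risk})

# A pricing rule that is "correct" on its own terms and never looks at the
# protected trait directly...
def price(person):
    return 100 + 200 * person["proxy_risk"]

# ...still produces systematically different average prices for the two groups.
for group in (True, False):
    avg = mean(price(p) for p in people if p["protected"] == group)
    print(f"protected={group}: average price ${avg:.2f}")
```

The rule is accurate with respect to the proxy, yet in this toy setup one group ends up paying roughly 40 dollars more on average; that gap is the kind of result the "algorithmic nuisance" framing is concerned with.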

Now we're getting into some fairly uncomfortable territory for those of us who work in data management. What's being proposed is that we should hold people responsible for the results of applying algorithms. This is certainly necessary at some point, but it very clearly means that the potential for undesirable repercussions will be higher in the future than we have been accustomed to. Being "right" isn't enough; we must also be "responsible" – and most of us would agree that's true.

Based on this, I propose that you begin now to assess the effects of your work on individuals. Prepare to present in court how your market segmentation works and how your actuarial calculations work – and expect to be asked why you thought the outcome was acceptable, beyond the fact that it was accurate. Bear in mind that it's not the existence of the information, but the action taken on that information, that counts. Record not just what you plan to do, but why you plan to do it – and then ask yourself if it will sound good in court.
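
As one hypothetical example of what such a record might look like, here is a minimal Python sketch of a decision-log entry. The fields and values are mine, not a prescribed format, but they capture both the "what" and the "why", along with the assessed effect on the individuals involved.

```python
from dataclasses import dataclass, asdict
from datetime import date
import json

@dataclass
class DecisionRecord:
    """One entry in a decision log: what we plan to do with the data,
    why we plan to do it, and what we expect the effect on individuals to be."""
    action: str           # what we plan to do
    data_used: str        # which information the action draws on
    rationale: str        # why we plan to do it
    expected_effect: str  # assessed impact on the individuals involved
    approved_by: str
    decided_on: str

# Hypothetical example for a market-segmentation change.
record = DecisionRecord(
    action="Offer tiered renewal pricing by customer segment",
    data_used="Purchase history and claim frequency",
    rationale="Align premiums with observed cost to serve",
    expected_effect="No segment is priced out of coverage; reviewed quarterly",
    approved_by="Analytics review board",
    decided_on=date.today().isoformat(),
)
print(json.dumps(asdict(record), indent=2))
```

If an entry like this would read badly in court, that's a signal worth hearing before the action is taken, not after.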

One other thing: in the security context (my personal arena), this doctrine may also see individuals held responsible for the behavior of their devices once those devices have been infected with malware or captured into a botnet. After all, you should have applied your patches in a timely fashion, you should have changed the default password... While current "blacklisting" mechanisms can isolate a business, I expect the thinking here to extend into a formal legal structure in which bad network citizens have their devices fixed for them, at their own expense, much as an abandoned car gets towed.

Really, this isn't anything different than what we should have been doing all along. On the other hand, the consequences of a miss are likely to become much more expensive.

(About the author: Richard Beals is the information security practice lead within the Business Analytics and Strategy group at ICC, headquartered in Columbus, Ohio. He works primarily at the intersection of privacy, security, and data management.)
