Requirements for Advanced Analytics
Gathering requirements for advanced analytics, data mining and predictive analytics is an interesting topic. It is interesting not because interviewing techniques are different for advanced analytics, nor because the techniques for helping technical teams elicit real requirements from businesspeople are different. The “how” of requirements is the same for advanced analytics as it is for anything else. It is interesting because what you need to ask about – the “what” – is unique.
Advanced analytics are valuable for one reason and one reason only – because they allow you to improve the quality of decision-making in your organization. As I noted in a previous column (“Where to Begin with Predictive Analytics”), the best kinds of decisions to improve with predictive analytics are operational decisions – decisions about a single customer or transaction.
The requirements challenge for most organizations is that they don’t really know what these decisions are. They have never attempted to document these decisions with any degree of precision, if at all. They don’t understand how these decisions fit within a broader understanding of their businesses. Yet, to be successful with predictive analytics, the gathering of requirements in terms of these decisions is critical.
First, it is worth considering the kinds of decisions we are discussing. These kinds of decisions can be categorized into a decision taxonomy. Such a taxonomy includes a number of types of decisions including risk decisions (how risky is this loan and how should we price it), fraud decisions (how likely is this claim to be fraudulent and what should we do about it) and opportunity decisions (how can we maximize the value of this customer interaction).
Second, we must consider the documentation of decisions. What, exactly, do we need to know about these decisions if we are to build the right analytics to improve them? Experience suggests that we must document a question and answer pair for each decision. Understanding what question needs to be answered to make the decision, specifying this explicitly and documenting what possible answers exist for the question is critical. Decisions may involve selecting the best or most suitable action from a set of possible actions, such as determining the best offer from a set of available offers, calculating the right interest rate within the range of allowed interest rate values or deciding how best to handle a claim. To manage decisions effectively we must understand the options from which we are selecting.
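A question and answer pair of this kind can be written down very simply. The following Python sketch is illustrative only: the class name `DecisionSpec`, its fields and the offer names are my own assumptions for the example, not a standard notation.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionSpec:
    """One documented decision: the question asked and the answers allowed."""

    name: str
    question: str
    allowed_answers: tuple[str, ...]

    def check(self, answer: str) -> str:
        """Return the answer unchanged, or raise if it is not a documented option."""
        if answer not in self.allowed_answers:
            raise ValueError(
                f"{answer!r} is not a documented answer for {self.name!r}"
            )
        return answer


# Hypothetical example: the offer names are made up for illustration.
best_offer = DecisionSpec(
    name="best offer",
    question="Which offer should we make to this customer in this interaction?",
    allowed_answers=("retention discount", "upgrade", "no offer"),
)

chosen = best_offer.check("upgrade")
```

Listing the allowed answers explicitly suits decisions that select from a fixed set of actions. A decision such as calculating an interest rate within an allowed range would need the answer described as a range rather than a list.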
To be able to effectively prioritize and manage these decisions we need to link them to the business. We should understand the organizations that decide how they should be made as well as the organizations that make them or are impacted by how they are made. We can model the business objectives and key metrics that they affect so we can tell good decisions from poor ones. The processes that need the decisions and systems that implement them will also matter as we develop analytic solutions. This business context helps direct analytic effort and constrains it to solutions we will actually be able to use.
For each of these decisions it is also worth documenting a number of characteristics, including volume, timeliness, consistency over time, the difference in value between a good decision and a bad one, and the time it takes to see the value of a specific decision.
These top level decisions, however, are rarely precise enough to effectively target data mining and predictive analytics efforts. To effectively apply advanced analytics one must decompose these decisions into more granular, more closely scoped decisions. In general, a decision can be thought of as being dependent on a set of other, smaller precursor decisions. The decision as to how to process a claim, for instance, is dependent on the decisions of whether the claim is complete, valid, potentially fraudulent, subject to subrogation with other insurance companies and much more. The dependency of a decision on other decisions gives more clarity to the decision-making involved.
Decisions are not only dependent on other decisions, however. Decisions are dependent on the availability of both internal and external information sources. They are also dependent on know-how – everything from expert knowledge to regulations and policies to analytic insight in the form of data mining results and predictive analytic models. Decomposing decision-making in this way as part of our requirements process makes it clear how decisions are made and what is required in terms of analytic insight and data. Only then can more advanced analytics be specified accurately and developed and deployed effectively.
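To make the idea of decomposition concrete, here is a minimal Python sketch of a decision dependency network for the claims example. The decisions come from the example above; the class `DecisionNode`, the function `collect_requirements` and every information source and know-how item listed are hypothetical placeholders chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DecisionNode:
    """A decision plus what it depends on: other decisions, information, know-how."""

    name: str
    depends_on: list[DecisionNode] = field(default_factory=list)
    information: set[str] = field(default_factory=set)
    know_how: set[str] = field(default_factory=set)


def collect_requirements(root: DecisionNode) -> tuple[list[str], set[str], set[str]]:
    """Walk the network depth-first, listing precursor decisions before the
    decisions that depend on them, and gather all information and know-how."""
    decisions: list[str] = []
    information: set[str] = set()
    know_how: set[str] = set()
    seen: set[str] = set()

    def visit(node: DecisionNode) -> None:
        if node.name in seen:  # a shared precursor is only recorded once
            return
        seen.add(node.name)
        for precursor in node.depends_on:
            visit(precursor)
        decisions.append(node.name)
        information.update(node.information)
        know_how.update(node.know_how)

    visit(root)
    return decisions, information, know_how


# Hypothetical inputs: the sources and know-how named here are assumptions.
complete = DecisionNode("Is the claim complete?", information={"claim form"},
                        know_how={"completeness policy"})
valid = DecisionNode("Is the claim valid?", information={"claim form", "policy record"},
                     know_how={"coverage rules"})
fraud = DecisionNode("Is the claim potentially fraudulent?",
                     information={"claim history"},
                     know_how={"fraud propensity model"})
subrogation = DecisionNode("Is the claim subject to subrogation?",
                           information={"third-party insurer data"},
                           know_how={"subrogation regulations"})
process_claim = DecisionNode("How should this claim be processed?",
                             depends_on=[complete, valid, fraud, subrogation])

decisions, information, know_how = collect_requirements(process_claim)
```

Walking the network from the top-level decision yields the full list of precursor decisions, the data each one needs and the places where analytic insight, such as the assumed fraud model here, would be applied.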
I have worked with a number of teams developing predictive analytics, and it is clear that developing these kinds of models of decision-making dramatically improves the “requirements” piece of an advanced analytic project. Replacing a vague “define the business problem” with a more precise “define the decision dependency network and show where in that hierarchy the analytic insight will be applied” makes for more successful advanced analytics projects.