Four people your data team needs to win the model deployment relay
What can you learn about your toughest model management problems from a top track and field team? A lot, actually. You see, the solution to both – running a successful relay and quickly deploying models – is having a solid operational process.
A relay team develops a process for literally every step that happens in the transition zone (the 20-meter area where the baton handoff occurs for each stage of the race). The incoming runner knows precisely how many steps to take before extending the baton. The next runner knows when to begin accelerating and how many steps to take before blindly reaching back for the baton. It’s science and math displayed in human form.
Analytical model production efforts are subject to similar failure points in the transition areas. The result is that not many models make it into production. Some get disqualified at the start, but many fall victim to poor handoffs.
Each day that an analytical model sits on the shelf is a day that costs your organization money – either in the cost of developing an unused model or in the missed opportunities that the yet-to-be-implemented model might have yielded.
Deploying accurate and effective models is challenging. While many analytical models are developed, the numbers show that fewer than 50% of the best models make it into production. That's precious few models for all of your effort and expense.
It takes an overall strategy and close collaboration to develop a successful (read: profitable) analytics program. It's like a relay race. Each person on the team has a specific role. The team members need a clear track and a clean, efficient handoff from the previous stage to the next.
While model development is tough, model operations, or ModelOps, is harder, and so far it has been given less attention by most organizations. The focus for most organizations has been on model development, less on model deployment, and even less on ongoing model maintenance. ModelOps follows the same principles as DevOps. It is focused on operationalizing analytics, and it gives organizations the ability to develop, deploy and manage models at scale.
The right way to manage your models
To be effective at model management, you need a strong team. At each stage, you need the best-suited person with the right skill set and the right tools to do the job. The good news is that you don’t need a lot of people to accomplish this. Just like a relay race, the right four people can manage the complete model life cycle. Here is the lineup:
- An analytics lead to organize and manage analytics and model development.
Their primary focus is to coordinate activities at every stage of the analytics life cycle. The analytics lead determines the critical business questions that must be answered. Top-performing analytics leads draw on the organization's deep functional expertise, strategic partnerships and core competencies for organizing analytics talent. These leaders have a network of internal partners who give their team access to data and technology and foster collaborative development of analytics capabilities, as well as the breadth and depth of talent required for a robust program of advanced analytics.
- A validation officer to test and validate models.
Analytical models must undergo rigorous testing and intense scrutiny. It's important to have a validation officer on your team to inspect and verify the results of model testing to help ensure the models can answer the business question being asked. All data scientists have been in a situation where they believe a machine learning model will do a great job of predicting actions or behaviors, but once it’s in production, it doesn’t perform as well as expected.
Given the time and expense of getting a model into production, this is a big concern. In the worst case, a model that performs unexpectedly poorly can cost millions of dollars in lost sales or in a damaged reputation. So, was the predictive model wrong in those cases? Possibly. But often it is not the model that’s wrong, but how the model was validated. Weak validation delivers over-optimistic expectations of what will happen in production. That's why you need a validation officer with analytical skills and strong business acumen.
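To see how weak validation produces over-optimistic expectations, consider a minimal sketch in Python. Everything in it is an assumption made for illustration: the synthetic data, the 30% label noise and the simple nearest-neighbor model are hypothetical stand-ins, not a recommended validation method. The point is only that a model scored on the same data it was trained on looks far better than it does on data it has never seen.

```python
import random


def nearest_neighbor_predict(train, x):
    """Predict the label of the training point closest to x (1-nearest-neighbor)."""
    closest = min(train, key=lambda row: abs(row[0] - x))
    return closest[1]


def accuracy(train, rows):
    """Share of rows whose label matches the nearest-neighbor prediction."""
    hits = sum(nearest_neighbor_predict(train, x) == y for x, y in rows)
    return hits / len(rows)


def make_rows(rng, n, noise=0.3):
    """Hypothetical data: label is 1 when x > 0.5, flipped with probability `noise`."""
    rows = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < noise:
            label = 1 - label
        rows.append((x, label))
    return rows


rng = random.Random(0)
train_rows = make_rows(rng, 200)
holdout_rows = make_rows(rng, 200)  # data the model never saw during training

# Weak validation: score the model on the data it memorized.
train_accuracy = accuracy(train_rows, train_rows)
# Stronger validation: score the model on held-out data.
holdout_accuracy = accuracy(train_rows, holdout_rows)

print(f"accuracy on training data: {train_accuracy:.2f}")
print(f"accuracy on holdout data:  {holdout_accuracy:.2f}")
```

The gap between the two numbers is the over-optimism that a validation officer is there to catch before the model reaches production.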
- An application developer (and database experts) for model publishing and scoring.
At this stage of the model management cycle, your organization will have chosen a champion model, produced the score code and pushed that package to the necessary DevOps systems. The application developer and the database experts are your DevOps team. They choose the best deployment option, such as a REST API, batch jobs or in-database scoring. This is where new data is passed through the model to produce scores.
- A model steward to monitor model performance.
The model steward monitors model performance in a variety of ways. The most important thing a model steward watches for is drift. Drift means that the data the model was trained on is no longer relevant or useful to the problem at hand. Because data is always changing, drift occurs naturally.
What is key is to ensure that it doesn’t drift too far. Model stewards are there to ensure that the model inputs look similar to those that were used in training the model. On the operational side, it’s important for a model steward to watch resource consumption, including CPU, memory, disk, and network I/O. These are signals of how efficiently the model is running on your platform.
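One common way to check whether model inputs still look like the training data is the population stability index (PSI), which compares how values are spread across bins in the two samples. The sketch below is an illustration under stated assumptions, not a prescribed method: the function name, the ten equal-width bins, the synthetic data and the 0.2 alert threshold (a widely used rule of thumb) are all choices made here for the example.

```python
import math
import random


def population_stability_index(expected, actual, bins=10):
    """PSI between training-time values (expected) and recent values (actual).

    Bins are equal-width over the range of the training data. Recent values
    outside that range are counted in the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # A small floor keeps empty bins from causing log(0) or division by zero.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


rng = random.Random(42)
training_inputs = [rng.gauss(0.0, 1.0) for _ in range(1000)]
similar_inputs = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # same distribution
shifted_inputs = [rng.gauss(1.0, 1.0) for _ in range(1000)]  # mean has moved

psi_similar = population_stability_index(training_inputs, similar_inputs)
psi_shifted = population_stability_index(training_inputs, shifted_inputs)

ALERT_THRESHOLD = 0.2  # assumed rule-of-thumb cutoff, tune for your own models
for name, psi in [("similar", psi_similar), ("shifted", psi_shifted)]:
    status = "investigate drift" if psi > ALERT_THRESHOLD else "ok"
    print(f"{name}: PSI={psi:.3f} -> {status}")
```

A model steward might run a check like this on each important input on a schedule and treat a rising PSI as a cue to look at retraining.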
Other key performance indicators on the operational side are latency and throughput. Latency is the delay before a transfer of data begins following an instruction for its transfer, while throughput is the amount of data successfully moved from one place to another in a given time period. A good model steward will have a firm grasp of the analytics behind the models and an awareness of your organization's IT capabilities and resources. That combination lets the steward determine when it’s time to retrain or retire a model.
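Applied to model scoring, latency can be read as the time one scoring call takes and throughput as the number of rows scored per second. The sketch below shows one way to measure both, under stated assumptions: `score` is a hypothetical stand-in for a deployed model's scoring call, and `measure` is a helper invented for this example.

```python
import time


def score(row):
    """Hypothetical stand-in for a deployed model's scoring call."""
    return 0.3 * row[0] + 0.7 * row[1]


def measure(score_fn, rows):
    """Time each scoring call and summarize latency and throughput."""
    latencies = []
    start = time.perf_counter()
    for row in rows:
        t0 = time.perf_counter()
        score_fn(row)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start

    latencies.sort()
    p95 = latencies[int(0.95 * (len(latencies) - 1))]  # approximate 95th percentile
    return {
        "mean_latency_s": sum(latencies) / len(latencies),
        "p95_latency_s": p95,
        "throughput_per_s": len(rows) / elapsed if elapsed > 0 else float("inf"),
    }


rows = [(i * 0.01, i * 0.02) for i in range(10_000)]
stats = measure(score, rows)
for name, value in stats.items():
    print(f"{name}: {value:.6g}")
```

Tracking a high percentile alongside the mean is a deliberate choice here: a handful of slow calls can hurt users even when the average looks healthy.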
What’s the payoff?
Why is this focus on model management relevant to your organization? Here are some sobering numbers. Fewer than 50% of the best models are ever deployed. Of those remaining models, only 10% are deployed in under three months, and 40% take more than seven months to deploy.
Want to improve those odds? The first step is getting your ModelOps team and process in place. Then you will want to conduct a model health check and consider model management software tools.