4 operating models to consider when setting up a data science team
The maxim ‘technology drives business’ applies well to how many enterprises are evaluating data science for adoption. Business and IT teams are working to determine the value that machine learning can deliver across their business scenarios. It often starts with proofs of concept that automate manual processes, such as document classification, or that improve accuracy, such as identifying the right target customers in campaign management.
Data science solutions evolve as the team comes to understand the data and its patterns. During solution development, the insights gathered are discussed with the business and the models are updated in quick turnaround cycles. This requires a deeper understanding of business goals.
We now also have citizen data scientists, who can get value from data science without in-depth algorithm skills, enabled by AutoML platforms such as H2O and DataRobot. But data is critical to the success of a data science project, and data provisioning depends on the IT team. For these reasons, data science projects cannot be treated as part of a regular project life cycle.
A well-defined process is required to bring business and IT together. This will ensure that data science spreads across multiple business units and becomes a mainstream technology.
Five key aspects must be in place to build and run data science solutions:
1. Use case definition & validation: Clearly defining the problem statement, mapping the data needed and choosing methods for validating the results.
2. Pattern mining & model building: Performing data exploration, generating insights, identifying patterns and building models, such as predictive models.
3. Model management: Defining the process for deploying models and monitoring results.
4. Data management: Provisioning the required data from the business applications and making it accessible for exploration.
5. Standards, processes and tools: Supporting products and toolsets, defining best practices and setting up processes for measuring the business value of each use case.
How these five key aspects are set up and managed determines the operating model. We can broadly classify the options into four working models:
Model 1: IT driven
In this model the business is responsible for use case definition and for validating the outputs of the project. The IT team is responsible for getting the right data, model building, model management and standards. IT services are shared, and each business unit can have its own business subject matter expert (SME) to define the use case.
Model 2: Business driven
In this model the business is responsible for use case definition as well as model building. Data provisioning and model management are delivered by IT. Standards and processes can be defined by the individual business units, while certain common standards can be defined by IT.
Model 3: Data Science Hub & Spoke
In this model a separate, common data science unit is set up with the core responsibility of managing models and standards. The business focusses on use cases and model building. IT is responsible for data provisioning.
Model 4: Data Science Central
This is a centralized model in which all model building, model management and standards are handled by a common, separate data science unit. The business is responsible for use case definition and IT is responsible for data provisioning. This is the most optimized of the four models: best practices are leveraged across the organization, and a well-defined, focussed unit drives growth in data science adoption.
Figure 1: Mapping of the operating model with the responsibilities
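The same mapping can be written down as a small lookup table, which makes it easy to compare who owns what under each model. This is only a sketch built from the four model descriptions above; the constant names, the aspect keys and the helper `models_where` are illustrative choices, not part of any tool or standard.

```python
# Owners named in the model descriptions (constant names are illustrative).
BUSINESS = "Business"
IT = "IT"
DS_UNIT = "Data science unit"

# The five key aspects, as hypothetical dictionary keys.
ASPECTS = (
    "use_case_definition_validation",
    "model_building",
    "model_management",
    "data_management",
    "standards_processes_tools",
)

# Each aspect maps to a tuple of owners, because Model 2 splits
# standards between the business units and IT.
OPERATING_MODELS = {
    "IT driven": {
        "use_case_definition_validation": (BUSINESS,),
        "model_building": (IT,),
        "model_management": (IT,),
        "data_management": (IT,),
        "standards_processes_tools": (IT,),
    },
    "Business driven": {
        "use_case_definition_validation": (BUSINESS,),
        "model_building": (BUSINESS,),
        "model_management": (IT,),
        "data_management": (IT,),
        "standards_processes_tools": (BUSINESS, IT),
    },
    "Data Science Hub & Spoke": {
        "use_case_definition_validation": (BUSINESS,),
        "model_building": (BUSINESS,),
        "model_management": (DS_UNIT,),
        "data_management": (IT,),
        "standards_processes_tools": (DS_UNIT,),
    },
    "Data Science Central": {
        "use_case_definition_validation": (BUSINESS,),
        "model_building": (DS_UNIT,),
        "model_management": (DS_UNIT,),
        "data_management": (IT,),
        "standards_processes_tools": (DS_UNIT,),
    },
}


def models_where(aspect, owner):
    """List the operating models in which `owner` is responsible for `aspect`."""
    return [
        name
        for name, mapping in OPERATING_MODELS.items()
        if owner in mapping[aspect]
    ]
```

For example, `models_where("model_building", BUSINESS)` returns the Business driven and Hub & Spoke models, the two in which the business builds its own models.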
The following questions can help in deciding on an operating model:
- Who owns the data science projects? (Business or IT)
- Do you want a shared services data science team managed by IT? (Yes or No)
- Do you want to set up a separate data science unit? (Yes or No)
- Do you want flexibility for the business to build their data science skillset? (Yes or No)
- Do you have global operations with local data variations? (Yes or No)
- Do you have a well-defined data environment setup? (Yes or No)
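Two of these questions line up directly with the model descriptions: whether a separate data science unit is set up, and whether the business builds models itself. The sketch below turns just those two answers into a first suggestion. The function name and the decision rule are assumptions read off the four descriptions, not a prescribed method; the remaining questions (ownership, global operations, data environment) call for judgment and are deliberately left out.

```python
def suggest_operating_model(separate_unit: bool, business_builds_models: bool) -> str:
    """Suggest a starting-point operating model from two yes/no answers.

    Assumed rule, derived from the model descriptions:
    - separate_unit: is a separate data science unit being set up?
    - business_builds_models: should the business build its own models
      (and so its own data science skillset)?
    """
    if separate_unit:
        if business_builds_models:
            # A common unit manages models and standards; the business builds.
            return "Model 3: Data Science Hub & Spoke"
        # A common unit builds and manages everything.
        return "Model 4: Data Science Central"
    if business_builds_models:
        # No separate unit; the business builds, IT provisions and manages.
        return "Model 2: Business driven"
    # No separate unit; IT runs a shared service.
    return "Model 1: IT driven"
```

The result is only a starting point for discussion; an organization with global operations or an immature data environment may still land on a different model.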
The objective is to consider all five key aspects of a data science solution when choosing an operating model, and to ensure that the model chosen delivers value to the business at an optimal cost.