Machine Learning Model Deployment Lessons

Model deployment

Building a highly accurate machine learning model is only the first step in the complex process of putting a model into production. Deploying a machine learning model can be highly challenging, and it becomes even harder when you are dealing with credit risk models.

It’s extremely interesting to learn from the vast experience of others in this field and to take their insights into account when deciding on your approach to model deployment. In this blog post we’ll go through some of the lessons from a paper published at the last KDD conference by Pablo Estevez, Themistoklis Mavridis and Lucas Bernardi. References to this paper appear in numerous sources; one that resonated nicely with what we do is the write-up on Adrian Colyer’s blog.

150 Successful Machine Learning Models

The paper focuses on the experience of deploying machine learning models at Booking.com. Based on their experience with real-life machine learning models, the authors draw insights and conclusions that can help others achieve good results when developing and deploying their own models. We found this paper and blog post highly relevant because the challenges that the Booking.com team encountered are very common to other teams, as we see daily with the customers we work with. It doesn’t even have to be in the same industry. Some challenges in machine learning model development are so fundamental that they are prevalent in almost any use case.

In addition, experience gained across many model development and deployment projects gives much more substance to a team’s expectations of any technology or platform it may be using.

Managing a portfolio of models in production

One of the first things you encounter in the paper is the fact that they are managing 150 models in production at the same time. Of course, not all models handle the same volume of predictions, but the sheer number of concurrent models to manage is impressive. With our customers we’ve found that the number of active models in the credit risk area ranges from around 10 to a little over 100. Unlike in most other industries, managing models in the financial world requires substantial collateral work for each model:

Audit trail of model development
Approvals by different functions
Up-to-date documentation
Monitoring and model usage history, and so on.

Managing this number of active models, together with their collateral, calls for a holistic approach that covers the entire model life cycle in one place. This is exactly where an end-to-end solution shines. The ability of users to make a small incremental change in their modelling project and immediately assess its impact on the various facets of the model allows for greater agility in the entire process. Some examples of these incremental changes:

Checking the value of a new data source to a production model
Removing a field from the model and measuring the impact
Applying a new function on a feature (or set of features)
Changing an algorithm
Tweaking hyperparameters
Testing the model with an ad-hoc data set

Business impact from machine learning models

The paper presents strong evidence that the overall business impact of the machine learning models is significant. This serves as further evidence that embracing a strong machine learning strategy can benefit the bottom line. We’ve discussed this earlier in this blog, and we already see a strong transition in the financial world, and specifically in credit risk, toward fully ML-based models or, in some cases, a hybrid approach.

When measuring the impact of a model to assess its ROI, we often see customers measuring only the final predictive power of the model. While this is a crucial part of the ROI math, it’s important to keep in mind the entire ecosystem that a model lives in. This includes model development, data pipeline maintenance, performance, team collaboration, deployment, monitoring, documentation… the list goes on and on. We believe that by taking a holistic approach to the entire life cycle of a model, the ROI of the entire modelling approach will reach new heights. This approach ensures that your modelling experts spend their time on the modelling itself and less on the surrounding infrastructure that needs to be in place. That is: “More risk modelling, less software engineering”.

Monitor model performance from the very beginning

The paper discusses the importance of monitoring model performance early. It mentions a few challenges that exist in their industry and also exist in the financial world, specifically the latency of getting feedback on a model prediction. This is the time that passes between the moment the production model provides a prediction and the moment the predicted event happens. In our case it would be the time between a loan being granted based on our prediction and the final outcome of this loan (defaulted or not). In our solution we put great emphasis on closely monitoring the average prediction, skews in the data (compared to training) and the overall usage of the model in production (through the API, of course…). We provide our customers with a special API that allows them to report feedback data (the actual outcome of a predicted event) into the system, so that the actual Gini (or AUC) of the model is monitored in production.
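To make the feedback idea concrete, here is a minimal Python sketch of how a Gini (or AUC) figure could be computed once outcomes start arriving. It is not our API or the paper’s method; the record layout, the field order and the function names (`matured`, `auc`, `gini`) are all assumptions made up for illustration. It relies on the standard relationship Gini = 2 × AUC − 1 and ignores loans whose outcome has not been reported yet.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical feedback record: (loan_id, predicted default score, outcome).
# outcome is 1 (defaulted), 0 (repaid) or None (not known yet).
Record = Tuple[str, float, Optional[int]]


def matured(records: Sequence[Record]) -> List[Record]:
    """Keep only predictions whose real outcome has already been reported."""
    return [r for r in records if r[2] is not None]


def auc(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """Chance that a random defaulted loan got a higher score than a random
    repaid loan; ties count as one half. Pairwise and O(n*m), fine for a sketch."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one defaulted and one repaid loan")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def gini(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """Gini coefficient derived from AUC."""
    return 2.0 * auc(scores, outcomes) - 1.0


# Made-up example data: loan F has no outcome yet (feedback latency).
feedback: List[Record] = [
    ("A", 0.90, 1),
    ("B", 0.20, 0),
    ("C", 0.70, 0),
    ("D", 0.60, 1),
    ("E", 0.10, 0),
    ("F", 0.40, None),
]

known = matured(feedback)
known_scores = [score for _, score, _ in known]
known_outcomes = [outcome for _, _, outcome in known]

production_auc = auc(known_scores, known_outcomes)
production_gini = gini(known_scores, known_outcomes)
# The average prediction can be tracked on all records, matured or not.
mean_prediction = sum(score for _, score, _ in feedback) / len(feedback)
```

Note the split in the last lines: the average prediction is available immediately for every scored loan, while Gini and AUC can only be computed on the subset of loans whose outcome is known, which is exactly why feedback latency matters.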

Conclusion

Real-world papers like the one discussed here are a great source of hard-earned, real-life insights on managing a large number of models in production. We found many of the points in this paper to be closely aligned with our product strategy. We invite you to try our platform and decide for yourself if our approach is right for your use case.