{"id":1702,"date":"2019-11-12T22:42:34","date_gmt":"2019-11-12T20:42:34","guid":{"rendered":"http:\/\/beeeye.com\/?p=1702"},"modified":"2019-11-12T22:42:34","modified_gmt":"2019-11-12T20:42:34","slug":"machine-learning-model-deployment-lessons","status":"publish","type":"post","link":"http:\/\/beeeye.com\/machine-learning-model-deployment-lessons\/","title":{"rendered":"Machine Learning Model Deployment Lessons"},"content":{"rendered":"
Building a highly accurate machine learning model is just the first part of the complex process of putting a model into a production system. Model deployment, and specifically deploying a machine learning model, can be a highly challenging task, and it becomes even harder when you are dealing with credit risk models.<\/span><\/p>\n It\u2019s valuable to learn from the experience of others in this field and to take their insights into account when deciding on your own approach to model deployment. In this blog post we\u2019ll go through some of the lessons published at the last <\/span>KDD<\/span><\/a> conference in a paper by Pablo Estevez, Themistoklis Mavridis and Lucas Bernardi. References to this paper appear in numerous sources. One that resonated nicely with what we do is <\/span>here<\/span><\/a> in the <\/span>blog of Adrian Colyer<\/span><\/a>.<\/span><\/p>\n The paper focuses on the experience of deploying machine learning models at <\/span>booking.com<\/span><\/a>. Based on their experience deploying real-life machine learning models, the authors draw insights and conclusions that can help others achieve good results when developing and deploying their own models. We found this paper and blog post highly relevant because the challenges that the team at booking.com encountered are very common among other teams, as we see daily with the customers we work with. It doesn\u2019t even have to be in the same industry. Some challenges in machine learning model development are so fundamental that they are prevalent in almost any use case.<\/span><\/p>\n In addition, experience gained across many model development and deployment projects gives much more substance to a team\u2019s expectations of any technology or platform it may be using.<\/span><\/p>\n One of the first things you encounter in the paper is the fact that they manage 150 models in production at the same time. 
Of course, not all models handle the same volume of predictions; however, the sheer number of concurrent models to manage is impressive. With our customers we\u2019ve found that the number of active models in the credit risk area can range anywhere from around 10 to a little over 100. Unlike in other industries, managing models in the financial world requires substantial collateral work that comes with each model:<\/span><\/p>\n Thinking about managing this number of active models, together with their collateral, calls for a holistic approach that encompasses the entire model life cycle in one place. This is exactly where an <\/span>end to end solution<\/span><\/a> shines. The ability of the user to make a small incremental change in their modelling project and immediately assess its impact on the various facets of the model allows for enhanced agility in the entire process. Some examples of these incremental changes:<\/span><\/p>\n The paper presents strong evidence that the overall business impact of the machine learning models is substantial. This serves as further evidence that embracing a strong machine learning strategy can lead to benefits on the bottom line. We\u2019ve <\/span>discussed this<\/span><\/a> earlier in this blog, and we already see a strong transition in the financial world, and specifically in credit risk, toward fully ML-based models or, in some cases, a hybrid approach.<\/span><\/p>\n When measuring the impact of a model to assess its ROI, we often see customers measuring only the final predictive power of the model. While this is a crucial part of the ROI math, it\u2019s important to keep in mind the entire ecosystem that a model lives in. This includes model development, data pipeline maintenance, performance, team collaboration, deployment, monitoring, documentation\u2026 the list goes on and on. 
We believe that taking a <\/span>holistic approach<\/span><\/a> toward the entire life cycle of a model improves the ROI of the entire modelling effort. This approach ensures that your modelling experts spend their time on the modelling itself and less on the surrounding infrastructure that needs to be in place. That is: \u201cMore risk modelling, less software engineering\u201d.<\/span><\/p>\n The paper discusses the importance of early monitoring of model performance. It mentions a few challenges in their industry that also exist in the financial world, specifically the latency of getting feedback on a model prediction. This is the time that passes between the moment the production model provides a prediction and the moment the predicted event happens. In our case it would be the time between a loan being granted based on our prediction and the final outcome of this loan (defaulted or not). In our solution we put great emphasis on closely monitoring the average prediction, skews in the data (compared to training) and overall usage of the model in production (through the API of course \u2026). We provide our customers with a dedicated API that allows them to report feedback data (the actual outcome of a predicted event) into the system so that the actual GINI (or AUC) of the model is monitored in production.<\/span><\/p>\n150 Successful Machine Learning Models<\/span><\/h3>\n
Managing a portfolio of models in production<\/span><\/h3>\n
\n
\n
Business impact from machine learning models<\/span><\/h3>\n
Monitor model performance from the very beginning<\/span><\/h3>\n
Conclusion<\/span><\/h3>\n