How often do you deploy a model?
Establishing a good model for your data once is hard enough, but in practice, you will need to retrain and deploy updates to your model – probably regularly! These are necessary because:
- the data used to train your model changes in nature over time
- you discover better models as part of your development process, or
- you need to adapt your ML models to changing regulatory requirements.
Two useful phrases help to describe the way data changes:
Data drift - describes the way incoming data changes over time (e.g. the structure of the data gains new fields, or values move outside the range you originally trained against), perhaps because new products have been added or upstream systems stop populating a specific field.
Concept drift - means that the statistical nature of the target variables being predicted might change over time. For example, an ML-enhanced search service may need to return very different results for “chocolates” at Valentine's Day versus Easter, or a system may need to recognise that users’ fashions and tastes change over time, so the best items to return won’t be the same from season to season. Processes that involve human nature are likely to result in concept drift.
Measure drift over time to understand when a model’s accuracy is no longer good enough and needs to be retrained.
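One way to put a number on data drift for a single numeric feature is the Population Stability Index (PSI), which compares how a training-time sample and a recent production sample are distributed across the same bins. The sketch below is a minimal illustration using only the Python standard library; the function names, the ten equal-width bins and the 0.2 alert threshold are assumptions chosen for the example, not values prescribed here, and should be tuned for each model.

```python
import math
from typing import List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between a training sample and a recent sample."""
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(expected), max(expected)
    if hi == lo:
        raise ValueError("training sample has no spread; cannot build bins")
    width = (hi - lo) / bins

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Values outside the training range are clamped into the edge bins.
            idx = max(0, min(bins - 1, int((v - lo) / width)))
            counts[idx] += 1
        # A small floor avoids log(0) and division by zero for empty bins.
        eps = 1e-6
        return [max(c / len(values), eps) for c in counts]

    p = proportions(expected)
    q = proportions(actual)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


def needs_retraining(expected: Sequence[float], actual: Sequence[float],
                     threshold: float = 0.2) -> bool:
    """Flag a feature whose PSI exceeds the (assumed) alert threshold."""
    return psi(expected, actual) > threshold


if __name__ == "__main__":
    training_sample = [i / 100 for i in range(1000)]      # hypothetical feature values
    recent_sample = [v + 5 for v in training_sample]      # same shape, shifted upwards
    print("PSI:", psi(training_sample, recent_sample))
    print("Retrain?", needs_retraining(training_sample, recent_sample))
```

A check like this only covers shifts in input data. Concept drift usually needs a second signal, such as tracking the model's accuracy against labels as they arrive, since the inputs can look stable while the relationship to the target changes.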
It’s also good practice to regularly redeploy your model, even if you haven’t improved it or noticed changes in data characteristics! This allows you to make use of new framework versions and hardware, to address security vulnerabilities through updates to components, and to be sure that when you need to deploy a fresh model, you know that your deployment processes work.
In one engagement with a client who was a leader in the travel industry, we had used data from the past five years to build a prediction model of repurchase behaviour. The model had good accuracy and was running well in production. Travel behaviour changed suddenly and drastically from March 2020, when the whole world reacted to the rapid spread of SARS-CoV-2 by closing borders. The data that the model had been trained on contained no pattern resembling what was happening in real life. We realised that continuing to use the model output would not be useful.
The team changed the model to factor in the effect of the border closures on the predictions. We also incorporated a signal analyser into the model, which constantly monitored incoming data for a return to normalcy. It was designed to identify data patterns that matched the pre-Covid historical data, so that the model could switch off its dependency on specific Covid-related external data when conditions returned to normal.
Uttam Kini, Principal Consultant
Equal Experts, India