User Trust and Engagement

A common pitfall when surfacing a machine learning score or algorithmic insight is that end users don’t understand or trust the new data points. This can lead to them ignoring the insight, no matter how accurate or useful it is.

This usually happens when ML is conducted primarily by data scientists in isolation from users and stakeholders, and can be avoided by:
  • Engaging with users from the start - understand what problem they expect the model to solve for them, and use that to frame the initial investigation and analysis.
  • Demoing and explaining your model results to users as part of your iterative model development - take them on the journey with you.
  • Focusing on explainability - this may be explainability of the model itself, where your users want to know how it arrived at its decision (e.g. surfacing the values of the most important features used to provide a recommendation), or it may be guidance on how to act on the end result (e.g. talking through how to threshold against a credit risk score).
  • Preferring concrete, domain-based values over abstract scores or data points where you can - users find these easier to act on, so feed this consideration into your choice of algorithm.
  • Giving access to model monitoring and metrics (link here) once you are in production - this helps maintain trust, because users with concerns can check on model health themselves.
  • Providing a feedback mechanism - ideally available directly alongside the model result. This allows the user to confirm good results and flag suspicious ones, and can be a great source of labelling data. Knowing that their actions have a direct impact on the model gives users trust and a sense of empowerment.
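
As an illustration of the explainability and feedback points above, here is a minimal sketch in Python. It assumes a simple additive scoring model, where each feature's contribution is its weight multiplied by its value; a real model would need its own explanation method. All names (explain_score, FeedbackLog, the feature names) and the threshold values are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ExplainedScore:
    """A score packaged with the context a user needs to act on it."""
    score: float                                   # clamped to the range 0-100
    action: str                                    # suggested next step
    top_features: List[Tuple[str, float, float]]   # (name, value, contribution)


def explain_score(
    features: Dict[str, float],
    weights: Dict[str, float],
    review_at: float = 50.0,     # illustrative threshold, agree it with users
    escalate_at: float = 80.0,   # illustrative threshold, agree it with users
    top_n: int = 3,
) -> ExplainedScore:
    # Assumption: an additive model, so contribution = weight * value.
    contributions = {n: weights.get(n, 0.0) * v for n, v in features.items()}
    score = max(0.0, min(100.0, sum(contributions.values())))

    # Surface the features that moved the score most, in either direction.
    ranked = sorted(contributions, key=lambda n: abs(contributions[n]), reverse=True)
    top = [(n, features[n], contributions[n]) for n in ranked[:top_n]]

    # Turn the abstract score into a concrete suggested action.
    if score >= escalate_at:
        action = "escalate"
    elif score >= review_at:
        action = "review"
    else:
        action = "no action"
    return ExplainedScore(score=score, action=action, top_features=top)


@dataclass
class FeedbackLog:
    """Collects user verdicts next to the result they refer to."""
    records: List[Dict[str, object]] = field(default_factory=list)

    def record(self, case_id: str, result: ExplainedScore,
               user_agrees: bool, comment: str = "") -> None:
        self.records.append({
            "case_id": case_id,
            "score": result.score,
            "action": result.action,
            "user_agrees": user_agrees,
            "comment": comment,
        })

    def labelled_examples(self) -> List[Tuple[str, bool]]:
        # User verdicts can later be reviewed and used as labelling data.
        return [(r["case_id"], r["user_agrees"]) for r in self.records]


if __name__ == "__main__":
    result = explain_score(
        features={"claims_last_30_days": 4, "account_age_days": 10, "amount_gbp": 1200},
        weights={"claims_last_30_days": 15.0, "account_age_days": -0.5, "amount_gbp": 0.02},
    )
    print(result.score, result.action)
    for name, value, contribution in result.top_features:
        print(f"{name}: value={value}, contribution={contribution}")

    log = FeedbackLog()
    log.record("case-1", result, user_agrees=True)
    print(log.labelled_examples())
```

Returning the feature values and a suggested action together with the score means the user never sees a bare number, and recording feedback against the same result keeps the verdict tied to exactly what the user was shown.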

Experience report

We had a project tasked with using machine learning to find fraudulent repayment claims, which were being investigated manually inside an application used by case workers. The data science team initially understood the problem to be one of helping the case workers know which claims were fraudulent, and in isolation developed a model that surfaced a score from 0 to 100 for the overall likelihood of fraud.
The users didn’t engage with this score because they weren’t clear about how it was derived, and they still had to carry out the investigation to confirm the fraud. It was seldom used.
A second iteration was developed that provided a score on the bank account involved in the repayment instead of an overall indicator. This had much higher user engagement because it gave users a jumping-off point for investigation and action.
Users were engaged throughout development of the second iteration, and encouraged to bring it into their own analytical dashboards instead of having it forced into the case working application. Additionally, whenever a bank account score was surfaced, it was accompanied by the values of all features used to derive it. The users found this data just as useful as the score itself for their investigations.
Shital Desai, Product owner
Equal Experts, UK