Don’t treat accuracy as the only or even the best way to evaluate your algorithm

We have spoken a lot about performance in this playbook but have deliberately shied away from specifying how it is calculated. How well your algorithm is working is context-dependent, and working out the best way to evaluate it is part of the ML process. What we do know is that in most cases a simple accuracy measure (the percentage of correct classifications) is not the right one. You will of course collect technical metrics such as precision (what proportion of the positive classifications are actually correct) and recall (what proportion of the truly positive examples we identified), or more complex measures such as F scores and area under the curve. But on their own these are usually not enough to gain user buy-in or to define a successful algorithm (see Business Impact is more than just Accuracy - Understand your baseline for an in-depth discussion of this).
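
To make the point about accuracy concrete, here is a minimal sketch in plain Python. The data is made up for illustration (5 positive examples out of 100), and the function names `confusion_counts` and `evaluate` are hypothetical helpers, not part of any library. It compares a "model" that always predicts the negative class with one that finds most of the positives at the cost of some false alarms.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 as a dict.

    Ratios with a zero denominator are reported as 0.0 (a convention
    chosen for this sketch, not a universal rule).
    """
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative, made-up labels: 5 positives followed by 95 negatives.
y_true = [1] * 5 + [0] * 95

# Baseline that never predicts the positive class.
always_negative = [0] * 100

# Hypothetical model: finds 4 of the 5 positives, raises 6 false alarms.
some_model = [1, 1, 1, 1, 0] + [1] * 6 + [0] * 89

baseline = evaluate(y_true, always_negative)
candidate = evaluate(y_true, some_model)

for name, scores in [("always negative", baseline), ("candidate", candidate)]:
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

On this toy data the always-negative baseline scores higher on accuracy (0.95 against 0.93) even though it never identifies a single positive example, while the candidate has a recall of 0.8 and a precision of 0.4. Which of those trade-offs is acceptable depends on the context, which is why none of these numbers settles the question by itself.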