2024 IAIO Question 1.2

Consider a spam email detection system in which only 1% of emails are actually spam (ground truth).

(a) Which evaluation metric is considered the best choice: "precision", "recall", or "f1-score"?

(b) Explain the benefits of the best metric.

(c) Criticize the remaining metrics.

(a) f1-score.

(b) A high f1-score requires both the precision score and the recall score to be high, because it is their harmonic mean.

A high precision score means that few of the emails flagged as spam are actually non-spam.

A high recall score means that there is a low probability that a spam email is incorrectly classified as non-spam.
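
To make these definitions concrete, here is a minimal sketch that computes all three metrics from confusion-matrix counts. The function name `scores` and the counts are made up for illustration (10,000 emails, 100 of them spam, matching the 1% base rate); they are not part of the original question.

```python
def scores(tp, fp, fn):
    """Return (precision, recall, f1) from confusion-matrix counts."""
    # Precision: of the emails flagged as spam, what fraction really is spam?
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: of the real spam emails, what fraction was flagged?
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical filter on 10,000 emails, 100 of which are spam:
# it catches 90 spam (tp), misses 10 (fn), and flags 30 legitimate emails (fp).
precision, recall, f1 = scores(tp=90, fp=30, fn=10)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
# prints: precision=0.750 recall=0.900 f1=0.818
```

Note that the 9,870 correctly passed legitimate emails (true negatives) do not appear in any of the three formulas, which is why these metrics stay informative when the negative class dominates.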

(c) Precision: Even with a high precision score, there may still be a high probability that spam emails are incorrectly classified as non-spam.

Recall: Even with a high recall score, a large number of non-spam emails may still be incorrectly classified as spam.
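
The two criticisms can be illustrated with two deliberately extreme classifiers. This is only a sketch on made-up data (1,000 emails, 10 of them spam); the helper `evaluate` and both prediction lists are hypothetical.

```python
def evaluate(y_true, y_pred):
    """Return (precision, recall, f1) for binary labels, where 1 means spam."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# 1,000 emails; the first 10 are spam (1% base rate).
y_true = [1] * 10 + [0] * 990

# Extreme 1: flag every email as spam. Recall is perfect, precision is 1%.
flag_everything = [1] * 1000
# Extreme 2: flag a single spam email. Precision is perfect, recall is 10%.
flag_one = [1] + [0] * 999

everything = evaluate(y_true, flag_everything)
one = evaluate(y_true, flag_one)

for name, (p, r, f) in [("flag everything", everything), ("flag one", one)]:
    print(f"{name}: precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In both cases one metric is perfect while the f1-score stays low, since the harmonic mean is pulled toward the smaller of the two values.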
