2024 IAIO Question 1.2

beaver-edge · August 19, 2025, 3:29am

Consider a spam email detection system where only 1% (ground truth) of emails are spam.

(a) Which evaluation metric is considered the best choice: "precision", "recall", or "f1-score"?

(b) Explain the benefits of the best metric.

(c) Criticize the remaining metrics.

beaver-edge · August 19, 2025, 3:30am

(a) f1-score.

(b) A high f1-score requires that both precision and recall are high.

A high precision score means that few of the emails flagged as spam are actually non-spam.

A high recall score means that few spam emails are incorrectly classified as non-spam.
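
For reference, here are the standard definitions behind the statements above. The symbols TP, FP, and FN (true positives, false positives, false negatives, with "spam" as the positive class) are notation I am introducing here; they do not appear in the original question.

```latex
\[
\text{precision} = \frac{TP}{TP + FP}, \qquad
\text{recall} = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}
\]
```

Since F_1 is the harmonic mean of precision and recall, it is pulled toward the smaller of the two, so it can only be high when both are high.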

(c) Precision: Even with a high precision score, it is still possible that many spam emails are incorrectly classified as non-spam (low recall).

Recall: Even with a high recall score, it is still possible that many non-spam emails are incorrectly classified as spam (low precision).
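
A small numeric sketch may make the criticism in (c) concrete. The inbox size (10,000 emails, 100 of them spam) and the three filters below are made-up illustrations, not part of the original problem; the helper `precision_recall_f1` is a hypothetical name.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, f1) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Assumed inbox: 10,000 emails, 100 spam (1%) and 9,900 non-spam.

# Cautious filter: flags only 10 emails, all of them truly spam.
cautious = precision_recall_f1(tp=10, fp=0, fn=90)

# Aggressive filter: flags every email as spam.
aggressive = precision_recall_f1(tp=100, fp=9900, fn=0)

# Balanced filter: catches 90 spam emails with 10 false alarms.
balanced = precision_recall_f1(tp=90, fp=10, fn=10)

for name, (p, r, f1) in [
    ("cautious", cautious),
    ("aggressive", aggressive),
    ("balanced", balanced),
]:
    print(f"{name:10s} precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Under these assumed counts, the cautious filter has perfect precision but misses 90% of the spam, and the aggressive filter has perfect recall but is wrong on 99% of the emails it flags. In both cases the f1-score stays low, and only the balanced filter scores well on it.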
