Lemmas 1 and 2 jointly imply the theorem above. Please use the result in this theorem to explain why it is reasonable to use the cosine function to measure similarity of two embedding vectors and why the latent space needs to be high dimensional (such as 512, 768, 1024).
\color{green}{\text{### WRITE YOUR SOLUTION HERE ###}}
The theorem states that in a high-dimensional space, two random vectors are nearly orthogonal with overwhelming probability: their cosine similarity concentrates around zero, and only a vanishingly small fraction of pairs point in similar directions.

This is exactly the geometry we want when matching images and texts. Cosine similarity measures the angle between two embedding vectors and ignores their magnitudes, which is precisely the quantity the theorem controls: "related" means pointing in nearly the same direction (cosine close to 1), while "unrelated" means nearly orthogonal (cosine close to 0). For instance, suppose there are 30k image-text pairs. We want each image embedding vector to be aligned with exactly one text embedding vector while remaining nearly orthogonal to the other 30k-1 text embedding vectors. The theorem guarantees that a high-dimensional space has room for this many mutually near-orthogonal directions.

Recall that a key condition of the theorem is that the dimension must be high: in a low-dimensional space there is simply not enough room for tens of thousands of mutually near-orthogonal vectors. This is why image and text embedding vectors are made high dimensional, such as 512, 768, or 1024.
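This near-orthogonality is easy to check numerically. The sketch below (assuming NumPy is available; the function name \texttt{mean\_abs\_cosine} is our own, for illustration) estimates the average absolute cosine similarity between random Gaussian vectors: in 2 dimensions random directions overlap substantially, while in 512 dimensions they are almost orthogonal.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_abs_cosine(dim, n_pairs=2000):
    # Gaussian vectors have directions uniform on the unit sphere,
    # so this samples random directions in the given dimension.
    u = rng.standard_normal((n_pairs, dim))
    v = rng.standard_normal((n_pairs, dim))
    cos = np.sum(u * v, axis=1) / (
        np.linalg.norm(u, axis=1) * np.linalg.norm(v, axis=1)
    )
    return np.abs(cos).mean()

# In 2-D the average |cosine| is large; in 512-D it is close to zero,
# consistent with the theorem's claim that almost all pairs are
# nearly orthogonal in high dimension.
print(mean_abs_cosine(2))
print(mean_abs_cosine(512))
```

The average absolute cosine shrinks roughly like $1/\sqrt{d}$, which is why dimensions in the hundreds give tens of thousands of embedding vectors enough room to be mutually near-orthogonal.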