2025 USA-NA-AIO Round 2, Problem 3, Part 13

Part 13 (5 points, coding task)

Do the following tasks:

  1. Define your model by calling model_CLIP = MyCLIP().

  2. Freeze all parameter values in the ViT and BERT blocks of your model. That is, you are only allowed to train:

    • Out-projection matrices in the image and text encoders.

    • Temperature.

### WRITE YOUR SOLUTION HERE ###

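# Task 1: instantiate the CLIP model.
model_CLIP = MyCLIP()

# Task 2: freeze every parameter of the text encoder backbone (BERT).
for param in model_CLIP.model_text.parameters():
    param.requires_grad = False

# Freeze every parameter of the image encoder backbone (ViT).
# This assumes the out-projection matrices and the temperature are registered
# on MyCLIP outside model_text and model_image, so they remain trainable.
for param in model_CLIP.model_image.parameters():
    param.requires_grad = False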

""" END OF THIS PART """