2025 USA-NA-AIO Round 2, Problem 1, Part 11

Part 11 (5 points, non-coding task)

Answer the following free-response question.

  • In each epoch, while iterating over the mini-batches of dataset_PDE, why do we use all data points in dataset_IC and dataset_BC, rather than also drawing mini-batches from these two datasets?

To be specific, recall that the mini-batch size of dataset_PDE is 32. The sizes of dataset_IC and dataset_BC are 101 and 202, respectively.

Then in each iteration, the number of data points that we use to compute the total loss value is 32 + 101 + 202 = 335.

Suppose we instead also draw mini-batches from the IC and BC datasets with the same mini-batch size, say, 32. Then in each iteration, the number of data points that we use is 32 + 32 + 32 = 96.
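The two counts can be checked with a few lines of Python. This is only an arithmetic sketch built from the sizes stated in the problem. The variable names are illustrative and do not come from the original notebook.

```python
# Sizes stated in the problem (variable names are illustrative).
PDE_BATCH = 32   # mini-batch size of dataset_PDE
N_IC = 101       # size of dataset_IC
N_BC = 202       # size of dataset_BC

# Former approach: one PDE mini-batch plus ALL IC and BC points.
points_full = PDE_BATCH + N_IC + N_BC

# Latter approach: a mini-batch of 32 from each of the three datasets.
points_mini = 3 * PDE_BATCH

# Share of the IC / BC constraints touched per iteration in the latter approach.
ic_fraction = PDE_BATCH / N_IC
bc_fraction = PDE_BATCH / N_BC

print(points_full, points_mini)  # 335 96
print(f"{ic_fraction:.1%} of IC points, {bc_fraction:.1%} of BC points")
```

With mini-batching, each iteration would see roughly a third of the IC points and a sixth of the BC points, while the former approach costs only 239 extra points per iteration.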

We adopt the former approach, not the latter approach. You need to explain why.

\color{green}{\text{### WRITE YOUR SOLUTION HERE ###}}

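The initial and boundary conditions are constraints that the trained solution must satisfy at every IC and BC point, so we enforce all of them in every iteration rather than only a random subset. Because dataset_IC and dataset_BC are small (101 and 202 points), evaluating their losses on all points in each iteration adds little cost.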
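The loop structure this describes might look like the following index-only sketch. It assumes a hypothetical dataset_PDE size of 320, which the problem does not state, and replaces the actual loss computation with a count of the points that would enter the total loss. The helper `pde_mini_batches` is likewise made up for illustration.

```python
import random

PDE_BATCH = 32
N_IC, N_BC = 101, 202
N_PDE = 320  # ASSUMPTION: hypothetical size, not given in the problem


def pde_mini_batches(n_points, batch_size, rng):
    """Yield shuffled index mini-batches covering the PDE dataset once."""
    indices = list(range(n_points))
    rng.shuffle(indices)
    for start in range(0, n_points, batch_size):
        yield indices[start:start + batch_size]


rng = random.Random(0)
ic_indices = list(range(N_IC))  # every IC point, reused in each iteration
bc_indices = list(range(N_BC))  # every BC point, reused in each iteration

points_per_iteration = []
for pde_batch in pde_mini_batches(N_PDE, PDE_BATCH, rng):
    # In real training, the total loss would combine the PDE loss on
    # pde_batch with the IC and BC losses on ALL of their points.
    n_used = len(pde_batch) + len(ic_indices) + len(bc_indices)
    points_per_iteration.append(n_used)

print(points_per_iteration)
```

Only the PDE indices change from one iteration to the next. The IC and BC index lists are fixed, so every gradient step sees every constraint.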

\color{red}{\text{""" END OF THIS PART """}}