2025 USA-NA-AIO Round 1, Problem 2, Part 2

Part 2 (10 points, non-coding task)

Define \nabla_{\mathbf{z}} \ f\left( \mathbf{z} \right) to be the gradient of function f with respect to vector/matrix \mathbf{z}.

Compute the following gradients. Reasoning is required.

  1. \nabla_{\mathbf{x}} \ \mathbf{y}.

    The final answer should be in a matrix form.

  2. \nabla_{\mathbf{W}} \ \mathbf{y}.

    The final answer should be in an element-wise form.

  3. \nabla_{\mathbf{b}} \ \mathbf{y}.

    The final answer should be in a matrix form.

\color{green}{\text{### WRITE YOUR SOLUTION HERE ###}}

  1. Since \mathbf{y} \in \Bbb R^M and \mathbf{x} \in \Bbb R^N, \nabla_{\mathbf{x}} \ \mathbf{y} \in \Bbb R^{M \times N}.

We have

\begin{align*} \frac{\partial y_m}{\partial x_n} & = \frac{\partial \left( \sum_{i=0}^{N-1} w_{mi} x_i + b_m \right)}{\partial x_n} \\ & = w_{mn} . \end{align*}

Therefore,

\boxed{\nabla_{\mathbf{x}} \ \mathbf{y} = \mathbf{W} }.
  2. Since \mathbf{y} \in \Bbb R^M and \mathbf{W} \in \Bbb R^{M \times N}, \nabla_{\mathbf{W}} \ \mathbf{y} \in \Bbb R^{M \times M \times N}.

With \delta_{mk} denoting the Kronecker delta (equal to 1 if m = k and 0 otherwise), we have

\begin{align*} \frac{\partial y_m}{\partial w_{kn}} & = \frac{\partial \left( \sum_{i=0}^{N-1} w_{mi} x_i + b_m \right)}{\partial w_{kn}} \\ & = \sum_{i=0}^{N-1} \delta_{mk} \delta_{in} x_i \\ & = \boxed{x_n \delta_{mk}}. \end{align*}
  3. Since \mathbf{y} \in \Bbb R^M and \mathbf{b} \in \Bbb R^M, \nabla_{\mathbf{b}} \ \mathbf{y} \in \Bbb R^{M \times M}.

We have

\begin{align*} \frac{\partial y_m}{\partial b_k} & = \frac{\partial \left( \sum_{i=0}^{N-1} w_{mi} x_i + b_m \right)}{\partial b_k} \\ & = \delta_{mk} . \end{align*}

Therefore,

\boxed{\nabla_{\mathbf{b}} \ \mathbf{y} = \mathbf{I}_{M \times M}} .
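As an optional sanity check (not part of the required non-coding solution), the three results can be compared against central finite differences. The sketch below assumes y_m = \sum_i w_{mi} x_i + b_m as used in the derivations; the sizes M = 3, N = 4, the random seed, the step size, the tolerance, and all helper names (`affine`, `central_diff`, `bump_vec`, `bump_mat`) are illustrative choices, not part of the problem statement.

```python
import random

def affine(W, x, b):
    """Return y = W x + b, with W a list of rows and x, b plain lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + bm for row, bm in zip(W, b)]

def central_diff(f, h=1e-3):
    """Estimate dy/dt at t = 0, where f(t) returns y after shifting one scalar by t."""
    plus, minus = f(h), f(-h)
    return [(p - m) / (2 * h) for p, m in zip(plus, minus)]

def bump_vec(v, idx, t):
    """Copy of vector v with t added to entry idx."""
    return [vi + (t if i == idx else 0.0) for i, vi in enumerate(v)]

def bump_mat(A, r, c, t):
    """Copy of matrix A with t added to entry (r, c)."""
    return [[a + (t if (i == r and j == c) else 0.0) for j, a in enumerate(row)]
            for i, row in enumerate(A)]

# Illustrative sizes and random data (assumptions, not from the problem).
M, N = 3, 4
rng = random.Random(0)
W = [[rng.uniform(-1, 1) for _ in range(N)] for _ in range(M)]
x = [rng.uniform(-1, 1) for _ in range(N)]
b = [rng.uniform(-1, 1) for _ in range(M)]

# dy_dx[m][n] approximates d y_m / d x_n; expected to match W[m][n].
cols_x = [central_diff(lambda t, n=n: affine(W, bump_vec(x, n, t), b)) for n in range(N)]
dy_dx = [[cols_x[n][m] for n in range(N)] for m in range(M)]

# dy_dW[m][k][n] approximates d y_m / d w_kn; expected to match x[n] * delta_mk.
dy_dW = [[[0.0] * N for _ in range(M)] for _ in range(M)]
for k in range(M):
    for n in range(N):
        col = central_diff(lambda t, k=k, n=n: affine(bump_mat(W, k, n, t), x, b))
        for m in range(M):
            dy_dW[m][k][n] = col[m]

# dy_db[m][k] approximates d y_m / d b_k; expected to match the identity matrix.
cols_b = [central_diff(lambda t, k=k: affine(W, x, bump_vec(b, k, t))) for k in range(M)]
dy_db = [[cols_b[k][m] for k in range(M)] for m in range(M)]

def close(a, c, tol=1e-8):
    return abs(a - c) < tol

ok_x = all(close(dy_dx[m][n], W[m][n]) for m in range(M) for n in range(N))
ok_W = all(close(dy_dW[m][k][n], x[n] if m == k else 0.0)
           for m in range(M) for k in range(M) for n in range(N))
ok_b = all(close(dy_db[m][k], 1.0 if m == k else 0.0)
           for m in range(M) for k in range(M))
print(ok_x, ok_W, ok_b)
```

Because \mathbf{y} is affine in each of \mathbf{x}, \mathbf{W}, and \mathbf{b}, central differences carry no truncation error here, so any mismatch beyond floating-point rounding would point to a mistake in the derivation rather than in the approximation.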

\color{red}{\text{""" END OF THIS PART """}}