USAAIO
Part 2 (10 points, non-coding task)
Define \nabla_{\mathbf{z}} \ f\left( \mathbf{z} \right) to be the gradient of a function f with respect to a vector or matrix \mathbf{z}.
Compute the following gradients. Reasoning is required.
-
\nabla_{\mathbf{x}} \ \mathbf{y}.
The final answer should be in a matrix form.
-
\nabla_{\mathbf{W}} \ \mathbf{y}.
The final answer should be in an element-wise form.
-
\nabla_{\mathbf{b}} \ \mathbf{y}.
The final answer should be in a matrix form.
\color{green}{\text{### WRITE YOUR SOLUTION HERE ###}}
- Since \mathbf{y} \in \Bbb R^M and \mathbf{x} \in \Bbb R^N, \nabla_{\mathbf{x}} \ \mathbf{y} \in \Bbb R^{M \times N}.
We have
\begin{align*}
\frac{\partial y_m}{\partial x_n}
& = \frac{\partial \left( \sum_{i=0}^{N-1} w_{mi} x_i + b_m \right)}{\partial x_n} \\
& = w_{mn} .
\end{align*}
Therefore,
\boxed{\nabla_{\mathbf{x}} \ \mathbf{y} = \mathbf{W} }.
- Since \mathbf{y} \in \Bbb R^M and \mathbf{W} \in \Bbb R^{M \times N}, \nabla_{\mathbf{W}} \ \mathbf{y} \in \Bbb R^{M \times M \times N}.
We have
\begin{align*}
\frac{\partial y_m}{\partial w_{kn}}
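& = \frac{\partial \left( \sum_{i=0}^{N-1} w_{mi} x_i + b_m \right)}{\partial w_{kn}} \\
& = \sum_{i=0}^{N-1} \delta_{mk} \delta_{in} x_i \\
& = \boxed{x_n \delta_{mk}}.
\end{align*}
Here \delta_{mk} is the Kronecker delta, equal to 1 if m = k and 0 otherwise.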
- Since \mathbf{y} \in \Bbb R^M and \mathbf{b} \in \Bbb R^M, \nabla_{\mathbf{b}} \ \mathbf{y} \in \Bbb R^{M \times M}.
We have
\begin{align*}
\frac{\partial y_m}{\partial b_k}
& = \frac{\partial \left( \sum_{i=0}^{N-1} w_{mi} x_i + b_m \right)}{\partial b_k} \\
& = \delta_{mk} .
\end{align*}
Therefore,
\boxed{\nabla_{\mathbf{b}} \ \mathbf{y} = \mathbf{I}_{M \times M}} .
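The three results above can be sanity-checked numerically. The following is a minimal sketch, not part of the required solution: it assumes \mathbf{y} = \mathbf{W} \mathbf{x} + \mathbf{b} as in the derivations, picks arbitrary illustrative sizes M = 2, N = 3 and arbitrary values, and compares forward finite differences against w_{mn}, x_n \delta_{mk}, and \delta_{mk}. The helper names `affine` and `delta` are hypothetical and introduced only for this check.

```python
# Finite-difference check of the gradients of y = W x + b.
# All sizes and numeric values are arbitrary illustrative assumptions.

def affine(W, x, b):
    """Return y = W x + b, with W a list of rows and x, b plain lists."""
    return [sum(w_mi * x_i for w_mi, x_i in zip(row, x)) + b_m
            for row, b_m in zip(W, b)]

def delta(m, k):
    """Kronecker delta."""
    return 1.0 if m == k else 0.0

M, N = 2, 3
W = [[1.0, -2.0, 0.5],
     [3.0, 4.0, -1.0]]
x = [0.5, -1.5, 2.0]
b = [0.1, -0.2]
eps = 1e-4                      # perturbation size
y0 = affine(W, x, b)

# jac_x[m][n] approximates d y_m / d x_n (expected: w_mn).
jac_x = [[0.0] * N for _ in range(M)]
for n in range(N):
    x_pert = list(x)
    x_pert[n] += eps
    y1 = affine(W, x_pert, b)
    for m in range(M):
        jac_x[m][n] = (y1[m] - y0[m]) / eps

# jac_W[m][k][n] approximates d y_m / d w_kn (expected: x_n * delta_mk).
jac_W = [[[0.0] * N for _ in range(M)] for _ in range(M)]
for k in range(M):
    for n in range(N):
        W_pert = [list(row) for row in W]
        W_pert[k][n] += eps
        y1 = affine(W_pert, x, b)
        for m in range(M):
            jac_W[m][k][n] = (y1[m] - y0[m]) / eps

# jac_b[m][k] approximates d y_m / d b_k (expected: delta_mk).
jac_b = [[0.0] * M for _ in range(M)]
for k in range(M):
    b_pert = list(b)
    b_pert[k] += eps
    y1 = affine(W, x, b_pert)
    for m in range(M):
        jac_b[m][k] = (y1[m] - y0[m]) / eps

# Largest absolute deviation from each closed-form result.
err_x = max(abs(jac_x[m][n] - W[m][n])
            for m in range(M) for n in range(N))
err_W = max(abs(jac_W[m][k][n] - x[n] * delta(m, k))
            for m in range(M) for k in range(M) for n in range(N))
err_b = max(abs(jac_b[m][k] - delta(m, k))
            for m in range(M) for k in range(M))

print("max error wrt x:", err_x)
print("max error wrt W:", err_W)
print("max error wrt b:", err_b)
```

Because \mathbf{y} is linear in each of \mathbf{x}, \mathbf{W}, and \mathbf{b}, the finite differences carry no truncation error, so the three deviations should be at the level of floating-point rounding. The nested list `jac_W` has shape M \times M \times N, matching the dimension stated for \nabla_{\mathbf{W}} \ \mathbf{y}.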
\color{red}{\text{""" END OF THIS PART """}}