Part 10 (5 points, non-coding task)
So far, we have finished preprocessing our dataset. We will use this dataset for binary classification.
In all remaining parts of this problem, we will train a logistic regression model on our preprocessed data and test its performance.
Let all training samples be indexed as $0, 1, \cdots, N-1$. For the $n$th sample, denote by $\mathbf{x}^{(n)} \in \Bbb R^{d \times 1}$ a column vector of all features and $y^{(n)} \in \left\{ 0, 1 \right\}$ its ground-truth label.
In logistic regression, denote by $\mathbf{\beta} \in \Bbb R^{d \times 1}$ the learnable parameters.
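Thus, our predicted label is determined according to:
$$\hat{y}^{(n)} = \sigma \left( \mathbf{\beta}^\top \mathbf{x}^{(n)} \right),$$
where
$$\sigma \left( z \right) = \frac{1}{1 + e^{-z}}$$
is the sigmoid function.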
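As a rough illustration of the sigmoid-based prediction described above, here is a minimal sketch in plain Python. It assumes the standard sigmoid $\sigma(z) = 1 / (1 + e^{-z})$ applied to the inner product of $\mathbf{\beta}$ and $\mathbf{x}^{(n)}$. The function names, the example values, and the 0.5 threshold used to turn a probability into a 0/1 label are assumptions made for illustration and are not specified in this part.

```python
import math

def sigmoid(z):
    """Logistic sigmoid 1 / (1 + exp(-z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # safe: z < 0, so exp(z) is in (0, 1)
    return ez / (1.0 + ez)

def predict_proba(beta, x):
    """Estimated probability that the label is 1: sigmoid of the inner product beta . x."""
    return sigmoid(sum(b * xi for b, xi in zip(beta, x)))

def predict_label(beta, x, threshold=0.5):
    """Hard 0/1 label. The 0.5 threshold is an assumption, not taken from the text."""
    return 1 if predict_proba(beta, x) >= threshold else 0

# Hypothetical parameters and feature vector, chosen so that beta . x = 0.
beta = [0.5, -1.0]
x = [2.0, 1.0]
print(predict_proba(beta, x))  # 0.5, because sigmoid(0) = 0.5
print(predict_label(beta, x))
```

Splitting the sigmoid into two branches is a common way to keep `math.exp` from overflowing when `z` is a large negative number; a vectorized version would typically follow the same idea.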
Do the following task in this part.
To train our model, we need to solve the following optimization problem:
$$\min_{\mathbf{\beta}} L \left( \mathbf{\beta} \right)$$
Write down the loss function $L \left(\mathbf{\beta} \right)$ in the following form (reasoning is not required):