2025 USA-NA-AIO Round 1, Problem 3, Part 17

Part 17 (15 points, coding task)

Construct a class called My_Log_Reg whose objects are logistic regression models.

  • Method __init__

    • Inputs

      • solver: The value must be GD or Newton; otherwise, it raises an error with the message Invalid solver.

      • lr: The learning rate.

      • num_iter: The total number of iterations.

    • Attributes

      • solver

      • lr

      • num_iter

      • coef_: \mathbf{\beta} in our theoretical model. It shall be a 1-dimensional numpy array with shape (d,).

  • Method fit

    • Inputs

      • X: Features in a training dataset. The shape is (N_train,d).

      • y: Ground-truth labels in a training dataset. The shape is (N_train,).

    • In the body of this method

      • Use the configured solver to compute coef_ (the update equations for both solvers are sketched after this specification).

      • Perform whole-batch iterations: each update uses the entire training set.

      • After coef_ is trained, generate a plot of the loss function versus the iteration number.

        • The x-label is iter.

        • The y-label is loss.

        • The title is the optimization method: either GD or Newton.

      • The only loop you may use is the whole-batch iteration loop. Within each iteration, when you update coef_ by applying either GD or Newton, you are not allowed to use any additional loop.

    • Output

      • None

  • Method predict

    • Input

      • X: Features in a test dataset. The shape is (N_test,d).

    • Output

      • y_pred: Predicted labels in the test dataset. The shape is (N_test,).

  • Method score

    • Input

      • X: Features in a test dataset. The shape is (N_test,d).

      • y: Ground-truth labels in the test dataset. The shape is (N_test,).

    • Output

      • accuracy_score: The accuracy score of the prediction of y.
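
For reference, the updates that fit is expected to implement follow from the standard negative log-likelihood loss for logistic regression. A brief sketch in matrix form, using \sigma for the sigmoid from the earlier parts, X for the design matrix with rows \mathbf{x}_i, and \odot for the elementwise product:

    L(\beta) = -\sum_{i=1}^{N} \left[ y_i \log \sigma(\mathbf{x}_i^{\top}\beta) + (1 - y_i) \log\bigl(1 - \sigma(\mathbf{x}_i^{\top}\beta)\bigr) \right]

    \nabla L(\beta) = X^{\top}\bigl(\sigma(X\beta) - y\bigr), \qquad H(\beta) = X^{\top}\operatorname{diag}\bigl(\sigma(X\beta) \odot (1 - \sigma(X\beta))\bigr)\,X

    \text{GD: } \beta \leftarrow \beta - \texttt{lr}\cdot\nabla L(\beta), \qquad \text{Newton: } \beta \leftarrow \beta - \texttt{lr}\cdot H(\beta)^{-1}\nabla L(\beta)

Both updates are pure matrix expressions, so they satisfy the no-loop constraint inside each iteration.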
### WRITE YOUR SOLUTION HERE ###

import numpy as np
import matplotlib.pyplot as plt

# my_sigmoid is assumed to be defined in an earlier part of this problem.

class My_Log_Reg:
    def __init__(self, solver, lr, num_iter):
        # The specification requires the solver check to happen here, in __init__.
        if solver not in ('GD', 'Newton'):
            raise ValueError('Invalid solver')
        self.solver = solver
        self.lr = lr
        self.num_iter = num_iter
        self.coef_ = None

    def fit(self, X, y):
        N, d = X.shape
        self.coef_ = np.zeros(d)

        def loss():
            # Whole-batch negative log-likelihood, computed without a loop.
            p = my_sigmoid(X @ self.coef_)
            return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

        loss_history = [loss()]

        # The whole-batch iteration is the only loop; each update is fully vectorized.
        for _ in range(self.num_iter):
            p = my_sigmoid(X @ self.coef_)
            params_grad = X.T @ (p - y)
            if self.solver == 'GD':
                self.coef_ -= self.lr * params_grad
            else:  # 'Newton'
                params_Hessian = X.T @ np.diag(p * (1 - p)) @ X
                # Solving the linear system is cheaper and more stable
                # than forming the explicit inverse of the Hessian.
                self.coef_ -= self.lr * np.linalg.solve(params_Hessian, params_grad)
            loss_history.append(loss())

        plt.plot(loss_history)
        plt.xlabel('iter')
        plt.ylabel('loss')
        plt.title(self.solver)  # the solver name, not the literal string 'self.solver'
        plt.show()

    def predict(self, X):
        # Threshold the predicted probabilities at 0.5.
        y_pred = np.where(my_sigmoid(X @ self.coef_) >= 0.5, 1, 0)
        return y_pred

    def score(self, X, y):
        # Fraction of test labels predicted correctly.
        y_pred = self.predict(X)
        accuracy_score = np.mean(y_pred == y)
        return accuracy_score
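
For a quick sanity check, here is a minimal usage sketch on a synthetic dataset. The stand-in my_sigmoid below is an assumption (the graded version comes from an earlier part of this problem), and the dataset is illustrative only.

# Illustrative smoke test; not part of the graded solution.
import numpy as np

def my_sigmoid(z):
    # Stand-in, assuming the earlier part defined the standard logistic function.
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 3))
true_beta = np.array([1.5, -2.0, 0.5])
# Sample noisy labels from the true model's probabilities.
y_train = (rng.random(200) < my_sigmoid(X_train @ true_beta)).astype(int)

model = My_Log_Reg(solver='Newton', lr=1.0, num_iter=20)
model.fit(X_train, y_train)           # also draws the loss-vs-iteration plot
print(model.score(X_train, y_train))  # training accuracy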



""" END OF THIS PART """