Problem 3.
Part 3.1.
Consider two regularized linear regression models trained on the same dataset:
Model A uses L1 regularization. Model B uses L2 regularization. Both use the
same regularization strength λ > 0.
Which statement is most likely true?
A. Models A and B have the same number of non-zero weights.
B. Model A has fewer non-zero weights than Model B.
C. Model B has fewer non-zero weights than Model A.
D. Neither model can produce zero weights.
E. There is no way to know which model has fewer non-zero weights.
Part 3.2.
In supervised machine learning, as you increase the complexity of a model (for example, by increasing the degree of a polynomial in a regression model), which of the
following best describes the typical behavior of the error components due to bias and
variance?
A. Both Bias and Variance increase.
B. Both Bias and Variance decrease.
C. Bias and Variance remain constant regardless of complexity.
D. Bias decreases and Variance increases.
E. Bias increases and Variance decreases.
Part 3.1.
B. Model A has fewer non-zero weights than Model B.
Part 3.2.
D. Bias decreases and Variance increases.
Reasoning:
Part 3.1
L1 regularization adds the sum of the absolute values of the weights (λ·Σ|w_i|) to the loss as a penalty. This encourages many weights to become exactly 0. L2 regularization instead adds the sum of the squared weights (λ·Σ w_i²), which shrinks weights so they are often small, but not exactly 0. Geometrically, think of the constraint regions: the L1 ball is a diamond whose sharp corners sit on the axes (where some weights are exactly zero), while the L2 ball is a circle with no corners, so the solution rarely lands exactly on an axis.
Based on this logic, the correct option is (B.): fewer zero weights in the L2-regularized model (Model B) means more non-zero weights in Model B, which means Model A has fewer non-zero weights than Model B.
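To see this concretely, here is a minimal sketch (my own illustration, not part of the problem) using scikit-learn's Lasso (L1) and Ridge (L2) on a synthetic dataset; the dataset shape and alpha=1.0 (playing the role of λ) are arbitrary choices:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: 50 features, only 10 of which actually matter.
X, y = make_regression(n_samples=100, n_features=50,
                       n_informative=10, noise=5.0, random_state=0)

model_a = Lasso(alpha=1.0).fit(X, y)  # L1 penalty
model_b = Ridge(alpha=1.0).fit(X, y)  # L2 penalty

print("non-zero weights, L1 (Model A):", np.sum(model_a.coef_ != 0))
print("non-zero weights, L2 (Model B):", np.sum(model_b.coef_ != 0))
```

On a run like this, the Lasso model typically keeps only a handful of non-zero coefficients, while the Ridge model keeps all 50 small but non-zero.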
Part 3.2
As you increase the degree of the polynomial, the model can fit the training set more closely (decreased Bias), but small changes in the training data cause large fluctuations in the fitted model (increased Variance). This means the answer is (D.)
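To make that concrete, here is a rough sketch (my own illustration; the true function sin(2πx) and the noise level 0.3 are assumptions): refit polynomials of several degrees on many resampled training sets and estimate the bias² and variance of their predictions:

```python
import numpy as np

rng = np.random.default_rng(0)
x_test = np.linspace(0, 1, 50)
f = lambda x: np.sin(2 * np.pi * x)  # assumed true function

for degree in (1, 3, 9):
    preds = []
    for _ in range(200):  # 200 independently sampled training sets
        x_tr = rng.uniform(0, 1, 20)
        y_tr = f(x_tr) + rng.normal(0, 0.3, 20)
        coefs = np.polyfit(x_tr, y_tr, degree)
        preds.append(np.polyval(coefs, x_test))
    preds = np.array(preds)
    bias2 = np.mean((preds.mean(axis=0) - f(x_test)) ** 2)
    var = np.mean(preds.var(axis=0))
    print(f"degree {degree}: bias^2 = {bias2:.3f}, variance = {var:.3f}")
```

The low-degree fit shows high bias² and low variance; as the degree grows, bias² falls and variance rises, matching option (D.)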
For L1 vs L2 regularization, this video is very helpful: https://www.youtube.com/watch?v=OLl2nzOeQ68