The Perceptron algorithm applied to the training set
((\mathbf{x}_1, y_1), \ldots, (\mathbf{x}_m, y_m))
involves the following training loop:
while not converged do
    select i from \{1, \ldots, m\}
    if y_i \langle \mathbf{w}, \mathbf{x}_i \rangle \le 0 then
        \mathbf{w} \leftarrow \mathbf{w} + y_i \mathbf{x}_i
    end
end
If \mathbf{w} is initialised to \mathbf{0}, show that \mathbf{w} can be expressed as a linear combination of the training data,
\mathbf{w} = \sum_{i=1}^m \alpha_i y_i \mathbf{x}_i,
explaining what the value of \alpha_i will be at any stage of the algorithm.
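As a numerical sanity check of the claimed identity (not a proof), the sketch below runs the update loop while also counting, for each example i, the number of times it has triggered an update; that count plays the role of \alpha_i. The function name, the toy data, and the epoch cap are all assumptions for illustration.

```python
import numpy as np

def perceptron_with_alphas(X, y, max_epochs=100):
    """Perceptron loop that also records alpha_i, the number of times
    example i has triggered an update.  Since w starts at 0 and every
    update adds y_i * x_i, at every stage w = sum_i alpha_i * y_i * x_i."""
    m, d = X.shape
    w = np.zeros(d)
    alpha = np.zeros(m)  # alpha_i = update count for example i
    for _ in range(max_epochs):
        converged = True
        for i in range(m):
            if y[i] * np.dot(w, X[i]) <= 0:  # misclassified (or on the boundary)
                w += y[i] * X[i]
                alpha[i] += 1
                converged = False
        if converged:
            break
    return w, alpha

# Toy linearly separable data, assumed for illustration.
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, alpha = perceptron_with_alphas(X, y)
# The invariant w = sum_i alpha_i y_i x_i holds by construction:
assert np.allclose(w, (alpha * y) @ X)
```

The check confirms the intended answer: since each update adds y_i \mathbf{x}_i to an initially zero \mathbf{w}, the coefficient \alpha_i is simply the number of times example i has been misclassified (and so used in an update) up to that point in the run.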