The Perceptron algorithm applied to the training set
$((\mathbf{x}_1, y_1), \ldots, (\mathbf{x}_m, y_m))$
involves the following training loop:
while not converged do
    select i from {1, ..., m}
    if y_i <w, x_i> <= 0 then
        w <- w + y_i x_i
    end if
end while
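
For concreteness, a minimal NumPy sketch of this loop is given below; the function name perceptron_train, the fixed pass over the indices, and the max_epochs cap are illustrative assumptions rather than part of the algorithm as stated.

import numpy as np

def perceptron_train(X, y, max_epochs=1000):
    """Run the Perceptron loop on inputs X (m x d) and labels y in {-1, +1}."""
    m, d = X.shape
    w = np.zeros(d)                            # w initialised to 0
    for _ in range(max_epochs):                # cap on "while not converged"
        converged = True
        for i in range(m):                     # select i from {1, ..., m}
            if y[i] * np.dot(w, X[i]) <= 0:    # mistake (or on the boundary)
                w = w + y[i] * X[i]            # w <- w + y_i x_i
                converged = False
        if converged:
            break
    return w

On linearly separable data this loop terminates with a separating w; otherwise the epoch cap (an assumption added here) stops it.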
If $\mathbf{w}$ is initialised to $\mathbf{0}$, show that $\mathbf{w}$ can be expressed as a linear combination
\[
\mathbf{w} = \sum_{i=1}^{m} \alpha_i y_i \mathbf{x}_i
\]
of the training data, explaining what the value of $\alpha_i$ will be at any stage of the algorithm.
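
As a numerical illustration of the identity to be shown (not a proof), the sketch below keeps a coefficient vector alpha alongside w during the same loop and checks at the end that w equals the claimed linear combination; the toy data and the function name are assumptions introduced for this example.

import numpy as np

def perceptron_train_dual(X, y, max_epochs=1000):
    """Same loop as above, additionally recording a coefficient alpha_i per example."""
    m, d = X.shape
    w = np.zeros(d)
    alpha = np.zeros(m)
    for _ in range(max_epochs):
        converged = True
        for i in range(m):
            if y[i] * np.dot(w, X[i]) <= 0:
                w = w + y[i] * X[i]
                alpha[i] += 1                  # one more update driven by example i
                converged = False
        if converged:
            break
    return w, alpha

# Toy linearly separable data (an assumption for this check).
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w, alpha = perceptron_train_dual(X, y)
# w = sum_i alpha_i y_i x_i; the same identity holds at every stage of the loop.
assert np.allclose(w, (alpha * y) @ X)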