The zeroSR1 package is based on a proximal quasi-Newton algorithm to solve
where is a smooth convex function and is a (possibly non-smooth, and possibly infinite) convex function such that the
the proximity operator is easy to compute,
the proximity operator is separable in the components of the variable, and
the proximity operator is piecewise linear.
Exploiting the nature of , we show in 'A quasi-Newton proximal splitting method’ (Becker, Fadili; NIPS 2012) that one can also compute the proximity operator of in a scaled norm:
where is a diagonal matrix, is a vector so that is a rank-1 matrix, and is .
Because we can efficiently solve for the scaled prox, it opens up the possibility of a quasi-Newton method. The SR1 update is a rank-1 update, and by using a 0-memory version, the updates to the inverse Hessian are in exactly the form of .
This means that for the same cost as a proximal gradient method (or an accelerated one, like FISTA), we can incorporate second order information, and the method converges very quickly.
The non-smooth term can be infinite valued; for example, it may be an indicator function of a set. The indicator function of a set is denoted
Equivalently, we are enforcing the constraint in the optimization problem.
We can solve the scaled prox of the following in time (compared to time for the regular prox) for inputs of dimension :
Function | mathematical representation |
l1 norm | |
non-negativity constraints | , |
l1 norm and non-negativity | |
box constraints | , |
norm ball | , |
hinge loss |
We have put a Matlab/Octave implementation on github, under the BSD 3-clause license. If you are interested in contributing a version in python or R, we will be glad to assist.
UPDATE March 2015: Andres Asensio Ramos has a python version on github.