Seagull: lasso, group lasso and sparse-group lasso regularization for linear regression models via proximal gradient descent.

Journal: BMC bioinformatics
Published Date:

Abstract

BACKGROUND: Statistical analyses of biological problems in the life sciences often lead to high-dimensional linear models. To solve the corresponding system of equations, penalization approaches are often the methods of choice. They are especially useful in the case of multicollinearity, which arises when the number of explanatory variables exceeds the number of observations, or for biological reasons. In such cases, the goodness of fit of the model is penalized by a suitable penalty function. Prominent examples are the lasso, group lasso and sparse-group lasso. Here, we offer a fast and numerically cheap implementation of these operators via proximal gradient descent. The grid search for the penalty parameter is realized by warm starts. The step size between consecutive iterations is determined by backtracking line search. Finally, seagull, the R package presented here, produces complete regularization paths.
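
The three ingredients named in the abstract (proximal gradient descent, backtracking line search, and warm starts along a penalty grid) can be illustrated with a minimal sketch for the plain lasso. This is not the seagull code and does not use its interface. It is written in Python for illustration, and it assumes the common objective (1/(2n)) * ||y - Xb||^2 + lam * ||b||_1, whose scaling may differ from the one used in the package. All function names and default values (soft_threshold, prox_grad_lasso, lasso_path, step, shrink, tol) are hypothetical.

```python
# Illustrative sketch only; NOT the seagull implementation or its API.
# Assumed objective: (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1

def soft_threshold(z, t):
    """Proximal operator of t * |.|: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def residual(X, y, b):
    """Return X b - y for a row-major list-of-lists X."""
    return [sum(x * bj for x, bj in zip(row, b)) - yi for row, yi in zip(X, y)]

def loss(X, y, b):
    """Smooth part of the objective: (1 / (2n)) * ||y - X b||^2."""
    return sum(r * r for r in residual(X, y, b)) / (2 * len(y))

def gradient(X, y, b):
    """Gradient of the smooth part: X^T (X b - y) / n."""
    r = residual(X, y, b)
    n = len(y)
    return [sum(X[i][j] * r[i] for i in range(n)) / n for j in range(len(b))]

def prox_grad_lasso(X, y, lam, b0, step=1.0, shrink=0.5, tol=1e-10, max_iter=10000):
    """Proximal gradient descent for one value of lam, started at b0."""
    b = list(b0)
    for _ in range(max_iter):
        g = gradient(X, y, b)
        f_old = loss(X, y, b)
        t = step
        while True:
            # Gradient step on the smooth part, then the lasso proximal step.
            b_new = [soft_threshold(bj - t * gj, t * lam) for bj, gj in zip(b, g)]
            d = [bn - bo for bn, bo in zip(b_new, b)]
            # Backtracking: accept t once the quadratic upper bound holds.
            bound = (f_old + sum(gj * dj for gj, dj in zip(g, d))
                     + sum(dj * dj for dj in d) / (2 * t))
            if loss(X, y, b_new) <= bound + 1e-12:
                break
            t *= shrink
        if max(abs(dj) for dj in d) < tol:
            return b_new
        b = b_new
    return b

def lasso_path(X, y, lambdas):
    """Solve from the largest to the smallest lam, reusing each solution
    as the starting point for the next one (warm start)."""
    b = [0.0] * len(X[0])
    path = []
    for lam in sorted(lambdas, reverse=True):
        b = prox_grad_lasso(X, y, lam, b)
        path.append((lam, list(b)))
    return path
```

Calling lasso_path(X, y, lambdas) returns one (lam, coefficients) pair per grid value, ordered from the strongest to the weakest penalty. The group lasso and sparse-group lasso would replace soft_threshold with the proximal operator of their respective penalties; that step is not shown here.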

Authors

  • Jan Klosa
    Institute of Genetics and Biometry, Leibniz Institute for Farm Animal Biology, 18196, Dummerstorf, Germany.
  • Noah Simon
    Department of Biostatistics, University of Washington, Seattle, WA, 98195, USA.
  • Pål Olof Westermark
    Institute of Genetics and Biometry, Leibniz Institute for Farm Animal Biology, 18196, Dummerstorf, Germany.
  • Volkmar Liebscher
    Institute of Mathematics and Computer Science, University of Greifswald, 17489, Greifswald, Germany.
  • Dörte Wittenburg
    Institute of Genetics and Biometry, Leibniz Institute for Farm Animal Biology, 18196, Dummerstorf, Germany. wittenburg@fbn-dummerstorf.de.