Integrating different data types by regularized unsupervised multiple kernel learning with application to cancer subtype discovery.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: Despite ongoing cancer research, available therapies are still limited in number and effectiveness, and making treatment decisions for individual patients remains a hard problem. Established subtypes, which help guide these decisions, are mainly based on individual data types. However, the analysis of multidimensional patient data involving measurements of various molecular features could reveal intrinsic characteristics of the tumor. Large-scale projects accumulate this kind of data for various cancer types, but we still lack computational methods to reliably integrate this information in a meaningful manner. Therefore, we apply and extend current approaches to multiple kernel learning for dimensionality reduction. On the one hand, we add a regularization term to avoid overfitting during the optimization procedure; on the other hand, we show that one can even use several kernels per data type and thereby relieve the user of having to choose the best kernel functions and kernel parameters for each data type beforehand.
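
The idea of using several kernels per data type can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Gaussian kernel choice, the bandwidth factors, the synthetic data, and the constraint that the weights are non-negative and sum to 1 are all assumptions made for illustration. The weights are fixed to uniform values here, whereas the approach described above learns them in a regularized optimization, which is not shown.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix for the rows of X."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip tiny negatives from round-off
    return np.exp(-gamma * sq_dists)

def kernels_per_data_type(data_types, gamma_factors=(0.5, 1.0, 2.0)):
    """Build several RBF kernels per data type, one per bandwidth factor."""
    kernels = []
    for X in data_types:
        base_gamma = 1.0 / X.shape[1]  # hypothetical default: 1 / number of features
        for factor in gamma_factors:
            kernels.append(rbf_kernel(X, factor * base_gamma))
    return kernels

def combine_kernels(kernels, beta):
    """Weighted sum of kernel matrices (assumed: weights >= 0, summing to 1)."""
    beta = np.asarray(beta, dtype=float)
    if np.any(beta < 0) or not np.isclose(beta.sum(), 1.0):
        raise ValueError("beta must be non-negative and sum to 1")
    return sum(b * K for b, K in zip(beta, kernels))

# Synthetic placeholder data: 20 patients measured in two data types.
rng = np.random.default_rng(0)
expression = rng.normal(size=(20, 50))
methylation = rng.normal(size=(20, 30))

kernels = kernels_per_data_type([expression, methylation])
beta = np.full(len(kernels), 1.0 / len(kernels))  # uniform weights, not learned
K = combine_kernels(kernels, beta)
```

Because every candidate kernel enters the combination, no single kernel function or bandwidth has to be chosen per data type in advance; a learning procedure would instead down-weight the uninformative ones.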

Authors

  • Nora K Speicher
    Department of Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Campus E1.4, 66123 Saarbrücken and Saarbrücken Graduate School of Computer Science, Saarland University, 66123 Saarbrücken.
  • Nico Pfeifer
    Department of Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Campus E1.4, 66123 Saarbrücken and Saarbrücken Graduate School of Computer Science, Saarland University, 66123 Saarbrücken.