Neural Collective Matrix Factorization for integrated analysis of heterogeneous biomedical data.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: Many biomedical studies need to integrate data from multiple directly or indirectly related sources. Collective matrix factorization (CMF) and its variants are models designed to learn jointly from arbitrary collections of matrices. The learnt latent factors are rich integrative representations that can be used in downstream tasks, such as clustering or relation prediction, with standard machine-learning models. Previous CMF-based methods have numerous modeling limitations: they do not adequately capture complex non-linear interactions, they do not explicitly model varying sparsity and noise levels in the inputs, and some cannot model inputs with multiple datatypes. These inadequacies limit their use on many biomedical datasets.
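
To make the idea of learning collectively from several matrices concrete, the following is a minimal sketch of classical linear CMF, not the neural method this article proposes. It assumes two fully observed matrices that share one entity type (the columns of X are the rows of Y) and fits them with plain gradient descent on a squared-error loss. The function name `cmf` and the parameters `k`, `lr`, `reg`, `steps`, and `seed` are hypothetical choices made for illustration only.

```python
import numpy as np

def cmf(X, Y, k=2, lr=0.005, reg=0.01, steps=3000, seed=0):
    """Jointly factorize X (a x b) ~ U @ V.T and Y (b x c) ~ V @ W.T.

    V holds the latent factors of the entity type shared by both
    matrices, so it is updated using the errors of both reconstructions.
    Illustrative assumption: squared-error loss with L2 regularization.
    """
    rng = np.random.default_rng(seed)
    a, b = X.shape
    b_y, c = Y.shape
    if b != b_y:
        raise ValueError("columns of X must match rows of Y")

    # Small random initialization of the three factor matrices.
    U = 0.1 * rng.standard_normal((a, k))
    V = 0.1 * rng.standard_normal((b, k))
    W = 0.1 * rng.standard_normal((c, k))

    for _ in range(steps):
        err_x = U @ V.T - X  # reconstruction error for X, shape (a, b)
        err_y = V @ W.T - Y  # reconstruction error for Y, shape (b, c)

        # Gradients of 0.5*||err_x||^2 + 0.5*||err_y||^2 + 0.5*reg*(norms).
        grad_U = err_x @ V + reg * U
        grad_V = err_x.T @ U + err_y @ W + reg * V  # shared factor
        grad_W = err_y.T @ V + reg * W

        U -= lr * grad_U
        V -= lr * grad_V
        W -= lr * grad_W

    return U, V, W

# Usage on small synthetic low-rank data (made up for this sketch).
rng = np.random.default_rng(1)
U0 = rng.standard_normal((6, 2))
V0 = rng.standard_normal((5, 2))
W0 = rng.standard_normal((4, 2))
X, Y = U0 @ V0.T, V0 @ W0.T
U, V, W = cmf(X, Y, k=2)
print("X error:", np.linalg.norm(X - U @ V.T))
print("Y error:", np.linalg.norm(Y - V @ W.T))
```

The rows of the shared factor V are the kind of integrative representation the abstract refers to: they could be passed to a standard clustering or classification model. This linear sketch has exactly the limitations listed above, since it captures no non-linear interactions and treats every entry as equally reliable.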

Authors

  • Ragunathan Mariappan
    Department of Information Systems and Analytics, School of Computing, National University of Singapore, Singapore 117417, Singapore.
  • Aishwarya Jayagopal
    Department of Information Systems and Analytics, School of Computing, National University of Singapore, Singapore 117417, Singapore.
  • Ho Zong Sien
    Department of Information Systems and Analytics, School of Computing, National University of Singapore, Singapore 117417, Singapore.
  • Vaibhav Rajan
    Department of Information Systems and Analytics, School of Computing, National University of Singapore, Singapore 117417, Singapore.