Incomplete multi-view gene clustering with data regeneration using Shape Boltzmann Machine.

Journal: Computers in biology and medicine
Published Date:

Abstract

Deciphering patterns in the structural and functional anatomy of genes can be very helpful in understanding genetic biology and genomics. The availability of multi-omics data, along with the advent of machine learning techniques, also helps medical professionals gain insights into various biological regulations. Gene clustering is one of many computational techniques that can help in understanding gene behavior. More comprehensive and reliable insights can be gained when different modalities/views of biomedical data are considered together. In most multi-view cases, however, each view contains some missing data, which leads to the problem of incomplete multi-view clustering. In this study, we present a deep Boltzmann machine-based incomplete multi-view clustering framework for gene clustering. We seek to regenerate the missing data in the incomplete modalities of three NCBI datasets using Shape Boltzmann Machines. The overall performance of the proposed multi-view clustering technique is evaluated using the Silhouette index and the Davies-Bouldin index, and the comparative analysis shows an improvement over state-of-the-art methods. Finally, we perform Welch's t-test to show that the improvement attained by the proposed incomplete multi-view clustering is statistically significant. AVAILABILITY OF DATA AND MATERIALS: https://github.com/piyushmishra12/IMC.
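
The abstract does not give the details of the significance test. As a rough illustration only, the Welch's t-test step could be sketched as follows. The function name welch_t_test and the score lists are hypothetical and are not taken from the paper or its repository. The sketch computes the t statistic and the Welch-Satterthwaite degrees of freedom for two samples with possibly unequal variances, using only the Python standard library.

```python
from math import sqrt
from statistics import mean, variance


def welch_t_test(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom).

    Hypothetical helper for illustration; not the authors' code.
    """
    na, nb = len(a), len(b)
    # Sample variances (n - 1 denominator), each divided by its sample size.
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    # Welch's t statistic: difference of means over the unpooled standard error.
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1)
    )
    return t, df


# Made-up Silhouette scores from repeated runs (NOT results from the paper).
proposed = [0.62, 0.65, 0.61, 0.66, 0.64]
baseline = [0.55, 0.58, 0.54, 0.57, 0.56]

t_stat, dof = welch_t_test(proposed, baseline)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

The p-value would then be read from Student's t distribution with the computed degrees of freedom. If SciPy is available, scipy.stats.ttest_ind(a, b, equal_var=False) performs the same test and returns the p-value directly.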

Authors

  • Pratik Dutta
    Department of Computer Science and Engineering, Indian Institute of Technology, Patna, India.
  • Piyush Mishra
    Department of Computer Science and Engineering, IIIT, Bhubaneswar, India.
  • Sriparna Saha