Missing value imputation for microRNA expression data by using a GO-based similarity measure.

Journal: BMC bioinformatics
Published Date:

Abstract

BACKGROUND: Missing values are common in microarray data profiles. Instead of discarding genes or samples with incomplete expression levels, it is better to impute the missing values properly so that downstream analysis remains accurate. Imputation methods fall roughly into two categories: expression level-based and domain knowledge-based. Methods of the first type rely solely on the expression data, without external data sources, while methods of the second type combine available domain knowledge with the expression data to improve imputation accuracy. In recent years, microRNA (miRNA) microarrays have been widely developed and used to identify miRNA biomarkers in studies of complex human diseases. Like mRNA profiles, miRNA expression profiles with missing values can be treated with existing imputation methods. However, domain knowledge-based methods are hard to apply because miRNAs lack direct functional annotation. With the rapid accumulation of miRNA microarray data, there is a growing need for domain knowledge-based imputation algorithms specific to miRNA expression profiles, in order to improve the quality of miRNA data analysis.
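
The distinction between the two categories can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a generic weighted k-nearest-neighbour imputation in plain Python, in which the function name knn_impute, the optional prior_sim matrix (a hypothetical row-by-row similarity in [0, 1] taken from some external annotation source), and the mixing weight alpha are all assumptions made for illustration. With prior_sim omitted the sketch is purely expression level-based; supplying it blends in domain knowledge.

```python
import math

def knn_impute(rows, k=2, prior_sim=None, alpha=0.5):
    """Fill missing entries (None) in a list of expression profiles.

    rows      -- list of profiles (one per gene or miRNA), None marks a gap
    prior_sim -- hypothetical external similarity matrix, values in [0, 1]
    alpha     -- assumed weight given to prior_sim when it is supplied
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        missing_cols = [c for c, v in enumerate(row) if v is None]
        if not missing_cols:
            continue
        # Similarity of profile i to every other profile.
        sims = {}
        for j, other in enumerate(rows):
            if j == i:
                continue
            shared = [(a, b) for a, b in zip(row, other)
                      if a is not None and b is not None]
            if not shared:
                continue
            # Expression-based part: RMS distance over shared samples.
            dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
            sim = 1.0 / (1.0 + dist)
            # Knowledge-based part: blend in the external similarity.
            if prior_sim is not None:
                sim = (1 - alpha) * sim + alpha * prior_sim[i][j]
            sims[j] = sim
        for c in missing_cols:
            # Keep the k most similar profiles that have a value in column c.
            cands = sorted((j for j in sims if rows[j][c] is not None),
                           key=lambda j: sims[j], reverse=True)[:k]
            if not cands:
                continue  # nothing to borrow from; leave the gap
            total = sum(sims[j] for j in cands)
            if total > 0:
                filled[i][c] = sum(sims[j] * rows[j][c] for j in cands) / total
            else:
                filled[i][c] = sum(rows[j][c] for j in cands) / len(cands)
    return filled

profiles = [[1.0, 2.0, 3.0], [1.0, 2.0, None], [10.0, 20.0, 30.0]]
expression_only = knn_impute(profiles, k=1)
```

In this toy input the second profile borrows its missing value from the first, which is its nearest neighbour by expression alone. Passing a prior_sim matrix that favours a different profile can change which neighbour is chosen, which is the effect domain knowledge-based methods aim for.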

Authors

  • Yang Yang
    Department of Gastrointestinal Surgery, The Third Hospital of Hebei Medical University, Shijiazhuang, China.
  • Zhuangdi Xu
    Department of Computer Science and Engineering, Shanghai Jiao Tong University, 800 Dongchuan Rd., Shanghai, 200240, China. xzdandy@gmail.com.
  • Dandan Song
    Key Laboratory of Luminescence and Optical Information, Beijing Jiaotong University, Ministry of Education, Beijing 100044, China. zhengxu@bjtu.edu.cn, ddsong@bjtu.edu.cn.