Fold-LTR-TCP: protein fold recognition based on triadic closure principle.

Journal: Briefings in Bioinformatics
Published Date:

Abstract

Protein fold recognition is an important task in protein structure and function studies and has attracted increasing attention. Existing computational predictors treat the task as a multi-class classification problem and ignore the relationships among the proteins in the dataset, although previous studies have shown that these relationships are critical for protein homology analysis. In this study, protein fold recognition was instead treated as an information retrieval task. A supervised Learning to Rank (LTR) model was employed to search the query protein against the template proteins and find the templates that share the query's fold. The triadic closure principle (TCP) was then applied to the ranking list generated by the LTR model to improve its accuracy by considering the relationships among the query protein and the template proteins in the list. The resulting predictor is called Fold-LTR-TCP. In a rigorous test on the LE benchmark dataset, Fold-LTR-TCP achieved an accuracy of 73.2%, outperforming all the other competing methods.
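
The two-stage idea in the abstract (rank the templates first, then adjust the ranking using relationships among the proteins) can be illustrated with a toy sketch. This is not the authors' formulation: the function name `tcp_rerank`, the parameters `top_k` and `weight`, the scoring rule, and all numbers are assumptions made up for illustration. The sketch assumes that the initial query-to-template scores stand in for the output of an LTR model and that a template-to-template similarity is available. Triadic closure is read here as: if the query is strongly linked to template A, and A is strongly linked to template B, then the query is probably linked to B as well.

```python
def tcp_rerank(query_scores, template_sim, top_k=1, weight=0.5):
    """Toy triadic-closure re-ranking (illustrative assumption, not the paper's method).

    query_scores: dict mapping template id -> initial ranking score for the query
    template_sim: dict mapping frozenset({a, b}) -> similarity between templates a and b
    top_k:        number of top-ranked templates used as anchors
    weight:       how much a closed triad (query, anchor, template) boosts the score
    """
    # Anchors are the templates the initial ranking trusts most.
    anchors = sorted(query_scores, key=query_scores.get, reverse=True)[:top_k]

    adjusted = {}
    for template, score in query_scores.items():
        # Strength of the best two-step path: query -> anchor -> template.
        support = max(
            (
                query_scores[a] * template_sim.get(frozenset((a, template)), 0.0)
                for a in anchors
                if a != template
            ),
            default=0.0,
        )
        adjusted[template] = score + weight * support

    ranking = sorted(adjusted, key=adjusted.get, reverse=True)
    return ranking, adjusted


# Hypothetical scores: T3 initially outranks T2, but T2 is close to the top hit T1.
query_scores = {"T1": 0.9, "T2": 0.4, "T3": 0.5}
template_sim = {
    frozenset(("T1", "T2")): 0.8,
    frozenset(("T1", "T3")): 0.1,
}

ranking, adjusted = tcp_rerank(query_scores, template_sim)
print(ranking)  # T2 moves ahead of T3 after the adjustment
```

In this toy setting the relationship between T1 and T2 lifts T2 above T3, which is the kind of correction a per-protein classifier cannot make because it never looks at how the templates relate to each other.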

Authors

  • Bin Liu
    Department of Endocrinology, the First Affiliated Hospital of Chongqing Medical University, Chongqing, China; Department of Endocrinology, Neijiang First People's Hospital, Chongqing, China.
  • Yulin Zhu
    School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong 518055, China.
  • Ke Yan
    Department of Biostatistics, Medical College of Wisconsin, Milwaukee, Wis.