Natural language processing in text mining for structural modeling of protein complexes.

Journal: BMC bioinformatics
Published Date:

Abstract

BACKGROUND: Structural modeling of protein-protein interactions produces a large number of putative configurations of protein complexes. Identifying the near-native models among them is a serious challenge. Publicly available results of biomedical research may provide constraints on the binding mode, which can be essential for docking. Our text-mining (TM) tool, which extracts binding site residues from PubMed abstracts, was successfully applied to protein docking (Badal et al., PLoS Comput Biol, 2015; 11: e1004630). Still, many of the extracted residues were not relevant to the docking.
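
To illustrate what extracting residue mentions from abstract text can look like, here is a minimal Python sketch. It is not the authors' TM tool: the pattern, the function name `extract_residues`, and the sample sentence are all hypothetical, and the sketch assumes residues are written as a three-letter amino acid code followed by a sequence position.

```python
import re

# Standard three-letter amino acid codes.
THREE_LETTER = (
    "Ala Arg Asn Asp Cys Gln Glu Gly His Ile "
    "Leu Lys Met Phe Pro Ser Thr Trp Tyr Val"
).split()

# Hypothetical pattern: matches mentions such as "Arg123", "Arg-123", "Arg 123".
RESIDUE_RE = re.compile(r"\b(" + "|".join(THREE_LETTER) + r")[- ]?(\d+)\b")


def extract_residues(text):
    """Return unique (residue name, position) pairs in order of first mention."""
    seen = []
    for name, pos in RESIDUE_RE.findall(text):
        item = (name, int(pos))
        if item not in seen:
            seen.append(item)
    return seen


# Invented sample sentence, used only to exercise the function.
sample = "Mutation of Arg123 and Tyr-45 abolished binding; Arg123 contacts Glu 7."
print(extract_residues(sample))
```

A purely pattern-based pass like this returns every residue that is mentioned, whether or not it lies at the binding interface, which is consistent with the observation above that many extracted residues were not relevant to the docking.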

Authors

  • Varsha D Badal
    Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, 66047, USA.
  • Petras J Kundrotas
    Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, 66047, USA. pkundro@ku.edu.
  • Ilya A Vakser
    Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, 66047, USA. vakser@ku.edu.