Predicting Functional Surface Topographies Combining Topological Data Analysis and Deep Learning Across the Human Protein Universe.

Journal: Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference
PMID:

Abstract

Characterizing the geometric and topological properties of protein structures, including surface pockets, interior cavities, and cross channels, is important for understanding protein function. Our knowledge of protein structures has been greatly advanced by AI-powered structure prediction tools, with AlphaFold2 (AF2) providing accurate 3D structure predictions for most protein sequences. Nonetheless, function annotations and the corresponding functional surface topographical information are still substantially lacking. We develop a method to predict functional pockets, along with their associated Gene Ontology (GO) terms and Enzyme Commission (EC) numbers, for a set of 65,013 AF2-predicted human non-singleton representative structures, which can be mapped to 186,095 "non-fragment" AF2-predicted human protein structures. Functional pockets and their respective GO terms and EC numbers are identified by combining topological data analysis with the deep learning method DeepFRI. All predicted functional pockets for these 65,013 AF2-predicted human representative structures are accessible at: https://cfold.bme.uic.edu/castpfold.

Authors

  • Bowei Ye
  • Jie Liang