Deep Learning of Protein Structure and Physicochemical Properties from Two-Dimensional Infrared Spectra.

Journal: The Journal of Physical Chemistry Letters

Abstract

Protein structure and physicochemical properties are central to stability, interactions, and biological function, yet determining them directly is difficult, particularly for dynamic and heterogeneous conformational ensembles. Two-dimensional infrared (2DIR) spectroscopy provides vibrational signatures that are highly sensitive to protein structure and dynamics; however, quantitatively relating complex 2DIR spectra to the underlying structural and physicochemical information remains a challenging inverse problem. Here, we present a data-driven computational framework for inferring protein structural representations and physicochemical properties from 2DIR spectra. We construct a data set of 631,651 computed 2DIR spectra from both static protein structures and molecular dynamics trajectories, providing a unified basis for learning "Spectrum-Structure-Property" relationships across diverse conformational states. Multiscale spectral features are extracted to predict protein Cα distance maps for three-dimensional structure reconstruction, as well as several physicochemical descriptors, including secondary-structure content, radius of gyration, hydrogen-bond counts, and buried residue fraction. The proposed framework achieves consistent performance across diverse protein systems and can be extended, with limited refinement, to independent dynamic trajectories. These results demonstrate that structural and physicochemical information can be inferred from simulated 2DIR spectra, providing a computational proof of concept for quantitative connections between vibrational spectra and protein structure and properties. Further validation on experimental 2DIR data will be required to assess practical applicability.
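
To make two of the prediction targets named above concrete, the following minimal sketch computes a Cα distance map and a radius of gyration from a set of Cα coordinates. It is an illustration only and not the authors' pipeline: the function names `ca_distance_map` and `radius_of_gyration` are hypothetical, the toy coordinates are invented, and the radius of gyration is assumed here to be unweighted and taken over Cα atoms alone (the abstract does not state which definition is used).

```python
import math


def ca_distance_map(coords):
    """Return the symmetric N x N matrix of pairwise Euclidean distances.

    `coords` is a list of (x, y, z) tuples, one per C-alpha atom.
    """
    n = len(coords)
    dmap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            dmap[i][j] = d
            dmap[j][i] = d  # the map is symmetric by construction
    return dmap


def radius_of_gyration(coords):
    """Unweighted radius of gyration: RMS distance from the centroid.

    Assumption: every atom carries equal weight (no mass weighting).
    """
    n = len(coords)
    centroid = [sum(c[k] for c in coords) / n for k in range(3)]
    mean_sq = sum(math.dist(c, centroid) ** 2 for c in coords) / n
    return math.sqrt(mean_sq)


# Hypothetical toy coordinates in angstroms: corners of a 3 x 4 rectangle.
toy_ca = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 4.0, 0.0)]

dmap = ca_distance_map(toy_ca)
rg = radius_of_gyration(toy_ca)
```

In a learning setting of the kind described, quantities like these would serve as regression targets computed from the structures, while the 2DIR spectra would serve as the model input.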