A PLM-Based Method for Predicting Protein Ion Channel Modulators for Drug Discovery and Safety Evaluation
Journal:
bioRxiv
Published Date:
Jan 1, 2025
Abstract
Ion channels are central to regulating neuronal communication, cardiac rhythm, and muscle contraction. Their modulation can induce therapeutic benefits but may also lead to adverse or toxic effects. This study presents IonNTXpred, a protein language model (PLM)-based method for predicting protein ion channel modulators, including channel-specific (Na⁺, K⁺, Ca²⁺, and others) and moonlighting proteins capable of modulating multiple ion channels. We train, test, and evaluate our models on the largest dataset of non-redundant ion channel–modulating proteins, where no two proteins have more than 40% sequence identity. Composition analysis revealed that residues Cys, Gly, and Trp are highly prevalent, whereas Ala, Glu, Leu, Gln, and Val are scarce in ion channel–modulating proteins, with Cys identified as a key discriminative residue. We explored both alignment-based (BLAST, MERCI) and alignment-free (machine learning, deep learning, and PLM-based) approaches. Among these, our evolutionary information–based PLM (ESM2-t33) achieved the best performance, with an AUROC of 0.97 across ion channels, which further improved to 0.98 when integrated with BLAST output. The proposed method outperformed existing approaches on independent datasets. We used IonNTXpred to screen FDA-approved and organismal proteins to identify candidates with ion channel–modulating potential, supporting drug repurposing, discovery of new therapeutic proteins, and safety assessment of existing biologics. We implemented these models in a user-friendly web server, IonNTXpred, which facilitates the design and discovery of ion channel–modulating protein-based drugs and supports the biosafety evaluation of therapeutic proteins through neurotoxin screening (https://webs.iiitd.edu.in/raghava/ionntxpred/).
Highlights
Prediction, design, and large-scale screening of ion channel–modulating proteins.
Identifies neurotoxins to support safety evaluation of therapeutic proteins.
Evolutionary information–based protein language model ESM2-t33 for prediction.
Predicts moonlighting proteins capable of modulating multiple ion channels.
Provides standalone software, a web server, and a PyPI package for user accessibility.
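The composition analysis mentioned in the abstract compares how often each residue occurs in ion channel–modulating proteins versus other proteins. The following is a minimal sketch of that kind of calculation, not the authors' implementation: the function names `amino_acid_composition` and `mean_composition` are hypothetical, and the example sequences are made-up toy strings rather than entries from the study's dataset.

```python
from collections import Counter

# The 20 standard amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the percentage of each standard residue in one sequence.

    Non-standard characters (e.g. 'X', gaps) are ignored.
    """
    counts = Counter(res for res in sequence.upper() if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


def mean_composition(sequences):
    """Average the per-sequence percentage composition over a set of sequences."""
    compositions = [amino_acid_composition(seq) for seq in sequences]
    n = len(compositions)
    return {aa: sum(c[aa] for c in compositions) / n for aa in AMINO_ACIDS}


if __name__ == "__main__":
    # Toy sequences for illustration only (assumed, not from the paper).
    modulators = ["CCGWCKGC", "GCCWRCGC"]
    background = ["ALEVQALE", "LLAEVQKA"]

    pos = mean_composition(modulators)
    neg = mean_composition(background)

    # Residues ranked by how much more frequent they are in the first set.
    ranked = sorted(AMINO_ACIDS, key=lambda aa: pos[aa] - neg[aa], reverse=True)
    for aa in ranked[:3]:
        print(f"{aa}: {pos[aa]:.1f}% vs {neg[aa]:.1f}%")
```

Averaging per-sequence percentages, rather than pooling all residues, keeps long proteins from dominating the comparison; either convention is possible, and the abstract does not state which one was used.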