Wfold: A new method for predicting RNA secondary structure with deep learning.
Journal:
Computers in biology and medicine
PMID:
39341115
Abstract
Accurate prediction of RNA secondary structure can help reveal the various roles that non-coding RNAs play in regulating cellular activity. However, traditional RNA secondary structure prediction methods rely mainly on thermodynamic models solved by free energy minimization, a laborious process that requires substantial prior knowledge. Here we propose Wfold, an end-to-end deep learning approach to RNA secondary structure prediction. Wfold is trained directly on annotated data and base-pairing criteria. It uses an image-like representation of RNA sequences, which an enhanced U-net combined with a transformer encoder can process effectively. By combining the self-attention mechanism's ability to capture long-range information with the U-net's ability to gather local information, Wfold improves the accuracy of RNA secondary structure prediction. We evaluate Wfold on both within-family and cross-family RNA datasets. When trained and evaluated on different RNA families, it performs comparably to traditional methods; on within-family datasets, it dramatically outperforms state-of-the-art methods. Wfold can also reliably predict pseudoknots. These findings suggest that Wfold may be useful for improving sequence alignment, functional annotation, and RNA structure modeling.
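To make the idea of an image-like representation concrete, the sketch below shows one common way to turn a sequence of length L into an L x L grid that a convolutional network such as a U-net could consume. It is not taken from the paper: the function name `pairing_map`, the allowed pair set (Watson-Crick plus G-U wobble), and the minimum loop length of 3 are assumptions chosen for illustration, and the actual Wfold encoding may differ.

```python
# Illustrative sketch only: the pairing rules and minimum loop length below
# are common conventions assumed here, not details taken from the abstract.

# Canonical Watson-Crick pairs plus the G-U wobble pair (assumed rule set).
ALLOWED_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}

MIN_LOOP = 3  # assumed minimum number of unpaired bases enclosed by a pair


def pairing_map(seq):
    """Return an L x L 0/1 matrix marking positions that are allowed to pair."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    n = len(seq)
    grid = [[0] * n for _ in range(n)]
    for i in range(n):
        # Start far enough from i to leave room for the assumed minimum loop.
        for j in range(i + MIN_LOOP + 1, n):
            if (seq[i], seq[j]) in ALLOWED_PAIRS:
                grid[i][j] = 1
                grid[j][i] = 1  # pairing is symmetric
    return grid


if __name__ == "__main__":
    # Print the grid row by row for a short hairpin-like sequence.
    for row in pairing_map("GGGAAAUCC"):
        print("".join(str(v) for v in row))
```

A grid like this encodes the base-pairing constraints as a single-channel "image"; a model would then predict, for each cell, whether that candidate pair is present in the true structure. Because any two cells can be predicted independently of nesting, a matrix output of this kind can in principle also express pseudoknots.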