The Helitron family classification using SVM based on Fourier transform features applied on an unbalanced dataset.

Journal: Medical & biological engineering & computing
Published Date:

Abstract

Helitrons are mobile sequences that belong to class 2 of eukaryotic transposons. Their specificity lies in their transposition mechanism: the rolling-circle mechanism. They play an important role in remodeling proteomes because they can modify existing genes and introduce new ones. A major difficulty in identifying and classifying Helitron families comes from their complex structure, their unspecified length, and the unbalanced number of occurrences of each Helitron type. Helitron recognition remains an unsolved problem in the literature. The purpose of this paper is to characterize and classify Helitron types using spectral features and the support vector machine (SVM) classification technique. To this end, the helitronic DNA is transformed into numerical form using the FCGS coding technique. Then, a set of spectral features is extracted from the smoothed Fourier transform applied to the FCGS signals. Based on the spectral signatures and the classification confusion matrix, we demonstrate that some specific classes that show no similarities, such as HelitronY2 and NDNAX3, are easily discriminated, with high accuracy rates exceeding 90%. However, some Helitron types show great similarities, namely Helitron1, HelitronY1, HelitronY1A, and HelitronY4. Our system is also able to predict them, with promising rates reaching 70%.

Graphical abstract: The Helitron recognizer based on features extracted from the smoothed Fourier transform.
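The feature-extraction chain described in the abstract (numerical coding of the DNA, Fourier transform, smoothing, spectral features) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the coding function is a simplified FCGS-like stand-in that replaces each position by the relative frequency of the k-mer starting there, the smoothing is a plain moving average, and the three features (peak bin, peak value, spectral centroid) are hypothetical choices. The SVM stage is omitted; the resulting feature values would be passed to any SVM implementation.

```python
import cmath
from collections import Counter


def fcgs_like_signal(seq, k=2):
    """Simplified FCGS-like coding (assumption): each position is replaced by
    the relative frequency of the k-mer that starts at that position."""
    seq = seq.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    total = len(kmers)
    return [counts[km] / total for km in kmers]


def magnitude_spectrum(signal):
    """Naive DFT magnitude of the mean-removed signal (first half of the bins)."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    spectrum = []
    for f in range(n // 2):
        acc = sum(centered[t] * cmath.exp(-2j * cmath.pi * f * t / n)
                  for t in range(n))
        spectrum.append(abs(acc))
    return spectrum


def smooth(values, width=3):
    """Centered moving average; windows are truncated at the edges."""
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def spectral_features(seq, k=2, width=3):
    """Hypothetical feature set taken from the smoothed magnitude spectrum."""
    spec = smooth(magnitude_spectrum(fcgs_like_signal(seq, k)), width)
    total = sum(spec)
    peak_bin = max(range(len(spec)), key=spec.__getitem__)
    centroid = sum(i * v for i, v in enumerate(spec)) / total if total else 0.0
    return {
        "peak_bin": peak_bin,
        "peak_value": spec[peak_bin],
        "centroid": centroid,
        "n_bins": len(spec),
    }


if __name__ == "__main__":
    # Toy periodic sequence, not real Helitron data.
    print(spectral_features("ACGT" * 8))
```

The naive DFT is quadratic in the sequence length, which is acceptable for a toy example; for real sequences an FFT routine would be the natural replacement.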

Authors

  • Rabeb Touati
    LR99ES10 Human Genetics Laboratory, Faculty of Medicine of Tunis (FMT), University of Tunis El Manar, Tunis, Tunisia. Rabeb.touati@enit.utm.tn.
  • Afef Elloumi Oueslati
    SITI Laboratory, National School of Engineers of Tunis (ENIT), University Tunis El Manar, BP 37, le Belvédère, 1002, Tunis, Tunisia.
  • Imen Messaoudi
    SITI Laboratory, National School of Engineers of Tunis (ENIT), University Tunis El Manar, BP 37, le Belvédère, 1002, Tunis, Tunisia.
  • Zied Lachiri
    SITI Laboratory, National School of Engineers of Tunis (ENIT), University Tunis El Manar, BP 37, le Belvédère, 1002, Tunis, Tunisia.