Chimeric mis-annotations of genes remain pervasive in eukaryotic non-model organisms.

Journal: BMC Genomics
Published Date:

Abstract

BACKGROUND: Accurate annotation of protein-coding genes is critical for genome analysis in non-model organisms. However, limited RNA-Seq data and incomplete protein resources can lead to errors, including chimeric gene mis-annotations, in which two or more adjacent genes are incorrectly fused into a single gene model. These errors often persist because of annotation inertia, in which mistakes are propagated and amplified through data sharing and reanalysis, leading to cases where the mis-annotated model becomes favoured over the correct one. This complicates almost all downstream genomic analyses, such as gene expression studies and comparative genomics.

Authors

  • Andreas Bachler
    CSIRO, Black Mountain Laboratories, Clunies Ross Street, Canberra, ACT, 2601, Australia. Andy.Bachler@csiro.au.
  • Thomas K Walsh
    CSIRO, Black Mountain Laboratories, Clunies Ross Street, Canberra, ACT, 2601, Australia.
  • Rahul V Rane
    CSIRO, 351 Royal Parade, Parkville, VIC, 3052, Australia.
  • Gunjan Pandey
    CSIRO, Black Mountain Laboratories, Clunies Ross Street, Canberra, ACT, 2601, Australia.
