Delivering artificial intelligence-ready genomics with the Maize Genetics and Genomics Database.

Journal: Genetics

Abstract

The integration of artificial intelligence (AI) into computational biology is changing biological research, particularly in agriculture, where large and complex datasets offer opportunities for discovery and crop improvement. Maize (Zea mays L.), a globally critical crop with extensive genomic, genetic, proteomic, and functional resources, stands to benefit from AI integration. The Maize Genetics and Genomics Database (MaizeGDB) is building an AI-ready infrastructure by standardizing datasets, precomputing complex features, developing new interactive tools, and providing reproducible workflows. This paper details MaizeGDB's strategic initiatives to create a foundation of AI-ready data in standardized formats and to generate precomputed embeddings from recent DNA and protein language models. We introduce new functionalities, including zero-shot variant effect scoring derived from protein and DNA language models and genome browser tracks that visualize nucleotide conservation, an indicator of potential functional significance. Furthermore, we provide custom dataset assembly resources and reproducible workflows via GitHub. By organizing maize data and making it accessible, MaizeGDB enables the maize research and breeding community to use AI to accelerate the discovery of gene function, the interpretation of variants, and the development of improved maize varieties.

Authors

  • Olivia C Haley
    USDA Agricultural Research Service, Corn Insects and Crop Genetics Research Unit, 819 Wallace Rd., Ames, Iowa, United States 50011.
  • Laura E Tibbs-Cortes
    USDA-ARS, Wheat Health, Genetics, and Quality Research Unit, Pullman, WA 99164, USA; USDA-ARS, Corn Insects and Crop Genetics Research Unit, Ames, IA 50011, USA.
  • Stephen F Harding
    USDA Agricultural Research Service, Corn Insects and Crop Genetics Research Unit, 819 Wallace Rd., Ames, Iowa, United States 50011.
  • Elly Poretsky
    Crop Improvement and Genetics Research Unit, United States Department of Agriculture-Agricultural Research Service, Western Regional Research Center, 800 Buchanan St, Albany, CA 94710, United States.
  • Ethalinda K Cannon
    USDA Agricultural Research Service, Corn Insects and Crop Genetics Research Unit, 819 Wallace Rd., Ames, Iowa, United States 50011.
  • John L Portwood
    USDA Agricultural Research Service, Corn Insects and Crop Genetics Research Unit, 819 Wallace Rd., Ames, Iowa, United States 50011.
  • Jack M Gardiner
    Division of Animal Sciences, University of Missouri, S134D Animal Science Research Center, 920 East Campus Drive, Columbia, MO, 65211, USA.
  • Taner Z Sen
    Crop Improvement and Genetics Research Unit, United States Department of Agriculture-Agricultural Research Service, Western Regional Research Center, 800 Buchanan St, Albany, CA 94710, United States.
  • Hye-Seon Kim
    USDA Agricultural Research Service, National Center for Agricultural Utilization Research, Mycotoxin Prevention and Applied Microbiology Research Unit, 1815 N University St., Peoria, Illinois, United States 61604.
  • Margaret R Woodhouse
    USDA Agricultural Research Service, Corn Insects and Crop Genetics Research Unit, 819 Wallace Rd., Ames, Iowa, United States 50011.
  • Carson M Andorf
    Agricultural Research Service, United States Department of Agriculture, Ames, Iowa, United States of America.
