Morphological traits and machine learning for genetic lineage prediction of two reef-building corals.
Journal:
PloS one
Published Date:
Jun 18, 2025
Abstract
Integrating multiple lines of evidence in support of molecular taxonomic analysis has proven to be a robust method for species delimitation in scleractinian corals. However, morphology often conflicts with genetic approaches because of high phenotypic plasticity and convergence. Understanding morphological variation among species is crucial to studying coral distribution, life history, ecology, and evolution. Here, we present an application of Random Forest models for coral species identification based on morphological annotation of the corallum and corallites. We show that the integration of molecular and morphological trait analysis can be improved using machine learning. Morphological traits were documented for Porites and Pocillopora coral species that were collected for the Tara Pacific Expedition and genotyped through genome-wide, genetic hierarchical clustering, and coalescence analyses. While Porites included only three tentative species, most Pocillopora species were represented by specimens from the western Indian Ocean, tropical Southwestern Pacific, and southeast Polynesia. Two Random Forest models per genus were trained on the morphological annotations using the genetic lineage labels. One model was developed for in-situ image identification and used corallum traits measured from in-situ photographs. The other model, for integrative species identification, combined corallum data with corallite data measured on scanning electron micrographs. Random Forest models outperformed traditional dimension reduction methods such as principal component analysis (PCA) and factor analysis of mixed data (FAMD) followed by k-means and hierarchical clustering, assigning specimens to the correct genetic lineage even where morphological clusters overlapped. This machine learning approach is reproducible, cost-effective, and accessible, reducing the need for taxonomic expertise. It can complement molecular and phylogenetic studies and support image identification, highlighting its potential to advance an integrative taxonomy workflow for corals.
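
To illustrate the general idea of training a Random Forest on morphological traits with genetic lineage labels, the following is a minimal, dependency-free sketch. It is not the authors' implementation: the trait names, lineage names, and trait values are hypothetical and randomly generated, and the forest is a deliberately small from-scratch version (bootstrap sampling, random feature subsets, Gini-based splits, majority vote). In practice an established library implementation would normally be used instead.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def partition(rows, labels, feature, threshold):
    """Split specimens into (left, right) lists of (row, label) pairs."""
    left, right = [], []
    for row, label in zip(rows, labels):
        (left if row[feature] <= threshold else right).append((row, label))
    return left, right


def best_split(rows, labels, n_features, rng):
    """Best (feature, threshold) over a random feature subset, or None."""
    best, best_score = None, gini(labels)
    for f in rng.sample(range(len(rows[0])), n_features):
        for t in sorted({row[f] for row in rows}):
            left, right = partition(rows, labels, f, t)
            if not left or not right:
                continue
            score = (len(left) * gini([y for _, y in left])
                     + len(right) * gini([y for _, y in right])) / len(labels)
            if score < best_score:  # keep only splits that reduce impurity
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, n_features, rng, depth=0, max_depth=5):
    """Recursively grow one tree; leaves store the majority lineage."""
    split = best_split(rows, labels, n_features, rng) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    children = []
    for part in partition(rows, labels, f, t):
        children.append(build_tree([r for r, _ in part], [y for _, y in part],
                                   n_features, rng, depth + 1, max_depth))
    return (f, t, children[0], children[1])


def predict_tree(node, row):
    """Walk from the root to a leaf and return its lineage label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train each tree on a bootstrap resample of the specimens."""
    rng = random.Random(seed)
    n_features = max(1, round(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], n_features, rng))
    return forest


def predict_forest(forest, row):
    """Majority vote across all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]


def make_specimens(n_per_lineage, rng):
    """Synthetic data only. Hypothetical trait means per hypothetical lineage:
    (branch_width_mm, corallite_diameter_mm, septa_length_mm)."""
    means = {"lineage_A": (4.0, 0.8, 0.30),
             "lineage_B": (6.0, 1.0, 0.45),
             "lineage_C": (5.0, 1.4, 0.25)}
    rows, labels = [], []
    for lineage, mu in means.items():
        for _ in range(n_per_lineage):
            rows.append([rng.gauss(m, 0.10 * m) for m in mu])
            labels.append(lineage)
    return rows, labels


rng = random.Random(42)
train_rows, train_labels = make_specimens(30, rng)
test_rows, test_labels = make_specimens(20, rng)
forest = fit_forest(train_rows, train_labels)
correct = sum(predict_forest(forest, r) == y for r, y in zip(test_rows, test_labels))
accuracy = correct / len(test_labels)
print(f"held-out accuracy on synthetic specimens: {accuracy:.2f}")
```

In this toy data the lineages overlap on any single trait, but the trees combine thresholds across several traits. This mirrors, in a simplified way, why a supervised classifier trained on lineage labels can succeed where unsupervised clustering of the same measurements produces overlapping groups.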