Integrating deep learning, biological hierarchies, and high-resolution imagery to create a new identification tool for cryptic coral reef fishes.

Journal: PloS one
Published Date:

Abstract

Life on Earth has evolved into a staggering diversity of species, most of which remain undiscovered, unrecognized, or unmonitored. As our ocean's richest biodiversity hotspot, coral reefs harbor more than one third of marine biodiversity, but many reef species are small and cryptic and therefore difficult to identify and study. Among these, tiny bottom-dwelling ('cryptobenthic') fishes have been highlighted as a highly diverse (>3,000 species), understudied, and ecologically important group. However, the classification and monitoring of these fishes depend almost exclusively on the expertise of a few scientists, which has limited what is known about their taxonomy, distribution, and population trends. Deep learning-driven image classification, known for its ability to learn complex patterns in visual data, is an ideal candidate for automating taxonomic image classification and thereby broadening participation in ecological monitoring and biodiversity science. We developed CryptoVision, a new taxonomy-aware convolutional neural network with three output heads that explicitly considers taxonomic hierarchies (family, genus, species) and their biological constraints. Built on ResNet50v2 and enhanced with Squeeze-and-Excitation modules, CryptoVision employs a custom taxonomy-focal cross-entropy loss, and we compared four hierarchical fusion strategies (standard, concatenation, gating, attention) to assess the algorithm's performance. Trained on a unique dataset of ~7,600 laboratory-standard and ~18,800 web-sourced images covering 113 species of small reef fishes, our tool highlights the power of integrating deep learning with innovative, taxonomically informed design and high-resolution imagery. Indeed, CryptoVision achieved a ~25% improvement across all metrics when lab-standard imagery was incorporated. Among the fusion variants, the gating approach delivered the best calibration (expected calibration error ≈ 0.01) and 90.5% average precision. Finally, guided saliency map analyses of species in the dwarfgoby genus Eviota illustrate that model attention can align with expert-defined morphological traits that represent critical features for species delimitation. Our results demonstrate that taxonomy-aware, multi-output deep learning on curated imagery provides a robust, interpretable framework for scalable biodiversity monitoring, ecological research, and streamlined taxonomic workflows, one that is particularly well suited to the many taxa that are typically understudied due to their small size, cryptic nature, or ambiguous taxonomy.
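
The calibration figure quoted above (expected calibration error ≈ 0.01) can be made concrete with a short sketch. The code below is not the authors' implementation. It is a minimal, standard-library illustration of the commonly used binned definition of expected calibration error, assuming equal-width confidence bins and top-1 predictions. The function name `expected_calibration_error`, the default of 10 bins, and the toy inputs are all assumptions made for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned expected calibration error (ECE) for top-1 predictions.

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct:     truthy/falsy flags, one per prediction.
    n_bins:      number of equal-width confidence bins (an assumed default).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    # Per bin: [number of predictions, sum of confidences, number correct].
    bins = [[0, 0.0, 0] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidences must lie in [0, 1]")
        # Bin k covers [k/n_bins, (k+1)/n_bins); a confidence of exactly 1.0
        # is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx][0] += 1
        bins[idx][1] += conf
        bins[idx][2] += 1 if hit else 0

    # ECE = sum over non-empty bins of (bin share) * |accuracy - mean confidence|.
    ece = 0.0
    for count, conf_sum, hits in bins:
        if count == 0:
            continue
        accuracy = hits / count
        mean_confidence = conf_sum / count
        ece += (count / n) * abs(accuracy - mean_confidence)
    return ece


# Toy example (made-up numbers, not data from the study).
toy_confidences = [0.55, 0.55, 0.95, 0.95]
toy_correct = [True, False, True, True]
toy_ece = expected_calibration_error(toy_confidences, toy_correct)
```

Under this definition, a value near 0.01 means that, averaged over bins and weighted by bin size, stated confidence and observed accuracy differ by about one percentage point. In a three-head model such as the one described, the metric would presumably be computed per taxonomic level (family, genus, species), though the abstract does not specify this.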
