CARIBOU: Computational AI Research Interface for Bioinformatics, Omics, and Unifying Agents

Journal: bioRxiv
Published Date:

Abstract

Single-cell and spatial omics technologies are generating biological datasets at a scale that increasingly exceeds the capacity of expert manual analysis. Although large language models (LLMs) can generate bioinformatics code, most existing systems remain limited by stateless execution, poor reproducibility, restricted deployment environments, and an inability to adapt analyses in response to intermediate results. Here, we present CARIBOU (Computational AI Research Interface for Bioinformatics, Omics, and Unifying Agents), a multi-agent AI framework designed for autonomous bioinformatics analysis within institutional high-performance computing (HPC) environments. CARIBOU organizes specialized AI agents through researcher-editable blueprints that encode analytical roles, workflow guidance, and domain-specific reasoning while grounding all analyses in persistent executable computational environments compatible with Singularity/Apptainer-based HPC systems. Unlike static code-generation approaches, CARIBOU maintains a shared evolving analytical state across workflow stages, enabling iterative execute-observe-correct behavior during quality control, batch integration, clustering, and cell-type annotation. We evaluate CARIBOU across unit-task benchmarks, metadata reconstruction challenges, and end-to-end single-cell RNA-seq analyses using Allen Brain Atlas hippocampus and Tabula Sapiens datasets. Across these tasks, iterative execution consistently outperformed one-shot code generation, while the framework demonstrated adaptive recovery from execution failures and compatibility with security-constrained research computing infrastructure.
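The execute-observe-correct behavior over a shared, persistent analytical state can be illustrated with a minimal sketch. This is not CARIBOU's implementation: the function names (`run_step`, `execute_observe_correct`, `toy_corrector`) are hypothetical, a plain Python dictionary stands in for the persistent computational environment, and a hard-coded rule stands in for the LLM agent that would propose a correction.

```python
import traceback
from typing import Callable, List, Optional, Tuple


def run_step(code: str, state: dict) -> Optional[str]:
    """Execute code against the shared state; return error text, or None on success."""
    try:
        exec(code, state)  # variables defined here persist in `state`
        return None
    except Exception:
        return traceback.format_exc(limit=1)


def execute_observe_correct(
    code: str,
    state: dict,
    correct: Callable[[str, str], str],
    max_attempts: int = 3,
) -> Tuple[bool, List[Tuple[str, Optional[str]]]]:
    """Run code, observe any failure, and retry with a corrected version."""
    history: List[Tuple[str, Optional[str]]] = []
    for _ in range(max_attempts):
        error = run_step(code, state)
        history.append((code, error))
        if error is None:
            return True, history
        code = correct(code, error)  # assumption: an agent would do this step
    return False, history


def toy_corrector(code: str, error: str) -> str:
    """Hypothetical stand-in for an LLM agent: patch a missing variable."""
    if "NameError" in error:
        return "counts = [3, 0, 5, 1]\n" + code
    return code


state: dict = {}  # shared across all workflow stages

# Stage 1 fails on the first attempt (counts is undefined), then recovers.
ok, history = execute_observe_correct(
    "kept = [c for c in counts if c > 0]", state, toy_corrector
)

# Stage 2 reuses the result of stage 1 from the same persistent state.
ok2, _ = execute_observe_correct("n_kept = len(kept)", state, toy_corrector)
```

In a one-shot setting the first stage would simply fail; here the error text is fed back into a correction step, and later stages build on whatever earlier stages left in the shared state.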

Authors

  • Riffle, D.
  • Shirooni, N.
  • Sureshkumar, P.
  • Vijay, V.
  • Rose, M. F.

Categories