AstroLLaVA: towards the unification of astronomical data and natural language

Journal: arXiv

Published Date: Apr 11, 2025

Abstract

We present AstroLLaVA, a vision language model for astronomy that enables interaction with astronomical imagery through natural dialogue. By fine-tuning the LLaVA model on a diverse dataset of $\sim$30k images with captions and question-answer pairs sourced from NASA's `Astronomy Picture of the Day', the European Southern Observatory, and the NASA/ESA Hubble Space Telescope, we create a model capable of answering open-ended questions about astronomical concepts depicted visually. Our two-stage fine-tuning process adapts the model to both image captioning and visual question answering in the astronomy domain. We demonstrate AstroLLaVA's performance on an astronomical visual question answering benchmark and release the model weights, code, and training set to encourage further open source work in this space. Finally, we suggest a roadmap towards general astronomical data alignment with pre-trained language models, and provide an open space for collaboration towards this end for interested researchers.

Authors

Sharaf Zaman
Michael J. Smith
Pranav Khetarpal
Rishabh Chakrabarty
Michele Ginolfi
Marc Huertas-Company
Maja Jabłońska
Sandor Kruk
Matthieu Le Lain
Sergio José Rodríguez Méndez
Dimitrios Tanoglidis

External Resources

View on arXiv arXiv (http://arxiv.org/abs/2504.08583v1)

AstroLLaVA: towards the unification of astronomical data and natural language

Abstract

Authors

Categories

External Resources

Popular Topics

Recent Journals

AstroLLaVA: towards the unification of astronomical data and natural language

Abstract

Authors

Categories

External Resources

Stay Ahead of Medical AI

Popular Topics

Recent Journals