Baichuan-Omni-1.5 Technical Report

Journal: arXiv
Published Date:

Abstract

We introduce Baichuan-Omni-1.5, an omni-modal model that combines omni-modal understanding with end-to-end audio generation. To achieve fluent, high-quality interaction across modalities without compromising the capabilities of any single modality, we prioritize three key aspects. First, we establish a comprehensive data cleaning and synthesis pipeline for multimodal data, obtaining about 500B of high-quality data (text, audio, and vision). Second, we design an audio tokenizer (Baichuan-Audio-Tokenizer) that captures both semantic and acoustic information from audio, enabling seamless integration and enhanced compatibility with the MLLM. Third, we design a multi-stage training strategy that progressively integrates multimodal alignment and multitask fine-tuning, ensuring effective synergy across all modalities. Baichuan-Omni-1.5 leads contemporary models (including GPT-4o-mini and MiniCPM-o 2.6) in terms of comprehensive omni-modal capabilities. Notably, it achieves results comparable to leading models such as Qwen2-VL-72B across various multimodal medical benchmarks.

Authors

  • Yadong Li
  • Jun Liu
  • Tao Zhang
  • Tao Zhang
  • Song Chen
  • Tianpeng Li
  • Zehuan Li
  • Lijun Liu
  • Lingfeng Ming
  • Guosheng Dong
  • Da Pan
  • Chong Li
  • Yuanbo Fang
  • Dongdong Kuang
  • Mingrui Wang
  • Chenglin Zhu
  • Youwei Zhang
  • Hongyu Guo
  • Fengyu Zhang
  • Yuran Wang
  • Bowen Ding
  • Wei Song
  • Xu Li
  • Yuqi Huo
  • Zheng Liang
  • Shusen Zhang
  • Xin Wu
  • Shuai Zhao
  • Linchu Xiong
  • Yozhen Wu
  • Jiahui Ye
  • Wenhao Lu
  • Bowen Li
  • Yan Zhang
  • Yaqi Zhou
  • Xin Chen
  • Lei Su
  • Hongda Zhang
  • Fuzhong Chen
  • Xuezhen Dong
  • Na Nie
  • Zhiying Wu
  • Bin Xiao
  • Ting Li
  • Shunya Dang
  • Ping Zhang
  • Yijia Sun
  • Jincheng Wu
  • Jinjie Yang
  • Xionghai Lin
  • Zhi Ma
  • Kegeng Wu
  • Jia Li
  • Aiyuan Yang
  • Hui Liu
  • Jianqiang Zhang
  • Xiaoxi Chen
  • Guangwei Ai
  • Wentao Zhang
  • Yicong Chen
  • Xiaoqin Huang
  • Kun Li
  • Wenjing Luo
  • Yifei Duan
  • Lingling Zhu
  • Ran Xiao
  • Zhe Su
  • Jiani Pu
  • Dian Wang
  • Xu Jia
  • Tianyu Zhang
  • Mengyu Ai
  • Mang Wang
  • Yujing Qiao
  • Lei Zhang
  • Yanjun Shen
  • Fan Yang
  • Miao Zhen
  • Yijie Zhou
  • Mingyang Chen
  • Fei Li
  • Chenzheng Zhu
  • Keer Lu
  • Yaqi Zhao
  • Hao Liang
  • Youquan Li
  • Yanzhao Qin
  • Linzhuang Sun
  • Jianhua Xu
  • Haoze Sun
  • Mingan Lin
  • Zenan Zhou
  • Weipeng Chen