Large language model-based biological age prediction in large-scale populations.

Journal: Nature Medicine
Published Date:

Abstract

Accurate and convenient assessment of individual aging is crucial for identifying health risks and preventing aging-related diseases. Nonetheless, current aging proxies often face challenges such as methodological limitations, weak associations with adverse outcomes and limited generalizability. Here we propose a framework that leverages large language models (LLMs) to estimate individual overall and organ-specific aging using only health examination reports. We validated this approach across six population-based cohorts, encompassing over 10 million participants, and demonstrated its effectiveness and reliability. Our results showed that the LLM-predicted overall age achieved a concordance index (C-index) of 0.757 (95% CI 0.752-0.761) for all-cause mortality, significantly outperforming other aging proxies such as telomere length, frailty index, eight epigenetic ages and the predictions of four machine-learning models. The overall age gap was strongly associated with multiple aging-related phenotypes and health outcomes, showing a hazard ratio of 1.055 (95% CI 1.050-1.060) for all-cause mortality. For organ-specific aging, LLM-predicted ages and age gaps also demonstrated superior performance in predicting corresponding organ-specific diseases compared to machine-learning models. Additionally, we examined the dynamic aging assessment capability of LLMs and applied age gaps to identify proteomic biomarkers associated with accelerated aging and to develop risk prediction models for 270 diseases. Interpretability analyses were also conducted to explore the decision-making process of LLMs. In conclusion, our LLM-based aging assessment framework offers a precise, reliable and cost-effective approach for estimating overall and organ-specific aging. It has potential for personalized aging assessment and health management in large-scale general populations.
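
For readers unfamiliar with the headline metric, the following is a minimal sketch of how a concordance index can be computed for a predicted age against time-to-death data. It is not the authors' code. The function names, the toy data and the definition of the age gap as predicted age minus chronological age are all assumptions made for illustration, since the abstract does not spell them out. The sketch uses Harrell's pairwise definition and skips pairs with tied follow-up times for simplicity.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    A pair is comparable when the subject with the shorter follow-up time
    actually had the event. The pair is concordant when that subject also
    has the higher risk score (here: the higher predicted age).
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore pairs with tied times
        # a is the subject with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # earlier subject was censored: pair not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def age_gap(predicted_age, chronological_age):
    """Assumed definition: positive values suggest accelerated aging."""
    return predicted_age - chronological_age


# Hypothetical toy cohort: follow-up years, death indicator, predicted age.
times = [2.0, 5.0, 7.0, 9.0]
events = [1, 1, 0, 1]
predicted = [70, 58, 60, 61]
print(concordance_index(times, events, predicted))
print(age_gap(70, 64))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported 0.757.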

Authors

  • Yanjun Li
    NSF Center for Big Learning, University of Florida, Gainesville, FL.
  • Qi Huang
    State Key Laboratory of Agricultural Microbiology, College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, China.
  • Jin Jiang
    Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China.
  • Xusheng Du
    School of Information Science and Engineering, Xinjiang University, Urumqi, China.
  • Wenxin Xiang
    School of Medicine, Tsinghua University, Beijing, China.
  • Shiqi Zhang
    Department of Biostatistics, University of Copenhagen, Copenhagen, Denmark.
  • Zean Pan
    School of Economics and Management, Harbin Institute of Technology, Harbin, China.
  • Liyuan Zhao
    Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China.
  • Yuyan Cui
    School of Statistics and Mathematics, Central University of Finance and Economics, Beijing, China.
  • Limei Ke
    School of Medicine, Tsinghua University, Beijing, China.
  • Bo Yin
    College of Chemistry and Chemical Engineering, Lanzhou University, Lanzhou, 730000 People's Republic of China.
  • Linfeng Liu
    State Key Laboratory of Industrial Control Technology, Institute of Cyber Systems and Control, Zhejiang University, Hangzhou 310027, Zhejiang, China. liulinfengzju@zju.edu.cn.
  • Guoqing Feng
    Department of Computer Science and Technology, Xinzhou Teachers University, Xinzhou 034000, China.
  • Shouyi Yan
    Vanke School of Public Health, Tsinghua University, Beijing, China.
  • Liangcai Gao
    Wangxuan Institute of Computer Technology, Peking University, Beijing, China.
  • Yang Liu
    Department of Computer Science, Hong Kong Baptist University, Hong Kong, China.
  • Yujuan Yuan
    Department of Cardiology, People's Hospital of Xinjiang Uyghur Autonomous Region, Urumqi, China.
  • Yanying Guo
    Dianchi Lake Ecosystem Observation and Research Station of Yunnan Province, Kunming Dianchi & Plateau Lakes Institute, Kunming, 650228, China. Electronic address: 442570806@qq.com.
  • Yuqing Yang
    State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China.
  • Weizhi Ma
    Institute for AI Industry Research, Tsinghua University, Beijing, China.
  • Yining Yang
    School of Life Sciences, Tsinghua University, Beijing 100084, China.
  • Qian Di
    Vanke School of Public Health, Tsinghua University, Beijing, China. qiandi@tsinghua.edu.cn.
