Imputing single-cell protein abundance in multiplex tissue imaging.

Journal: Nature Communications
Published Date:

Abstract

Multiplex tissue imaging enables single-cell spatial proteomics and transcriptomics but remains limited by incomplete molecular profiling, tissue loss, and probe failure. Here, we apply machine learning to impute single-cell protein abundance using multiplex tissue imaging data from a breast cancer cohort. We evaluate regularized linear regression, gradient-boosted trees, and deep learning autoencoders, incorporating spatial context to enhance imputation accuracy. Our models achieve mean absolute errors between 0.05 and 0.3 on a [0,1] scale, closely approximating ground truth values. Using the imputed data, we classify single cells as pre- or post-treatment, demonstrating the biological relevance of the imputed values. These findings establish the feasibility of imputing missing protein abundance, highlight the advantages of spatial information, and support machine learning as a powerful tool for improving single-cell tissue imaging.
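
To make the evaluation setup concrete, the following is a minimal sketch of the general idea rather than the authors' pipeline: one marker is held out, predicted from the remaining markers with a ridge (L2-regularized) linear regression, and scored by mean absolute error on a [0,1] scale. The data are synthetic, and the function names, marker count, and regularization strength `alpha` are all assumptions chosen for illustration. Spatial context, gradient-boosted trees, and autoencoders are not modeled here.

```python
import numpy as np


def min_max_scale(x):
    """Scale each column (marker) to the [0, 1] range."""
    lo = x.min(axis=0)
    span = x.max(axis=0) - lo
    span = np.where(span == 0, 1.0, span)  # avoid division by zero
    return (x - lo) / span


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    A = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    w = np.linalg.solve(A, Xc.T @ (y - y_mean))
    b = y_mean - x_mean @ w
    return w, b


def mean_absolute_error(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))


def demo(seed=0, n_cells=2000, n_markers=6):
    rng = np.random.default_rng(seed)
    # Synthetic stand-in for a cells-by-markers intensity table (an assumption,
    # not real imaging data): markers share two latent factors plus noise.
    latent = rng.normal(size=(n_cells, 2))
    loadings = rng.uniform(0.5, 1.5, size=(2, n_markers))
    raw = latent @ loadings + 0.1 * rng.normal(size=(n_cells, n_markers))
    # Scaling uses the full table for brevity; a real study would fit the
    # scaler on training cells only.
    data = min_max_scale(raw)

    target = 0  # index of the marker treated as missing
    X = np.delete(data, target, axis=1)
    y = data[:, target]

    split = int(0.8 * n_cells)  # simple train/test split over cells
    w, b = fit_ridge(X[:split], y[:split])
    pred = np.clip(X[split:] @ w + b, 0.0, 1.0)

    mae = mean_absolute_error(y[split:], pred)
    # Baseline: always predict the training mean of the held-out marker.
    baseline = mean_absolute_error(
        y[split:], np.full(n_cells - split, y[:split].mean())
    )
    return mae, baseline


if __name__ == "__main__":
    mae, baseline = demo()
    print(f"ridge MAE: {mae:.3f}  mean-baseline MAE: {baseline:.3f}")
```

Comparing the model's error against a mean-value baseline gives a reference point for reading a mean absolute error on a [0,1] scale, since the raw number alone does not show how much the other markers contribute.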

Authors

  • Raphael Kirchgaessner
    Biomedical Engineering, Oregon Health and Science University, 3181 S.W. Sam Jackson Park Road, Portland, OR, 97239-3098, USA.
  • Cameron Watson
    Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA.
  • Allison Creason
    Department of Biomedical Engineering, Oregon Health & Science University, Portland, Oregon, United States of America.
  • Kaya Keutler
    Department of Chemical Physiology and Biochemistry, Oregon Health & Science University, Portland, OR, USA.
  • Jeremy Goecks
    Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA. Electronic address: goecksj@ohsu.edu.