Heterogeneous Domain Adaptation for IHC Classification of Breast Cancer Subtypes.

Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

Published Date: Oct 24, 2018

Abstract

Increasingly, multiple parallel omics datasets are collected from biological samples. Integrating these datasets for classification is an open area of research. Additionally, whilst multiple datasets may be available for the training samples, future samples may only be measured by a single technology, requiring methods that do not rely on the presence of all datasets for sample prediction. Heterogeneous domain adaptation addresses this by mapping the protein and gene data into a latent common space, which enables us to directly compare the protein and the gene profiles. New samples with just one set of measurements (e.g., just protein) can then be mapped to this latent common space, where classification is performed. Using this approach, we achieved an improvement of up to 12 percent in accuracy when classifying samples based on their protein measurements compared with baseline methods which were trained on the protein data alone. We illustrate that the additional inclusion of the gene expression or protein expression in the training process enabled the separation between the classes to become clearer.
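
The idea of a shared latent space can be illustrated with a minimal sketch. It is not the authors' method: classical canonical correlation analysis (CCA) stands in for heterogeneous domain adaptation, a nearest-centroid rule stands in for the classifier, and the data are synthetic. All names (fit_cca, P, G, Mp, Mg) and all sizes are assumptions made for illustration only.

```python
import numpy as np

def fit_cca(X, Y, k, reg=1e-3):
    """Classical CCA via whitening + SVD (a stand-in, not the paper's HDA)."""
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    # Regularised covariance matrices keep the inverse square roots stable.
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(C):
        w, V = np.linalg.eigh(C)  # C is symmetric positive definite
        return V @ np.diag(w ** -0.5) @ V.T

    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    U, _, Vt = np.linalg.svd(Wx @ Cxy @ Wy)
    # Columns of A and B project each view onto the top-k canonical directions.
    return mx, Wx @ U[:, :k], my, Wy @ Vt.T[:, :k]

# Synthetic paired "protein" (P) and "gene" (G) profiles driven by a shared
# hidden state Z whose mean depends on the class label (purely hypothetical).
rng = np.random.default_rng(0)
n, d_lat, d_prot, d_gene = 200, 2, 5, 8
y = rng.integers(0, 2, size=n)
Z = rng.normal(size=(n, d_lat)) + 3.0 * y[:, None]
Mp = rng.normal(size=(d_lat, d_prot))
Mg = rng.normal(size=(d_lat, d_gene))
P = Z @ Mp + 0.3 * rng.normal(size=(n, d_prot))
G = Z @ Mg + 0.3 * rng.normal(size=(n, d_gene))

# Training samples have both measurements; test samples have protein only.
P_tr, G_tr, y_tr = P[:150], G[:150], y[:150]
P_te, y_te = P[150:], y[150:]

mp, A, mg, B = fit_cca(P_tr, G_tr, k=d_lat)
Lp = (P_tr - mp) @ A  # protein profiles in the common space
Lg = (G_tr - mg) @ B  # gene profiles in the same space

# Class centroids are estimated from both views of the training samples.
classes = np.unique(y_tr)
L_all = np.vstack([Lp, Lg])
y_all = np.concatenate([y_tr, y_tr])
centroids = np.stack([L_all[y_all == c].mean(axis=0) for c in classes])

# Protein-only test samples are mapped into the common space and classified.
latent_test = (P_te - mp) @ A
dist = np.linalg.norm(latent_test[:, None, :] - centroids[None, :, :], axis=2)
pred = classes[dist.argmin(axis=1)]
accuracy = float((pred == y_te).mean())
print(f"protein-only accuracy in latent space: {accuracy:.2f}")
```

The point of the sketch is only the data flow: both measurement types are used to learn the projections and the class structure, while prediction needs the protein projection alone. The accuracy it prints describes the synthetic data and says nothing about the results reported in the abstract.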

Authors

Firat Ismailoglu
Rachel Cavill
Evgueni Smirnov
Shuang Zhou

NHC Key Laboratory of Food Safety Risk Assessment, China National Center for Food Safety Risk Assessment, Beijing 100021, PR China. Electronic address: szhoupku@gmail.com.
Pieter Collins
Ralf Peeters

Keywords

Algorithms; Breast Neoplasms; Computational Biology; Female; Humans; Immunohistochemistry; Machine Learning; Transcriptome

External Resources

PubMed ID: 30369448
