Leveraging permutation testing to assess confidence in positive-unlabeled learning applied to high-dimensional biological datasets.

Journal: BMC bioinformatics

Published Date: Jun 19, 2024

Abstract

BACKGROUND: Compared to traditional supervised machine learning approaches employing fully labeled samples, positive-unlabeled (PU) learning techniques aim to classify "unlabeled" samples based on a smaller proportion of known positive examples. This more challenging modeling goal reflects many real-world scenarios in which negative examples are not available, which poses direct challenges to defining prediction accuracy and robustness. While several studies have evaluated predictions learned from only definitive positive examples, few have investigated whether correct classification of a high proportion of known positive (KP) samples from among unlabeled samples can act as a surrogate to indicate model quality.
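The idea of using KP recovery as a surrogate for model quality, paired with the permutation testing named in the title, can be illustrated with a small sketch. This is not the authors' implementation: the centroid-difference scorer stands in for a real PU learner, and the function names (`kp_recovery`, `permutation_p_value`), the `top_k` cutoff, the label-shuffling scheme, and the simulated data are all assumptions made for illustration. Some known positives are hidden among the unlabeled samples, the fraction recovered near the top of the ranking is measured, and that fraction is compared with the values obtained when the "positive" labels are assigned at random.

```python
import random


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def kp_recovery(labeled_pos, unlabeled, hidden_idx, top_k):
    """Fraction of hidden known positives ranked in the top_k of the unlabeled pool.

    Scorer (illustrative stand-in, not a real PU learner): project each
    unlabeled sample onto the direction pointing from the unlabeled centroid
    to the labeled-positive centroid.
    """
    mu_pos = centroid(labeled_pos)
    mu_unl = centroid(unlabeled)
    direction = [p - u for p, u in zip(mu_pos, mu_unl)]
    scores = [sum(d * x for d, x in zip(direction, row)) for row in unlabeled]
    ranked = sorted(range(len(unlabeled)), key=lambda i: scores[i], reverse=True)
    top = set(ranked[:top_k])
    hidden = set(hidden_idx)
    return len(top & hidden) / len(hidden)


def permutation_p_value(labeled_pos, unlabeled, hidden_idx, top_k,
                        n_perm=200, seed=0):
    """Compare observed KP recovery with recovery under shuffled labels."""
    rng = random.Random(seed)
    observed = kp_recovery(labeled_pos, unlabeled, hidden_idx, top_k)

    hidden_set = set(hidden_idx)
    hidden = [unlabeled[i] for i in hidden_idx]
    rest = [row for i, row in enumerate(unlabeled) if i not in hidden_set]
    trainable = labeled_pos + rest  # samples whose labels get shuffled
    n_pos = len(labeled_pos)

    at_least_as_good = 0
    for _ in range(n_perm):
        shuffled = trainable[:]
        rng.shuffle(shuffled)
        fake_pos = shuffled[:n_pos]           # randomly chosen "positives"
        fake_unl = hidden + shuffled[n_pos:]  # hidden KPs sit at indices 0..len-1
        null = kp_recovery(fake_pos, fake_unl, range(len(hidden)), top_k)
        if null >= observed:
            at_least_as_good += 1

    # The +1 terms keep the estimated p-value strictly above zero.
    return observed, (1 + at_least_as_good) / (1 + n_perm)


def make_row(rng, is_positive, n_features=30, n_informative=10, shift=2.0):
    """Simulated sample: positives are shifted in the first n_informative features."""
    return [rng.gauss(shift if (is_positive and j < n_informative) else 0.0, 1.0)
            for j in range(n_features)]


# Toy data (entirely simulated): 15 labeled positives, plus an unlabeled pool
# of 10 hidden known positives followed by 60 negatives.
rng = random.Random(1)
labeled_pos = [make_row(rng, True) for _ in range(15)]
unlabeled = ([make_row(rng, True) for _ in range(10)]
             + [make_row(rng, False) for _ in range(60)])
hidden_idx = list(range(10))

observed, p_value = permutation_p_value(labeled_pos, unlabeled, hidden_idx, top_k=10)
print(f"KP recovery = {observed:.2f}, permutation p = {p_value:.4f}")
```

With `n_perm` permutations the smallest attainable p-value is 1 / (`n_perm` + 1), so the number of permutations sets the resolution of the test.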

Authors

Shiwei Xu

Research Center for Agricultural Monitoring and Early Warning, Agricultural Information Institute of Chinese Academy of Agricultural Sciences, Beijing, China.

Margaret E Ackerman

Thayer School of Engineering, Dartmouth College, Hanover, New Hampshire, United States of America.

Keywords

Algorithms; Computational Biology; Humans; Machine Learning; Supervised Machine Learning

External Resources

PubMed (PMID: 38898392)
