Correcting for experiment-specific variability in expression compendia can remove underlying signals.

Journal: GigaScience

Abstract

MOTIVATION: In the past two decades, scientists in different laboratories have assayed gene expression from millions of samples. These experiments can be combined into compendia and analyzed collectively to extract novel biological patterns. Technical variability, or "batch effects," may result from combining samples collected and processed at different times and in different settings. Such variability may impair our ability to extract true underlying biological patterns. As more integrative analysis methods arise and data collections grow, we must determine how technical variability affects our ability to detect desired patterns when many experiments are combined.

Authors

  • Alexandra J Lee
    Genomics and Computational Biology Graduate Program, University of Pennsylvania, 3400 Civic Center Blvd, Philadelphia, PA, 19104, USA.
  • YoSon Park
    Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, 3400 Civic Center Blvd, Philadelphia, PA, 19104, USA.
  • Georgia Doing
    Department of Microbiology and Immunology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA.
  • Deborah A Hogan
    Department of Microbiology and Immunology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA.
  • Casey S Greene
    Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, United States; Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, United States; Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, United States. Electronic address: csgreene@upenn.edu.