A new approach and gold standard toward author disambiguation in MEDLINE.

Journal: Journal of the American Medical Informatics Association : JAMIA
PMID:

Abstract

OBJECTIVE: Author-centric analyses of fast-growing biomedical reference databases are challenging because of author ambiguity. This problem has mainly been addressed through author disambiguation using supervised machine-learning algorithms. Such algorithms, however, require adequately designed gold standards that properly reflect the reference database. In this study, we used MEDLINE to build the first unbiased gold standard in a reference database and to improve on the existing state of the art in author disambiguation.

Authors

  • Dina Vishnyakova
    Roche Pharmaceutical Research and Early Development, pRED Informatics, Roche Innovation Center, Basel, Switzerland.
  • Raul Rodriguez-Esteban
    Roche Pharmaceutical Research and Early Development, pRED Informatics, Roche Innovation Center, Basel, Switzerland.
  • Fabio Rinaldi
    University of Zurich, Institute of Computational Linguistics and Swiss Institute of Bioinformatics, Andreasstrasse 15, Zürich, CH-8050, Switzerland. rinaldi@cl.uzh.ch.