Similarity corpus on microbial transcriptional regulation.

Journal: Journal of biomedical semantics
Published Date:

Abstract

BACKGROUND: The ability to express the same meaning in different ways is a well-known property of natural language. This amazing property is the source of major difficulties in natural language processing. Given the constant increase in published literature, both its curation and the extraction of information from it would strongly benefit from efficient automatic processes, for which corpora of sentences evaluated by experts are a valuable resource.

Authors

  • Oscar Lithgow-Serrano
    Computational Genomics, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México (UNAM). A.P., 565-A Cuernavaca, Morelos, 62100, México. olithgow@ccg.unam.mx.
  • Socorro Gama-Castro
    Computational Genomics Program, Center for Genomic Sciences, National Autonomous University of Mexico, Av. Universidad, s/n, Colonia Chamilpa, Cuernavaca, Morelos 62100, Mexico.
  • Cecilia Ishida-Gutiérrez
    Computational Genomics, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México (UNAM). A.P., 565-A Cuernavaca, Morelos, 62100, México.
  • Citlalli Mejía-Almonte
    Computational Genomics Program, Center for Genomic Sciences, National Autonomous University of Mexico, Av. Universidad, s/n, Colonia Chamilpa, Cuernavaca, Morelos 62100, Mexico.
  • Víctor H Tierrafría
    Computational Genomics, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México (UNAM). A.P., 565-A Cuernavaca, Morelos, 62100, México.
  • Sara Martínez-Luna
    Computational Genomics, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México (UNAM). A.P., 565-A Cuernavaca, Morelos, 62100, México.
  • Alberto Santos-Zavaleta
    Computational Genomics, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México (UNAM). A.P., 565-A Cuernavaca, Morelos, 62100, México.
  • David Velázquez-Ramírez
    Computational Genomics, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México (UNAM). A.P., 565-A Cuernavaca, Morelos, 62100, México.
  • Julio Collado-Vides
    Computational Genomics Program, Center for Genomic Sciences, National Autonomous University of Mexico, Av. Universidad, s/n, Colonia Chamilpa, Cuernavaca, Morelos 62100, Mexico.