Optimizing Open-Ended Crowdsourcing: The Next Frontier in Crowdsourced Data Management

Journal: Bulletin of the Technical Committee on Data Engineering

Abstract

Crowdsourcing is the primary means of generating training data at scale, and when combined with sophisticated machine learning algorithms, it enables a variety of emergent automated applications that affect all spheres of our lives. This paper surveys the emerging field of formally reasoning about and optimizing open-ended crowdsourcing, a popular and crucially important, but severely understudied, class of crowdsourcing that represents the next frontier in crowdsourced data management. The underlying challenges include distilling the right answer when none of the workers agree with one another, teasing apart the various perspectives that workers adopt when answering tasks, and effectively selecting among the many open-ended operators appropriate for a problem. We describe the approaches that we have found to be effective for open-ended crowdsourcing, drawing on our experience in this space.

Authors

  • Aditya Parameswaran
    University of Illinois.
  • Akash Das Sarma
    Stanford University.
  • Vipul Venkataraman
    University of Illinois.
