- Location: ICML 2015 Workshop, Lille, France
- Room: Faidherbe
- Posters: Boards are located in the "Espace Flandre"
- Date: July 10, 2015
- Schedule outline is available here
- Title+Abstract for invited talks are available here
- Resources: See more information on CrowdML site
- Sponsors: We gratefully acknowledge Microsoft Research and the University of Southampton's ORCHID project for partially sponsoring CrowdML at ICML'15.
- Contact: For any questions, please email icml15crowd AT gmail
Crowdsourcing and human computation aim to combine human knowledge and expertise with computing to help solve problems and scientific challenges that neither machines nor humans can solve alone. Crowdsourcing is transforming the ability of academic researchers to build new systems and run new experiments involving people, and is also widely used in industry to collect training data for machine learning. In addition to a number of human-powered scientific projects, including Galaxy Zoo, eBird, and Foldit, there are various online marketplaces for crowdsourcing, including Amazon's Mechanical Turk, oDesk, and MobileWorks. The fundamental question that we plan to explore in this workshop is:
How can we build systems that combine the intelligence of humans and the computing power of machines for solving challenging scientific and engineering problems?
The goal is to improve the performance of complex human-powered systems by making them more efficient, robust, and scalable.
Current research in crowdsourcing often focuses on micro-tasks (for example, labeling a set of images) and on designing algorithms under simplistic models of workers' behavior. However, the participants are people with rich capabilities, including learning and collaboration, which suggests the need for more nuanced approaches that place special emphasis on the participants and their interaction with the overall system. More importantly, building systems that seamlessly integrate machine learning and crowdsourcing techniques can greatly push the frontier of our ability to solve challenging large-scale problems. This poses many interesting research questions and exciting opportunities for the machine learning community. The goal of this workshop is to foster these ideas by bringing together experts from the fields of machine learning, cognitive science, economics, game theory, and human-computer interaction.
Invited Speakers
- Alya Abbott and Ioannis Antonellis, Upwork (Elance-oDesk).
- Daoud Clarke, Lumi.do.
- Jeffrey P. Bigham, Carnegie Mellon University.
- Julian Eisenschlos, Facebook's Crowdsourcing Team.
- Long Tran-Thanh, University of Southampton.
- Matthew Lease, University of Texas, Austin.
- Mausam, Indian Institute of Technology Delhi.
- Victor Naroditskiy, OneMarketData.
Accepted Papers
- "A Unified Framework for Human-Powered Categorization"; Xian Wu, Jian Li, Guoliang Li.
- "Forecasting Crowd Work Quality via Multi-dimensional Features of Workers"; Hyun Joon Jung, Matthew Lease.
- "Crowdsourced Labels from Multiple Contexts"; Luke Dickens, Emil Lupu.
- "Evolution of Content Moderation Approaches for Online Classifieds: From Action Recommendations to Automation"; Ivan Guz, Vasily Leksin, Mikhail Trofimov, Aleksandra Fenster.
- "Reactive Learning: Actively Trading Off Larger Noisier Training Sets Against Smaller Cleaner Ones"; Christopher Lin, Mausam, Daniel Weld.
- "Forecast Aggregation using Imputed Accuracy"; Jens Witkowski, Pavel Atanasov, Lyle H. Ungar.
- "Crowdsourcing Feature Discovery via Adaptively Chosen Comparisons"; James Zou, Kamalika Chaudhuri, Adam Kalai.
- "Peer Grading in a Course on Algorithms and Data Structures: Machine Learning Algorithms do not Improve over Simple Baselines"; Mehdi Sajjadi, Morteza Alamgir, Ulrike von Luxburg.
- "Driverseat: Crowdstrapping Learning Tasks for Autonomous Driving"; Pranav Rajpurkar, Toki Migimatsu, Jeff Kiske, Royce Cheng-Yue, Sameep Tandon, Tao Wang, Andrew Ng.
- "Clustering by Pairwise Comparisons: Stochastic Block Model for Inference in Crowdsourcing"; Ramya Korlakai Vinayak, Babak Hassibi.
- "Learning on the Job: Optimal Instruction for Crowdsourcing"; Jonathan Bragg, Mausam, Daniel Weld.
- "Estimation from Pairwise Comparisons: Sharp Minimax Bounds with Topology Dependence"; Nihar Shah, Sivaraman Balakrishnan, Joseph Bradley, Abhay Parekh, Kannan Ramchandran, Martin Wainwright.
- "POMDP-Based Worker Pool Selection for Crowdsourcing"; Shreya Rajpal, Karan Goel, Mausam.
- "Personalized Exams and Learning in Massive Open Online Courses"; Pushkar Kolhe, Michael L. Littman, Charles I. Isbell.
- "On Yahoo Answers, Long Answers are Best"; Alina Beygelzimer, Ruggiero Cavallo, Joel Tetrault.
- "Predicting Bad Job Outcomes in Online Workplaces"; Aaron Michelony, Ioannis Antonellis, Ramesh Johari.
Topics of Interest
Topics of interest in the workshop include (but are not limited to):
Machine learning with strategic agents. Machine learning algorithms (for instance, active learning by querying experts, or information gathering from sensors) are typically designed to interact with non-strategic components (sensor nodes, machines, or non-strategic people). Human-powered systems represent a major paradigm shift, as these components are replaced by strategic agents: workers in crowdsourcing systems aiming to maximize their profits, students or participants as learning entities, people's smartphones as sensor nodes, and so on. This poses a number of challenges and interesting research questions, such as using statistical techniques to understand, model, and learn human behavior, and designing robust ML systems that can cope with the intrinsic noise and strategic behavior of components controlled by human agents.
Incentives, pricing mechanisms and budget allocation. How can we design incentive structures and pricing policies that maximize the satisfaction of participants as well as the utility of the job requester for a given budget? How can techniques from machine learning, economics, and game theory be used to learn optimal pricing policies and to infer optimal incentive designs?
Task decomposition and knowledge aggregation. How can complex crowdsourcing tasks be decomposed into simpler micro-tasks that can be performed by individuals or small groups with relatively little effort? How can we design models and algorithms to effectively aggregate responses and knowledge, especially for complex tasks?
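As a minimal illustration of the response-aggregation problem above, the Python sketch below (an illustrative assumption of this write-up, not part of the workshop program; the function name and data layout are hypothetical) iterates between a weighted majority vote over binary labels and per-worker accuracy estimates, in the spirit of Dawid-Skene-style aggregation:

```python
from collections import defaultdict

def aggregate_labels(responses, n_iters=10):
    """Aggregate noisy worker labels for binary tasks by alternating
    between estimating true labels (weighted majority vote) and
    re-estimating each worker's accuracy against the consensus.

    responses: list of (task_id, worker_id, label) with label in {0, 1}.
    Returns: dict mapping task_id -> estimated label.
    """
    workers = {w for _, w, _ in responses}
    accuracy = {w: 0.8 for w in workers}  # optimistic prior accuracy
    labels = {}
    for _ in range(n_iters):
        # E-step: weighted vote per task; above-chance workers add
        # positive weight, below-chance workers are effectively flipped.
        votes = defaultdict(float)
        for t, w, y in responses:
            weight = accuracy[w] - 0.5
            votes[t] += weight if y == 1 else -weight
        labels = {t: 1 if v >= 0 else 0 for t, v in votes.items()}
        # M-step: re-estimate worker accuracies, with Laplace smoothing.
        correct, total = defaultdict(int), defaultdict(int)
        for t, w, y in responses:
            total[w] += 1
            correct[w] += int(y == labels[t])
        accuracy = {w: (correct[w] + 1) / (total[w] + 2) for w in workers}
    return labels
```

A simple majority vote ignores worker quality; the re-weighting step lets consistently accurate workers count for more and consistently wrong workers count against their own votes.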
Learning by participants and peer evaluation. How can we use insights from machine learning to build tools for training and teaching participants (workers/students) in crowdsourcing systems and MOOCs to carry out difficult tasks? How can this training be actively adapted to the skills or expertise of the participants and by tracking their learning process? Peer evaluation schemes, including peer prediction and information elicitation, and their applications to peer grading in MOOCs, peer review, crowdsourced data labeling, incentivizing effort, etc., are also relevant topics.
Social aspects and collaboration. With an ever-increasing share of time on the Internet being spent on online social networks, there is a huge opportunity to elicit useful contributions from users at scale by carefully designing tasks. How can online social networks be used to create tasks with a gamification component and engage users in useful activities? How can systems exploit participants' underlying social ties to create incentives for users to collaborate?
Human-in-the-loop ML systems. How can we build practical systems that seamlessly integrate machine and human intelligence, a.k.a. human-in-the-loop ML systems? Machine learning algorithms can help the crowdsourcing component manage workflows and control the quality of workers' output, while crowds can handle tasks that are difficult for machines, adaptively boosting the performance of machine learning algorithms. Active learning and decision-theoretic techniques can help adapt the workflow of tasks that are given to workers.
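One common human-in-the-loop pattern is confidence-based routing: the model labels items it is sure about and escalates the rest to crowd workers. The sketch below is a minimal illustration under assumptions of this write-up (the `route_items` name, the `predict_proba` callback, and the `threshold` value are all hypothetical, not from the workshop description):

```python
def route_items(items, predict_proba, threshold=0.8):
    """Split items between the model and the crowd by prediction
    confidence for a binary task: confident items are labeled
    automatically, the rest are escalated to human workers.

    predict_proba: callable mapping item -> P(positive class).
    Returns (auto_labeled, needs_crowd), where auto_labeled is a
    list of (item, predicted_label) pairs.
    """
    auto_labeled, needs_crowd = [], []
    for item in items:
        p = predict_proba(item)
        confidence = max(p, 1 - p)  # distance from the 0.5 boundary
        if confidence >= threshold:
            auto_labeled.append((item, int(p >= 0.5)))
        else:
            needs_crowd.append(item)  # queue for crowd annotation
    return auto_labeled, needs_crowd
```

Tuning the threshold trades labeling cost (crowd volume) against automatic-label error, and the crowd's answers on escalated items can be fed back as training data, closing the loop.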
Open theoretical questions, challenges, and novel applications. The scale of the Internet, the fast-evolving technology landscape, and the rise of mobile computing are creating tremendous opportunities and challenges for the design of new human-powered systems. What are the open research questions, emerging trends, and novel applications at the intersection of crowdsourcing and machine learning? This workshop encourages visionary position papers that discuss these questions.
Call for Papers
Submissions should follow the ICML 2015 format and are encouraged to be up to eight pages long, excluding references. Additional appendices are allowed. Papers submitted for review do not need to be anonymized. There will be no official proceedings, but accepted papers will be made available on the workshop website and presented as either a talk or a poster. We welcome submissions of novel research work as well as extended abstracts on work recently published or under review at another conference or journal (please state the venue of publication in the latter case); we particularly encourage submission of visionary position papers on emerging trends in the field.
Please submit papers in PDF format here. For any questions, please email icml15crowd AT gmail.
To share the CFP by email, you can use the plain-text version available here.