The ability to entice and engage crowd workers to participate in human intelligence tasks (HITs) is critical for many human computation systems and large-scale experiments. While various metrics have been devised to measure and improve the quality of worker output via task designs, effective recruitment of crowd workers is often overlooked. To help us gain a better understanding of crowd recruitment strategies, we propose three new metrics for measuring crowd workers’ willingness to participate in advertised HITs: conversion rate, conversion rate over time, and nominal conversion rate. We discuss how the conversion rate of workers—the proportion of potential workers aware of a task who choose to accept it—can affect the quantity, quality, and validity of any data collected via crowdsourcing. We also contribute a tool—turkmill—that enables requesters on Amazon Mechanical Turk to easily measure the conversion rate of HITs. We then present the results of two experiments that demonstrate how conversion rate metrics can be used to evaluate the effect of different HIT designs. We investigate how four HIT design features (value proposition, branding, quality of presentation, and intrinsic motivation) affect conversion rates. Among other things, we find that including a clear value proposition has a strong, significant, positive effect on the nominal conversion rate. We also find that crowd workers prefer commercial entities to non-profit or university requesters.
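As a rough illustration (not the paper's formal definition), the basic conversion rate can be written as the fraction of workers who accept a HIT out of those who become aware of it, for example by previewing it:

\[ \text{conversion rate} = \frac{\text{workers who accept the HIT}}{\text{workers aware of the HIT (e.g., workers who preview it)}} \]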
Turkmill is a log processor for Apache httpd log files that extracts pre-task information about Amazon Mechanical Turk workers who visit a HIT. Data collected includes:
$ turkmill
Usage: turkmill [-updategeoip] [-novisitors] [-postback] webpath logfile ...
$ turkmill web/path/to/hit /var/log/apache2/access_log /var/log/apache2/access_log2
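To give a sense of the kind of analysis this automates, the following is a minimal Python sketch, not turkmill's actual implementation, of estimating a conversion rate from an Apache access log. The preview path, the conversion_rate function name, and the use of Mechanical Turk's assignmentId / ASSIGNMENT_ID_NOT_AVAILABLE convention to separate previews from accepted assignments are illustrative assumptions.

import re

# Apache "combined"-style log line: client IP, timestamp, request line, ...
LOG_LINE = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)')

def conversion_rate(logfile, preview_path="/web/path/to/hit"):
    """Estimate accepted / viewed for one HIT from a single log file (sketch)."""
    viewers, accepters = set(), set()
    with open(logfile) as f:
        for line in f:
            m = LOG_LINE.match(line)
            if not m or not m.group("path").startswith(preview_path):
                continue
            ip = m.group("ip")
            viewers.add(ip)  # any request to the HIT page counts as a view
            # External HITs receive assignmentId=ASSIGNMENT_ID_NOT_AVAILABLE while
            # being previewed; a real assignmentId suggests the worker accepted.
            path = m.group("path")
            if "assignmentId=" in path and "ASSIGNMENT_ID_NOT_AVAILABLE" not in path:
                accepters.add(ip)
    return len(accepters) / len(viewers) if viewers else 0.0

print(conversion_rate("/var/log/apache2/access_log"))

Unlike this sketch, turkmill accepts multiple log files and additional options (see the usage line above); the sketch only shows the underlying idea of deriving view and accept counts from web-server logs.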