Wednesday, February 23, 2011

The Mechanical Turk: Using Humans to do Computers' Work

People commonly believe that May 11, 1997 is the day computers officially conquered the human race at chess. That was the day the IBM computer Deep Blue bested world champion Garry Kasparov in a six-game battle of wits. What most people don't realize is that in the late 18th century, a chess-playing robot routinely defeated human opponents. The robot, known as the Mechanical Turk, toured Europe for decades. Crowds raved over the chess-playing robot... that is, until it was exposed as a complete fraud. The contraption was an elaborate hoax operated by a human hiding inside. Though the Mechanical Turk never accomplished its supposed chess-playing purpose, it ignominiously gave birth to the practice of using humans to do work too hard for the machines of the day.

Mechanical Turk
Some may be more familiar with the term Mechanical Turk as the name of Amazon's crowdsourcing Internet marketplace. In this context, Mechanical Turk is a service through which clients can get tasks done that are simple for humans but difficult for computers (termed Human Intelligence Tasks, or HITs). For example, humans can easily recognize whether or not a picture contains a person, but computers have much more difficulty. Requesters submit tasks to Mechanical Turk, and workers are paid a small per-unit bounty for completing them. Examples of HITs on the site include podcast transcription and image tagging. While paying humans small amounts to perform simple tasks may scale in some domains, in other applications the model is unsustainable. One example where the process does not work is book digitization.

In 2004, Google embarked on the massively ambitious goal of digitizing all the world's books. When approaching such a monumental task, developing new, specialized tools is often a good investment. And invest Google did. They poured resources into optical character recognition (the software that converts images into text), scanning techniques, and an interface through which the books could be accessed. Google's scanners can "read" 1000 pages an hour. Matching that pace by hand is simply not feasible with any reasonable number of human workers. As of 2010, the company estimates that it has digitized over 10% of the world's books.


Allow me one last piece of background. I swear it's worth it. In your travels over the Internets, you have certainly encountered a form requesting that you type the distorted letters that you see in a picture. These simple tests, called captchas, are designed to be easy for humans and difficult for computers. Using captchas prevents malicious software from signing up for millions of free email accounts (the ultimate triumph... just think about how much email you could send). Captchas are tests that computers can generate and know the answer to but cannot solve themselves.

And... the punch line. Even after all of the investment and research devoted to optical character recognition, the software still stumbles in difficult cases such as smudged or distorted words. Enter reCAPTCHA. Acquired by Google in 2009, reCAPTCHA is a system that uses captchas to digitize the parts of books that are too difficult for computers to read. The system is genius in its simplicity. Instead of displaying an image of one word distorted by a computer, the system shows an image of two distorted words side by side--one of them a regular captcha and one a difficult-to-read word from the expanse of the Google Books project. Because the user cannot tell the two types of images apart, the user has to transcribe both carefully, unwittingly contributing to the digitization of the world's literature. The transcription of the book word is sent back to the original source and inserted in place of the smudged word in the digital version of the book.
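For the programmers in the audience, here is a minimal sketch of the two-word check in Python. To be clear, this is my own illustration and not reCAPTCHA's actual code: the `Challenge` class, the `check_response` function, and the word ID format are all made up for the example.

```python
from dataclasses import dataclass


def normalize(text: str) -> str:
    """Ignore case and surrounding whitespace when comparing answers."""
    return text.strip().lower()


@dataclass
class Challenge:
    control_answer: str   # the word whose text the system already knows
    unknown_word_id: str  # hypothetical ID of a book word the OCR could not read


def check_response(challenge: Challenge, control_guess: str, unknown_guess: str):
    """Return (passed, candidate_transcription).

    The user passes if the control word matches. Only then is the guess
    for the unknown word kept as a candidate transcription.
    """
    passed = normalize(control_guess) == normalize(challenge.control_answer)
    if not passed:
        return False, None
    return True, (challenge.unknown_word_id, normalize(unknown_guess))


# Example: the user gets the control word right, so the book word is recorded.
challenge = Challenge(control_answer="morning", unknown_word_id="word-0001")
print(check_response(challenge, "Morning", "overlooks"))
```

Note that the unknown word is never graded. The system only ever checks the word it already knows the answer to.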

Example of a captcha

reCAPTCHA is a runaway success. It digitizes over 200 million words per day. Most people are unaware that they are effortlessly accomplishing useful work while they go about their daily lives. I'm sure the conspiracy theorist in you is frantically searching for what other work you're being tricked into performing. Tell him or her to go back to contemplating the Kennedy assassination and the moon landing. That way, the innovator in you can start asking what other difficult problems could be solved with simple, elegant, powerful systems like reCAPTCHA.

3 comments:

  1. How does the system perform the same purpose as a standard CAPTCHA if it doesn't know the correct answer ahead of time?

  2. Good question. I should have addressed that. One of the words is known to the computer, and the other is from a scanned book. In theory, the user can't tell which one is which. If they get the known word right, the system assumes that they got the other one right too. In reality, if you know what is going on, you can sometimes tell which one came from a book scan. In that case, even if you get the book word wrong, the system lets you pass the captcha test (I have done this just to see if it works. It does.) This could introduce false information into the book digitization system, but each word is given to several users. The chance of users maliciously colluding to enter a false word into the system is vanishingly small.

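The reply above says that each book word is given to several users. A minimal sketch of how those answers might be combined is below. This is an assumption about how such a tally could work, not a description of reCAPTCHA's real rules: the function name `accept_transcription` and the thresholds `min_votes` and `min_share` are invented for illustration.

```python
from collections import Counter


def accept_transcription(guesses, min_votes=3, min_share=0.6):
    """Return the agreed-upon word, or None if users have not agreed yet.

    min_votes and min_share are illustrative thresholds, not values
    taken from reCAPTCHA.
    """
    # Compare answers without regard to case or stray whitespace.
    cleaned = [g.strip().lower() for g in guesses if g.strip()]
    if len(cleaned) < min_votes:
        return None  # not enough independent answers yet
    word, count = Counter(cleaned).most_common(1)[0]
    # Accept only if a clear majority typed the same thing.
    return word if count / len(cleaned) >= min_share else None


print(accept_transcription(["upon", "Upon", "upon", "open"]))
print(accept_transcription(["upon", "open"]))
```

With a rule like this, a single prankster typing the wrong word is simply outvoted, which is why deliberately failing the book word does little damage.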