The Soylent Grid is an infrastructure for distributed human labeling: a system researchers use to bring people into the loop when performing computational tasks. Some computational tasks still cannot be performed adequately by computers alone, because their solutions either take too long to compute or are not accurate enough to be useful.
Most tasks envisioned for this infrastructure also allow some division of labor: some parts of an algorithm can be performed quickly and adequately by a computer, while others need human participation. That is where we bring people into the loop.
The valuable information people contribute builds on human skills in perception: human vision and processing help to label large amounts of data, which can then be used for tasks like object recognition.
As part of our efforts to help the blind and visually impaired in the GroZi project, we're leveraging the Soylent Grid infrastructure to generate large amounts of training data for a product recognition algorithm.
Two different tasks are presented to each user of the system. Each of these tasks is presented twice: once as a "control", for which the solution is already known, and once as an "experiment", an image we need to have labeled. By checking the user-supplied input for the control, we can determine whether it was solved by a human or a computer, a distinction that is important for preventing spam and for keeping web crawlers out of protected parts of a website. This type of test is known as a CAPTCHA, and pairing a known control with an unlabeled experiment is the approach used by reCAPTCHA.
In the first task, called Annotation, we ask our users to type the text that is indicated by the green bounding box. This helps us to build a training data set for character recognition.
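For Annotation, the control/experiment pairing could be checked roughly as in the following sketch. It is an illustration only, not the Soylent Grid implementation: the `Challenge` class, the `grade_submission` function, the case-insensitive comparison, and the in-memory `labels` dictionary are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Challenge:
    """One control/experiment pair (hypothetical structure)."""
    control_answer: str   # known text for the control image
    experiment_id: str    # identifier of the image that still needs a label


def normalize(text: str) -> str:
    # Assumption: ignore case and surrounding/repeated whitespace.
    return " ".join(text.split()).casefold()


def grade_submission(challenge, control_input, experiment_input, labels):
    """Accept the experiment label only if the control was answered correctly.

    Returns True when the control answer matches the known solution.
    """
    if normalize(control_input) != normalize(challenge.control_answer):
        return False  # control failed: discard the experiment input
    # Control passed: record the user's text as a candidate label.
    labels.setdefault(challenge.experiment_id, []).append(experiment_input.strip())
    return True
```

Collecting a list of candidate labels per image, rather than a single value, leaves room to compare answers from several users before trusting a label.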
For the second task, called Detection, we provide the word and expect our users to draw a bounding box around it. This helps us to determine the text location in the image.
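For Detection, a drawn box has to be compared against the known box of the control. One common way to do this is intersection over union (IoU); the sketch below assumes that measure and a threshold of 0.5, neither of which is taken from the Soylent Grid itself. The names `Box`, `iou`, and `passes_control` are likewise hypothetical.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    left: float
    top: float
    right: float
    bottom: float


def area(box: Box) -> float:
    # Clamp at zero so that "inverted" boxes count as empty.
    return max(0.0, box.right - box.left) * max(0.0, box.bottom - box.top)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes, in the range [0, 1]."""
    overlap = Box(
        max(a.left, b.left),
        max(a.top, b.top),
        min(a.right, b.right),
        min(a.bottom, b.bottom),
    )
    overlap_area = area(overlap)
    union_area = area(a) + area(b) - overlap_area
    return overlap_area / union_area if union_area > 0 else 0.0


def passes_control(drawn: Box, known: Box, threshold: float = 0.5) -> bool:
    # Assumption: the control counts as solved when the boxes overlap enough.
    return iou(drawn, known) >= threshold
```

A threshold below 1.0 tolerates the small differences that are unavoidable when a box is drawn by hand.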
If you would like to learn more about the Soylent Grid, please have a look at the publications, follow the links or contact us.
PDF versions of these papers can be found on our group's website.