Program boosts computer sketch recognition

DRAWING BOARD: James Hays, assistant professor of computer science at Brown University, demonstrates different aspects of his sketch-based recognition program. / PBN PHOTO/NATALJA KENT

Brown University assistant professor James Hays knew computers had a good eye for art, especially at detecting texture and shading in lifelike renderings of real-world scenes.
But Hays and fellow researchers wanted to find out how artificial intelligence reacted to more rudimentary art, the kind of sketches amateurs might dash off on a napkin or notebook.
Could a computer identify the abstract shapes and outlines commonly used across cultures to represent sharks, cars or zebras? Could a computer play the party game “Pictionary”?
To their surprise, Hays and two visiting researchers from Berlin found they could teach a computer to recognize our cruder visual renderings, if not quite as well as humans can.
Thanks to the emerging capabilities of crowdsourcing, Hays and colleagues created a program that correctly identified the subject of a simple sketch more than half the time. A random guess would have been right less than 1 percent of the time; humans shown the same sketches got it right about three-quarters of the time.
“I was positively surprised,” Hays said. “Part of the reason that there are a lot of recognition problems in computer vision is that the part they are good at is texture. A computer is worse at things defined by shape, with no texture and just outline.”
And simple, abstract outlines are the most common way people, especially those without great artistic talent, render images when trying to communicate through drawings.
So the capacity of computers to recognize simple, iconographic human drawings could have wide-ranging uses, such as picture-based Internet search.
“Sketch-based search is an existing field, but now that we don’t just rely on matching sketches and photographs, it might be better,” Hays said, “allowing children, maybe people who are not literate or don’t speak the right language, to interact with a computer.”
Hays’ background is in computer graphics and vision, and it was only in discussions with the visiting researchers from the Technical University of Berlin, Mathias Eitz and Marc Alexa, that the three began thinking about sketch recognition.
Software already existed that allowed computers to match images, such as photographs or professional police drawings, but no one had tried to teach a computer to recognize hand-drawn visual symbols, such as four legs and stripes representing a zebra, on a conceptual level. Crowdsourcing would make it possible.
“All of the research in the past has had the assumption that you can draw something well,” Hays said. “The key novelty of this research was that we actually went out and collected a database through crowdsourcing of the kinds of images and icons most people can make.”
In data-hungry fields of machine learning, crowdsourcing has taken off as a way to provide the information software needs to learn difficult tasks.
For their experiment, Hays’ team used the online-work marketplace Amazon Mechanical Turk, a crowdsourcing platform from the e-commerce giant that posts tasks for Internet users around the world to perform remotely.
The team posted a Web interface on Mechanical Turk and asked workers to draw different kinds of figures at 2 cents a sketch.
They got responses from roughly 500 “artists” who sent in 20,000 sketches, enough to feed to the computer and train it to recognize the visual patterns that humans develop through culture.
Altogether, collecting the sketches cost the Brown team less than $500, far less than gathering the same data would have cost in the pre-crowdsourcing era.
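For readers curious what posting such a drawing task looks like in code, here is a minimal sketch using Amazon’s current boto3 Python library, which postdates the 2012 study; the URL, category and numbers below are illustrative placeholders, not the team’s actual setup.

import boto3

# Connect to Mechanical Turk (the service lives in the us-east-1 region).
mturk = boto3.client("mturk", region_name="us-east-1")

# An ExternalQuestion embeds a researcher-hosted drawing page in the
# worker's browser. The URL is a hypothetical placeholder.
question_xml = """
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://example.edu/sketch-task?category=zebra</ExternalURL>
  <FrameHeight>600</FrameHeight>
</ExternalQuestion>"""

hit = mturk.create_hit(
    Title="Draw a simple sketch of a zebra",
    Description="Draw the named object in the canvas provided.",
    Reward="0.02",                    # 2 cents per sketch, as in the study
    MaxAssignments=80,                # how many workers may draw this category
    LifetimeInSeconds=7 * 24 * 3600,  # keep the task visible for a week
    AssignmentDurationInSeconds=600,  # time allowed per sketch
    Question=question_xml,
)
print("Posted task:", hit["HIT"]["HITId"])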
After training the program on the 20,000 crowdsourced drawings, the team tested the system and found it correctly identified an image 55 percent of the time, compared with an average 77 percent success rate for humans shown the same images.
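As a rough illustration of that train-then-test procedure, the following Python sketch uses the scikit-learn library, an assumption made for illustration; the researchers used their own support-vector-machine setup, and the random arrays here merely stand in for real sketch feature vectors.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_sketches, n_features, n_categories = 20_000, 500, 250

# Stand-in data: one feature vector and one category label per sketch.
X = rng.random((n_sketches, n_features))
y = rng.integers(0, n_categories, size=n_sketches)

# Hold out a portion of the sketches so the test measures generalization
# to drawings the classifier has never seen, not memorization.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

clf = LinearSVC().fit(X_train, y_train)
print(f"accuracy: {accuracy_score(y_test, clf.predict(X_test)):.1%}")
# With real sketch descriptors, this is where a figure like 55 percent would
# come from; on random features it stays near chance (1/250, or 0.4 percent).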
Beyond language-free search, Hays said, the program his team created could help develop new computer games. And knowing exactly how much better humans are than computers at recognizing sketches could have implications for the “CAPTCHA” tests used to tell humans from spam bots.
And Hays said the potential for crowdsource-aided research is even greater, reaching such areas as face recognition, audio recognition, neuroscience and automobile sensors for identifying objects on the road.
“Anything that needs training data,” Hays said. “Now you are seeing crowdsourced human psychology. There is some pushback on collecting data that way, but it is happening.” •
