New computer vision tool wins prize for social impact

A team of computer scientists at the University of Massachusetts Amherst working on two different problems — how to quickly detect damaged buildings in crisis zones and how to accurately estimate the size of bird flocks — recently announced an AI framework that can do both. The framework, called DISCount, blends the speed and massive data-crunching power of artificial intelligence with the reliability of human analysis to quickly deliver reliable estimates that can quickly pinpoint and count specific features from very large collections of images. The research, published by the Association for the Advancement of Artificial Intelligence, has been recognized by that association with an award for the best paper on AI for social impact.

“DISCount came together as two very different applications,” says Subhransu Maji, associate professor of information and computer sciences at UMass Amherst and one of the paper’s authors. “Through UMass Amherst’s Center for Data Science, we have been working with the Red Cross for years in helping them to build a computer vision tool that could accurately count buildings damaged during events like earthquakes or wars. At the same time, we were helping ornithologists at Colorado State University and the University of Oklahoma interested in using weather radar data to get accurate estimates of the size of bird flocks.”

Maji and his co-authors, lead author Gustavo Pérez, who completed this research as part of his doctoral training at UMass Amherst, and Dan Sheldon, associate professor of information and computer sciences at UMass Amherst, thought they could solve the damaged-buildings-and-bird-flock problems with computer vision, a type of AI that can scan enormous archives of images in search of something particular — a bird, a rubble pile — and count it.

But the team was running into the same roadblocks on each project: “the standard computer visions models were not accurate enough,” says Pérez. “We wanted to build automated tools that could be used by non-AI experts, but which could provide a higher degree of reliability.”

The answer, says Sheldon, was to fundamentally rethink the typical approaches to solving counting problems.

“Typically, you either have humans do time-intensive and accurate hand-counts of a very small data set, or you have computer vision run less-accurate automated counts of enormous data sets,” Sheldon says. “We thought: why not do both?”

DISCount is a framework that can work with any already existing AI computer vision model. It works by using the AI to analyze the very large data sets — say, all the images taken of a particular region in a decade — to determine which particular smaller set of data a human researcher should look at. This smaller set could, for example, be all the images from a few critical days that the computer vision model has determined best show the extent of building damage in that region. The human researcher could then hand-count the damaged buildings from the much smaller set of images and the algorithm will use them to extrapolate the number of buildings affected across the entire region. Finally, DISCount will estimate how accurate the human-derived estimate is.

“DISCount works significantly better than random sampling for the tasks we considered,” says Pérez. “And part of the beauty of our framework is that it is compatible with any computer-vision model, which lets the researcher select the best AI approach for their needs. Because it also gives a confidence interval, it gives researchers the ability to make informed judgments about how good their estimates are.”

“In retrospect, we had a relatively simple idea,” says Sheldon. “But that small mental shift — that we didn’t have to choose between human and artificial intelligence, has let us build a tool that is faster, more comprehensive, and more reliable than either approach alone.”