
Larry Hardesty of the MIT News Office reports, “With the commodification of digital cameras, digital video has become so easy to produce that human beings can have trouble keeping up with it. Among the tools that computer scientists are developing to make the profusion of video more useful are algorithms for activity recognition — or determining what the people on camera are doing when. At the Conference on Computer Vision and Pattern Recognition in June, Hamed Pirsiavash, a postdoc at MIT, and his former thesis advisor, Deva Ramanan of the University of California at Irvine, will present a new activity-recognition algorithm that has several advantages over its predecessors.”

 

Hardesty continues, “One is that the algorithm’s execution time scales linearly with the size of the video file it’s searching. That means that if one file is 10 times the size of another, the new algorithm will take 10 times as long to search it — not 1,000 times as long, as some earlier algorithms would. Another is that the algorithm is able to make good guesses about partially completed actions, so it can handle streaming video. Partway through an action, it will issue a probability that the action is of the type that it’s looking for. It may revise that probability as the video continues, but it doesn’t have to wait until the action is complete to assess it.”

 

“Finally,” he goes on, “the amount of memory the algorithm requires is fixed, regardless of how many frames of video it’s already reviewed. That means that, unlike many of its predecessors, it can handle video streams of any length (or files of any size)… Enabling all of these advances is the appropriation of a type of algorithm used in natural language processing, the computer science discipline that seeks techniques for interpreting sentences written in natural language.”
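
To make the three properties Hardesty describes more concrete (running time that grows linearly with the video, probabilities issued partway through an action, and fixed memory), here is a minimal Python sketch. It is not the Pirsiavash and Ramanan algorithm, whose details the excerpt does not give. It swaps in a standard hidden Markov model forward filter, and the function name, the two-state model, and every number below are assumptions chosen purely for illustration.

```python
from typing import Iterable, Iterator, List


def stream_action_probability(
    frame_likelihoods: Iterable[List[float]],
    transition: List[List[float]],
    initial: List[float],
    action_state: int,
) -> Iterator[float]:
    """Yield P(state == action_state | frames seen so far) after each frame.

    Hypothetical helper: a plain HMM forward filter, used here only as a
    stand-in for the algorithm described in the article.
    """
    belief = list(initial)  # the only state carried between frames
    n = len(belief)
    for likelihood in frame_likelihoods:  # one pass, so time is linear in frames
        # Predict: push the current belief through the transition model.
        predicted = [
            sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)
        ]
        # Update: weight each state by how well it explains the new frame.
        unnormalized = [predicted[j] * likelihood[j] for j in range(n)]
        total = sum(unnormalized)
        if total == 0.0:
            raise ValueError("frame has zero likelihood under every state")
        belief = [u / total for u in unnormalized]
        # A revised estimate is available immediately, before the action ends.
        yield belief[action_state]


def fake_frames() -> Iterator[List[float]]:
    """Assumed per-frame likelihoods for [background, action]; made-up values."""
    for scores in ([0.7, 0.3], [0.4, 0.6], [0.2, 0.8], [0.1, 0.9]):
        yield scores


# Assumed two-state model: state 0 is "background", state 1 is "action".
TRANSITION = [[0.9, 0.1], [0.1, 0.9]]
INITIAL = [0.5, 0.5]

for t, p in enumerate(stream_action_probability(fake_frames(), TRANSITION, INITIAL, 1)):
    print(f"frame {t}: P(action) = {p:.3f}")
```

Because the function consumes frames from an iterator and keeps only one belief vector, its memory use depends on the number of states rather than on how many frames have gone by, and the probability it yields after each frame can go up or down as more video arrives.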

 

Read more here.

 

Image: Courtesy Flickr/Sigfrid Lundberg