The University of Arizona

Persistent Stare through Imagination


About the Grant

The Persistent Stare through Imagination (PSI) project is supported by the Defense Advanced Research Projects Agency (DARPA) as part of its Mind's Eye program. The goal of Mind's Eye is to build a camera that can tell us what it sees. DARPA is interested in this problem because surveillance teams are very costly, as is monitoring "dumb" remote cameras. A "smart" camera ought to be able to report suspicious activity. Mind's Eye is a particularly interesting problem because it merges computer vision with machine learning and models of human activities. Our approach involves three levels of inference: at the highest level are models of activity, and at the lowest are vision algorithms optimized for pose recognition and tracking. The innovation is at the middle level, where simulation will generate possible futures a brief instant before they happen in the physical world. Said differently, the approach is to imagine, via simulation, what might be happening in the scene. Imagination can constrain conventional vision processing and should make it more accurate and efficient.


Project Summary

Imagine a middle-aged man carrying a heavy box down a crowded street. Does he weave nimbly between pedestrians, or do they adjust to him? Does he keep up with the flow, or is he slower? Is he enjoying himself? The fact that you can answer these questions means that you know what it's like to carry a heavy box down a crowded street; you've probably done it yourself. The fact that you can envision the man -- that you can see in your mind's eye the box clasped to his chest -- means that you know what such a scene looks like. Our proposal is to bring this kind of knowledge to bear on the problem of persistent stare.

Human visual processing is predictive: the mind anticipates what the eye will see, which probably helps the eye and the mind make sense of ambiguous visual information. The mind can also imagine visual scenes that don't necessarily exist; for example, you can imagine a person crossing a road even if you aren't looking at one.

These functions of the mind's eye -- predictive vision and visual imagination -- suggest simulation as a component of a computational mind's eye. By simulation we mean a "world in the mind" that models the "world out there" to some degree of accuracy.

We will build a system called PSI (for "persistent stare through imagination") whose internal model of a scene will be a simulator of the scene. As models, simulators have some desirable properties. First, simulators represent behavior and thus occupy the middle ground between the semantic domain ("why does it happen?") and the visual domain ("what does it look like?"). Second, simulators are not merely descriptive but generative, with the added benefit that the behaviors they generate can be projected onto an image plane, both to help interpret visual data and to help visualize what's happening in the scene. Third, some of the behaviors generated by simulators are emergent, which means that simulators can imagine futures that no engineer can or will build into a priori top-down models.
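
The second property can be illustrated with a toy example. The sketch below is not the PSI system; it is a minimal illustration under stated assumptions: a single walker on a flat ground plane, a pinhole camera looking along the ground, and a Gaussian score for how well a simulated future matches one observed pixel location. All function names and parameter values (simulate_walker, project, score, rank_futures, the focal length, the camera height) are hypothetical and chosen only for this example.

```python
import math
import random


def simulate_walker(start, heading, speed, steps, dt=0.1, turn_noise=0.2, rng=None):
    """Generate one possible future: a walker's path on the ground plane.

    Positions are (x, z) in metres; z is distance in front of the camera.
    """
    rng = rng or random.Random()
    x, z = start
    path = []
    for _ in range(steps):
        heading += rng.gauss(0.0, turn_noise)  # small random change of direction
        x += speed * dt * math.sin(heading)
        z += speed * dt * math.cos(heading)
        path.append((x, z))
    return path


def project(point, focal_px=800.0, cam_height=3.0, cx=320.0, cy=240.0):
    """Project a ground point (x, z) to pixel (u, v) with a pinhole model.

    Assumes the camera sits cam_height metres above the ground, looks
    along +z, and that image v grows downward. Requires z > 0.
    """
    x, z = point
    u = cx + focal_px * x / z
    v = cy + focal_px * cam_height / z
    return u, v


def score(predicted_px, observed_px, sigma_px=15.0):
    """Score in (0, 1]: 1.0 when the prediction matches the observation."""
    du = predicted_px[0] - observed_px[0]
    dv = predicted_px[1] - observed_px[1]
    return math.exp(-(du * du + dv * dv) / (2.0 * sigma_px ** 2))


def rank_futures(observed_px, n_futures=200, seed=0):
    """Simulate many futures, project each endpoint, rank by fit to the data."""
    rng = random.Random(seed)
    scored = []
    for _ in range(n_futures):
        path = simulate_walker(
            start=(0.0, 10.0),                      # assumed last known position
            heading=rng.uniform(-math.pi, math.pi),  # direction is unknown
            speed=1.4,                               # assumed walking speed, m/s
            steps=10,
            rng=rng,
        )
        scored.append((score(project(path[-1]), observed_px), path))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored


if __name__ == "__main__":
    ranked = rank_futures(observed_px=(380.0, 470.0))
    best_score, best_path = ranked[0]
    print("best score:", best_score)
    print("imagined end position (x, z):", best_path[-1])
```

In this sketch the simulator plays the generative role: it proposes futures, and the projection step turns each one into something that can be compared directly with what the camera reports. Futures that fit the observation poorly receive a low score, which is one simple way imagination could constrain the interpretation of visual data.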


For questions about this grant, contact Paul Cohen at