The sense of vision plays a crucial role in the life of primates, allowing them to retrieve a representation of the environment suitable for performing complex control tasks such as moving, tracking, manipulating objects and recognizing them. Our research is driven by the desire to understand and design systems that can interact and communicate with humans within natural, unstructured environments. For engineering systems to exhibit such capabilities, they must be endowed with sensing, processing, communication, control and actuation capabilities. While in biological systems these skills are highly integrated, in engineering they have traditionally been separated: we know how to build sensors, how to build computers, and how to build and actuate robots, but we do not yet know how to endow an engineering system with a sense of vision: how can we make a computer recognize a person, approach her, understand her disposition and serve her request? In the future, we envision engineering systems being increasingly about processing sensory information and using it to interact with humans and the environment.


Any imaging device - whether the human eye, a video camera or a telescope - involves a map from the three-dimensional world onto the two-dimensional surface of an imaging sensor. Such a map causes a loss of information along a spatial dimension. The goal of a visual system - whether biological or artificial - is to use images to retrieve a spatial model of the environment. It may seem surprising at first that the "physically correct" (in the Euclidean sense) model of the world cannot be recovered, and therefore one can only seek to infer a "representation" of it. Although the general vision problem is intrinsically ill-posed, it can become well-posed within the context of a specific task. While it is not possible to uniquely determine the geometry, photometry and dynamics of a scene, it is possible to determine them to the extent that we can walk through the scene or reach for objects in it. The field that attempts to integrate sensing and processing within the context of a control task is called "Dynamic Vision".
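The loss of a spatial dimension can be made concrete with a minimal pinhole-camera sketch (a simplifying assumption of this illustration, not a model specific to the Lab's work; the function name and focal length below are hypothetical): every 3-D point lying on the same ray through the optical center projects to the same image point, so depth cannot be recovered from a single image.

```python
def project(point_3d, f=1.0):
    """Pinhole projection: map a 3-D point (X, Y, Z), with Z > 0,
    to image coordinates (x, y) = (f X / Z, f Y / Z).

    The depth Z is divided out, so it cannot be recovered
    from the image coordinates alone.
    """
    X, Y, Z = point_3d
    return (f * X / Z, f * Y / Z)

# Two distinct 3-D points on the same ray through the optical
# center land on the same image point: a spatial dimension is lost.
p1 = project((1.0, 2.0, 4.0))
p2 = project((2.0, 4.0, 8.0))
assert p1 == p2 == (0.25, 0.5)
```

Inverting this map is what makes the general problem ill-posed: infinitely many scenes are consistent with a single image, and additional constraints (motion, multiple views, or a task) are needed to resolve the ambiguity.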


The UCLA Vision Lab is engaged in a variety of projects that process visual information to retrieve a model of the environment for the purpose of control and interaction with the environment and with humans. Dynamic Vision offers the potential for applications with a positive social impact, assisting humans in decision and control tasks performed by processing sensory information, such as recognition, classification, navigation, manipulation and tracking. In an industrial setting, computational sensing has already resulted in several products that relieve humans from tasks that are repetitive (e.g. detecting imperfections in fabric or manufactured parts), stressful (e.g. security) or dangerous (e.g. maintenance of underwater platforms or power plants). In transportation, several major companies have working prototypes of automatic guidance systems for passenger cars and trucks (although the systems are complete and operational, they are not currently deployed due to unresolved legal issues). Naturally, the military is keenly interested in the potential of computational sensing systems. Additional industries increasingly involved in Computational Vision are Entertainment (image-based modeling and rendering, visual insertion, architectural models), Health Care (assisted/teleoperated surgery, tomography, imaging, brain mapping), and the Computer Industry (human-computer interfaces).

The uncertain, complex and dynamic nature of the physical world and the intrinsic ill-posedness of the general vision problem bring about an unusual combination of mathematical tools from the fields of differential geometry (for the study of shape), dynamical systems (for the study of motion and deformation) and functional analysis (images are functions of the radiance distribution of physical surfaces).
Uncertainty is often captured in a probabilistic sense through the use of stochastic processes, and computational perception is posed as a statistical inference or functional optimization problem. 
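As a toy illustration of this viewpoint (the numbers and setup below are hypothetical, not the Lab's actual formulation): under an assumed Gaussian measurement-noise model, maximum-likelihood inference of an unknown quantity from repeated noisy observations reduces to a least-squares optimization, whose minimizer is simply the sample mean.

```python
import numpy as np

# Hypothetical setup: repeated noisy measurements z_i = d + n_i of an
# unknown depth d, with n_i ~ N(0, 0.1^2). Minimizing the negative
# log-likelihood sum_i (z_i - d)^2 over d yields the sample mean.
rng = np.random.default_rng(0)
true_depth = 5.0
measurements = true_depth + 0.1 * rng.standard_normal(100)

estimate = measurements.mean()  # least-squares / maximum-likelihood estimate
```

The same pattern scales up: in practice the unknown is not a scalar but the shape, motion or radiance of a scene, and the optimization is carried out over function spaces, which is where the tools of functional analysis and stochastic processes mentioned above enter.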

The UCLA Vision Lab is located in 3811 Boelter Hall, accessible from the inner outdoor corridor on the west side of the building (see a map with directions for parking and access to the lab).