Visual Motion (Dynamics)
Structure from motion
Joint estimation of 3D structure, motion and appearance
Interacting with a complex, unknown, dynamic environment requires continuously updated knowledge of its shape and motion. We propose several algorithms aimed at inferring shape, motion and appearance causally and incrementally.
Ambiguities and optimality in 3D motion estimation
Estimating 3D structure and motion can be cast as a non-linear, high-dimensional optimization problem, prone to local minima. Such local minima are intrinsic to the problem, and not the algorithm or computational device used to solve it, and are therefore true illusions. Can we identify, analyze, categorize such illusions, and devise optimal algorithms to infer the global estimate when possible?
Real-time, vision-based navigation and interaction
Vision is a remote, distributed, passive sensor crucial for primates to move within the environment. While successful application of vision in the loop of a control system has been demonstrated under partially controlled conditions (freeway guidance, spacecraft landing), we tackle navigation and interaction within unknown and dynamic environments by building representations that can be used for localization, mapping and navigation.
Deformotion: deforming motion, shape averages and the joint segmentation and registration of images
How can we capture the "overall motion" of a deforming object? How can we "separate" the overall motion from the deformation? How do we characterize what is "conserved" during motion? We propose a framework for modeling deforming motion that entails defining a "moving average shape" and that allows for the simultaneous registration and matching of images and for tracking deformable objects.
Variational optical flow estimation and segmentation
We segment videos into domains of homogeneous motion by minimizing an appropriate cost functional. Our method allows tracking moving objects in video sequences, reconstructing the different depth layers of a 3D scene filmed by a moving camera, and segmenting motion patterns that cannot be distinguished based on their appearance.
Dynamic Textures: Modeling and Synthesis
Dynamic textures are sequences of images of scenes that exhibit some form of temporal and possibly spatial stationarity, such as fire, smoke, steam, foliage etc. Models of dynamic textures can be used to generate novel synthetic sequences and manipulate real ones.
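As an illustration of how such a model can generate synthetic sequences, the sketch below assumes one common formulation: a linear dynamical system with state recursion x_{t+1} = A x_t + v_t and output y_t = C x_t. The matrices A and C, the function name synthesize, and the tiny 3-pixel "frames" are hypothetical and chosen only to keep the example short; nothing here is taken from the project's own code.

```python
import random

def synthesize(A, C, x0, steps, noise_std=0.0, seed=0):
    """Generate frames y_t = C x_t from the state recursion x_{t+1} = A x_t + v_t."""
    rng = random.Random(seed)
    n = len(x0)
    x = list(x0)
    frames = []
    for _ in range(steps):
        # Output equation: each pixel is a linear combination of the state.
        frames.append([sum(c * xi for c, xi in zip(row, x)) for row in C])
        # State equation: linear dynamics driven by Gaussian noise.
        x = [sum(A[i][j] * x[j] for j in range(n)) + rng.gauss(0.0, noise_std)
             for i in range(n)]
    return frames

# Hypothetical 2-state model: a slowly decaying rotation drives 3 "pixels".
A = [[0.9, -0.3], [0.3, 0.9]]
C = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
frames = synthesize(A, C, x0=[1.0, 0.0], steps=5)
```

With noise_std set above zero, each run with a different seed yields a new sequence with the same temporal statistics, which is the sense in which a learned model can produce novel synthetic sequences.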
Dynamic Texture Recognition
How do we distinguish fog from steam? Models of dynamic textures can be used to discriminate visual processes based on their spatial as well as temporal statistics.
Dynamic Texture Segmentation
How do we detect the presence of smoke, and identify where in the image it appears?
Human gaits: modeling and recognition
At a fairly high level of abstraction, a human moving about can be represented as a dynamical system, driven by intentions (actions), and outputting actuator forces, resulting in joint trajectories. We study how one can infer actions from remote measurements of joint angles or trajectories. Ultimately we want to be able to identify an action regardless of the particular individual, and to identify the individual regardless of the action. Preliminary results show that simple dynamical models allow for successful classification of action classes, such as walking gaits.
Our goal in this project is to build synthetic models of human faces that can be driven by a speech signal, while retaining the distinctive features of a particular individual.
Modeling and representation
Shape Representation via Harmonic Embedding
Is it possible to define a flexible representation of shape that is linear, so that the sum of two shapes is a shape, and operations like differentiation, averaging and orthogonal projection make sense? We represent the shape of closed planar contours as the zero level set of functions that satisfy certain partial differential equations, so that they are (quasi) linear by construction.
Integral Invariants for Shape Matching and Recognition
Planar contours can be easily recognized despite being presented under various transformations, such as scaling, translation, projective transformations, in addition to being subjected to measurement noise. Is it possible to define a signature that is invariant with respect to such transformations, and at the same time insensitive to noise?
Shape priors in level set segmentation
By introducing prior knowledge on the shape of objects of interest, one can drastically improve the robustness of segmentation processes to noise, background clutter and partial occlusion. We investigate methods to integrate such priors into level set based segmentation schemes. By minimizing an appropriate cost functional we simultaneously generate a knowledge-driven segmentation of the input image and a decision about where to apply which prior. As a result we can simultaneously reconstruct multiple familiar objects in a given image.
Variational Shape Matching
We develop variational techniques for matching closed planar contours without distinct landmark points.
Certain objects elicit perceptual responses: a face can appear attractive or friendly, a car can appear aggressive or comfortable, etc. Since such objects are characterized by their shape (and to a lesser extent by their radiance), there must be some form of "map" between geometry and qualitative perception. How is this map represented? How can it be inferred? Can it be inverted, so as to allow purposeful changes in geometry to achieve a desired perceptual response?
Multiple view geometry
Structure From Motion: From 2D images to 3D geometric models
Through most of the past decade we have been engaged in the study of the geometry of multiple views, which plays a key role in reconstructing the 3D structure of the scene and the motion and calibration of the camera.
Multi-body Motion Estimation and Segmentation
Given a sequence of images of a scene containing multiple rigid objects moving independently, one can estimate the number of objects, the motion of each object, and what portion of the visual field corresponds to what object using algebraic techniques.
T-junctions and Occlusions
Occluding boundaries are visually salient because they often result in discontinuities in image intensity. T-junctions arise when a curve terminates at an occluding boundary (forming a "T"). Unfortunately, T-junctions do not correspond to physical points in the scene, as they move with the viewpoint. Nevertheless, we show that the motion of T-junctions on the image plane contains information about the scene that can be exploited for reconstruction.
Visual Reconstruction (Photometry)
Radiance and shape estimation
Tales of Shape and Radiance in Multi-View Stereo
Traditional stereo relies on the "brightness constancy" assumption to establish correspondence between points in different images. This allows "eliminating" photometry from the equation and reduces stereo reconstruction to a purely geometric problem. However, when the brightness constancy assumption is not satisfied, one cannot "separate" the reconstruction of shape from the reconstruction of reflectance. We show under what conditions such separation yields optimal algorithms. The cost functional can be integrated either in the image, or on the scene surface where the image back-projects. When integrating on the scene, the optimality conditions involve derivatives of the (noise-ridden, measured) images. However, when integrating on the image, the optimality conditions only involve derivatives of the (noiseless, ideal) model. Therefore, one can devise infinite-dimensional gradient-based reconstruction algorithms that do not involve derivatives of the data, with obvious improvement in robustness.
Multi-view stereo beyond Lambert
Traditional stereo relies on establishing correspondence between points in different images. Unfortunately, such correspondence cannot be established unless the scene is made of dull matte objects; it fails, for instance, with shiny, specular, or translucent materials. We propose a novel approach that relies not on matching image to image, but on matching each image to an underlying model of the geometry (shape) and photometry (radiance tensor field) of the scene. Discrepancy from the model is measured by the deviation from the ideal rank of the radiance tensor field; we develop optimal algorithms to infer shape and radiance from collections of images, based on variational techniques and level set methods to integrate partial differential equations.
Stereoscopic Segmentation with Constant Albedo Statistics
When a scene contains no "features" (constant albedo) or too many features (dense self-similar texture), traditional stereo matching algorithms fail to find proper "correspondence." We therefore do not seek to match image to image, but instead match all data to an underlying model of the scene geometry and its photometry, subject to the assumption of constant albedo.
Stereoscopic Segmentation with Smooth Albedo Statistics
Even when an object has constant albedo, the measured irradiance is not, because of shading and other effects. While one could model this effect explicitly (see Stereoscopic Shading project), if illumination is static one can assume that it is the albedo that is smooth, and exploit this assumption to recover shape and albedo.
Piecewise Constant/Smooth Albedo, or Region-Based Segmentation on Manifolds
Many real objects (especially man-made) are made by composing different materials, and therefore they have piecewise constant reflectance properties. We have developed algorithms for estimating the shape, albedo, and albedo boundaries from collections of images. The process involves performing region-based segmentation on evolving surfaces.
Simultaneous Segmentation and Registration
When neither motion, nor shape nor albedo are known, under suitable conditions one can simultaneously estimate shape and camera pose by jointly registering various "regions" of the scene.
Illumination and reflectance
Smooth objects with constant albedo result in smooth measured images due to non-uniform illumination. We develop techniques to estimate shape, albedo and illumination properties of the scene under the assumption of constant albedo and finite point light sources.
Observability of Shape from Defocused Images
It is well-known that blur conveys spatial information. However, to what extent does it? Can one characterize the set of shapes that are indistinguishable from any number of defocused images? Since the answer depends on the radiance of the scene, do there exist radiances (e.g. structured light patterns) that allow reconstructing any shape? We present a mathematical analysis of the observability properties of shape from defocus. We also present novel techniques to reconstruct shape and radiance.
Optimal Estimation (L2) of 3D Shape and Radiance from Blurred Images
Under the conditions for which one can reconstruct shape from defocused images, we develop inference algorithms that are optimal in the sense of least-squares. By exploiting the properties of semi-infinite orthogonal projectors in Hilbert spaces we can transform an infinite-plus-one-dimensional optimization problem into a much more efficient (regularized) one-dimensional optimization, with obvious gains in computational efficiency.
Optimal Estimation (I-divergence) of 3D Shape and Radiance
We develop efficient algorithms for reconstructing 3D shape and radiance from blurred images that are optimal in the sense of relative entropy. The algorithms consist of evolving a surface from an initial point towards a (local) minimum of an energy functional, via the numerical integration of a suitable partial differential equation.
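For reference, the I-divergence (generalized relative entropy, due to Csiszár) between two non-negative functions p and q on a domain Ω is usually written as below. Reading p as the measured image and q as the image predicted by the model is our assumption about how the criterion is applied here; the symbols are ours.

```latex
\Phi(p \,\|\, q) \;=\; \int_{\Omega} \left( p(x)\,\log\frac{p(x)}{q(x)} \;-\; p(x) \;+\; q(x) \right) dx
```

The integrand is non-negative and vanishes only where p = q, and the expression reduces to the Kullback-Leibler divergence when both functions integrate to one.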
Learning Shape from Defocus
Images depend on the shape of the scene, its radiance, as well as the optical characteristics of the imaging device. In this work we show that one can learn the optical characteristics from data. Our approach is robust to the point where one can learn the optical characteristic of a "virtual" camera using synthetic training data, and apply the results to real cameras in order to reconstruct the shape of real scenes.
Diffusion-based Shape from Defocus
Estimating shape and radiance from blurred images is well-known to be a severely ill-posed inverse problem. In this work we propose an efficient solution via the forward solution of a diffusive partial differential equation with a space-varying stopping time. This allows us to have a well-behaved, straightforward numerical algorithm that has proven robust and efficient.
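The idea of modeling blur as forward diffusion can be sketched in one dimension. The example below is a minimal sketch, not the project's algorithm: it integrates u_t = (c(x) u_x)_x with an explicit finite-difference scheme, and the space-varying coefficient c stands in for the space-varying stopping time (our simplification, since running longer and diffusing faster both increase the amount of blur). The function name diffuse and all numerical values are hypothetical.

```python
def diffuse(u, c, dt, steps):
    """Explicit scheme for u_t = (c(x) u_x)_x on a unit grid with zero-flux ends."""
    u = list(u)
    n = len(u)
    for _ in range(steps):
        # Flux across each cell interface, using the average coefficient there.
        flux = [0.5 * (c[i] + c[i + 1]) * (u[i + 1] - u[i]) for i in range(n - 1)]
        for i in range(n):
            inflow = flux[i] if i < n - 1 else 0.0
            outflow = flux[i - 1] if i > 0 else 0.0
            u[i] += dt * (inflow - outflow)
    return u

# A sharp impulse, blurred more strongly toward the right of the signal.
sharp = [0.0] * 4 + [1.0] + [0.0] * 4
c = [0.1 + 0.1 * i for i in range(9)]   # hypothetical space-varying coefficient
blurred = diffuse(sharp, c, dt=0.2, steps=20)
```

The explicit scheme is only stable for small enough dt times c; the values above satisfy that bound. Total intensity is conserved because the flux leaving one cell enters its neighbour.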
Motion-blur: Estimation, Segmentation, Restoration
Since images are captured by integrating photon count over an interval of time (exposure), moving objects appear blurred in ways that depend upon their shape, motion and reflectance. We propose a collection of algorithms to estimate shape and motion of moving objects from one single blurred image.
Visual features for correspondence
Visual Features for Correspondence
How can we decide whether two images portray the same scene? What is the scene? How is it related to the image? Are there representations that are invariant with respect to nuisance factors (viewpoint, illumination)? Are there image statistics ("features") that do not alter decision performance?
Filtering, control and identification
Filtering and Identification of Hybrid (Jump-Linear) Systems
Given a process that exhibits complex dynamic behavior, one can choose to model it globally with a very complex model, or to choose a simple class of models and represent the process locally, together with the partition of the data into neighborhoods. We explore the problem of identifying simple local models and their domains for dynamic processes.
Particle Filtering on Lie Groups and Homogeneous Spaces
Particle filters are flexible algorithms to propagate the conditional density of a dynamical model, represented weakly as a collection of samples drawn from it. We explore particle algorithms for dynamical models whose state space has a non-trivial geometric structure, such as a Lie group or a homogeneous space.
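A minimal sketch of the idea on the simplest Lie group, the planar rotations SO(2), is given below. Each particle is a rotation stored as an angle, prediction composes it with a small random rotation, and the measurement update uses the wrapped angular difference so that the filter respects the group structure near the ±π seam. The function names, noise levels q and r, and the random-walk motion model are assumptions made for illustration only.

```python
import math
import random

def wrap(a):
    """Map an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))

def circular_mean(angles, weights):
    """Weighted mean on the circle, computed from the embedded unit vectors."""
    s = sum(w * math.sin(a) for a, w in zip(angles, weights))
    c = sum(w * math.cos(a) for a, w in zip(angles, weights))
    return math.atan2(s, c)

def particle_filter_so2(measurements, n=500, q=0.05, r=0.2, seed=0):
    rng = random.Random(seed)
    particles = [rng.uniform(-math.pi, math.pi) for _ in range(n)]
    estimate = 0.0
    for z in measurements:
        # Predict: compose each rotation with a small random rotation.
        particles = [wrap(p + rng.gauss(0.0, q)) for p in particles]
        # Update: weight by the likelihood of the wrapped innovation.
        weights = [math.exp(-0.5 * (wrap(z - p) / r) ** 2) for p in particles]
        total = sum(weights)
        weights = [w / total for w in weights]
        estimate = circular_mean(particles, weights)
        # Resample in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n)
    return estimate

# Repeated noiseless measurements of an angle close to the +/- pi seam.
estimate = particle_filter_so2([3.0] * 20)
```

Averaging the raw angles instead of using circular_mean would give a wrong answer whenever the particle cloud straddles the seam, which is the kind of issue that motivates treating the state space geometrically.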
Trajectory Tracking and Motion Control
We are interested in controlling a non-holonomic robot so as to follow a prescribed trajectory with guaranteed performance. We propose an algorithm inspired by model-based predictive control that involves controlling the local approximation of the trajectory to be tracked, computed in real-time.
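To make the non-holonomic constraint concrete, the sketch below simulates a unicycle, which can only move along its current heading, and steers it onto the line y = 0. It deliberately uses a simple proportional heading controller with a look-ahead point, not the predictive-control algorithm described above; the function names, gains, and the straight-line reference are all hypothetical.

```python
import math

def step_unicycle(x, y, theta, v, omega, dt):
    """Non-holonomic kinematics: the robot moves only along its heading."""
    return (x + dt * v * math.cos(theta),
            y + dt * v * math.sin(theta),
            theta + dt * omega)

def track_line(y0, theta0, v=1.0, lookahead=1.0, gain=4.0, dt=0.05, steps=400):
    """Steer toward a point `lookahead` ahead of the robot on the line y = 0."""
    x, y, theta = 0.0, y0, theta0
    for _ in range(steps):
        # Heading that points at the look-ahead point on the reference line.
        desired = math.atan2(-y, lookahead)
        # Wrapped heading error, so the robot always turns the short way.
        error = math.atan2(math.sin(desired - theta), math.cos(desired - theta))
        x, y, theta = step_unicycle(x, y, theta, v, gain * error, dt)
    return x, y, theta

# Start one unit to the side of the reference line, heading parallel to it.
x, y, theta = track_line(y0=1.0, theta0=0.0)
```

Because the robot cannot slide sideways, the lateral error can only be reduced by first turning toward the line, which is why the controller acts on heading and not on position directly.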
Signal Processing for Retinal Implants
We explore the use of various signal processing algorithms to enhance the perception capabilities of patients with retinal implants.
DARPA Grand Challenge
DARPA Grand Challenge 2005
The UCLA Vision Lab is engaged in the DARPA Grand Challenge as part of the Golem Group/UCLA team.
Center for Computational Biology
Center for Computational Biology
The convergence of the biomedical revolution and the information technology revolution is a major event in the history of science. The emerging discipline of Computational Biology is a natural result of this convergence. The mathematical and computational sciences lie at the center of this new endeavor, providing the tools and framework for model building and quantitative analysis.
The Center for Computational Biology (CCB) was established to develop, implement and test computational biology methods that are applicable across spatial scales and biological systems. Our objective is to help elucidate characteristics and relationships that would otherwise be impossible to detect and measure.
Interactions fostered by this multi-disciplinary scientific network will spawn novel strategies and will initiate training opportunities for the next generation of relevant and promising biological endeavors.
Active Vision Control System
Active Vision Control System for Complex Adversarial 3-D Environments
The Active-Vision Control Systems MURI is a joint effort sponsored by the Air Force Office of Scientific Research.
CoMotion: Computational Methods for Collaborative Motion
The MURI Project includes students, faculty and staff from Stanford University, UC Berkeley and UCLA. The aim of the project is to develop computational methods for the simulation of collaborative motion of autonomous vehicles. The multi-disciplinary team consists of faculty and researchers from applied mathematics, statistics, computer science, electrical engineering and aeronautical engineering who combine their expertise to derive practical control algorithms for groups of collaborating vehicles.