Multiple View Descriptors
We propose an extension of popular descriptors based on gradient orientation histograms (HOG, computed in a single image) to multiple views. It hinges on interpreting HOG as a conditional density in the space of sampled images, where the effects of nuisance factors such as viewpoint and illumination are marginalized. However, such marginalization is performed with respect to a very coarse approximation of the underlying distribution. Our extension leverages the fact that multiple views of the same scene allow separating intrinsic from nuisance variability, and thus afford better marginalization of the latter. The result is a descriptor that has the same complexity as single-view HOG, and can be compared in the same manner, but exploits multiple views to better trade off insensitivity to nuisance variability against specificity to intrinsic variability. We also introduce a novel multi-view wide-baseline matching dataset, consisting of a mixture of real and synthetic objects with ground-truth camera motion and dense three-dimensional geometry.
First, we re-interpret a gradient orientation histogram descriptor such as SIFT or HOG as a class-conditional density.
Then we prove that this interpretation holds only when the scene is planar, fronto-parallel to the image plane, and moves parallel to it.
The limitation comes from using a single image.
Given multiple views, we propose a sampling-based approximation to the class-conditional density, MV-HOG, and a point-estimation-based approximation, R-HOG.
MV-HOG aggregates histograms of gradient orientations from samples returned by a tracker.
R-HOG aggregates histograms of gradient orientations from samples generated by a reconstructed 3D model, which in turn can be reconstructed from multiple views or an RGB-D sensor.
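The aggregation step shared by both descriptors can be illustrated with a minimal sketch. It assumes that each sample is a grayscale patch around the same scene point, that a per-patch descriptor is a single magnitude-weighted orientation histogram normalized to sum to one, and that aggregation is a plain average. The function names, the bin count, and the uniform averaging are illustrative assumptions, not the paper's implementation, which uses spatially pooled HOG/SIFT-style cells.

```python
import numpy as np

def orientation_histogram(patch, n_bins=8):
    """Magnitude-weighted histogram of gradient orientations, normalized to sum to 1.

    Hypothetical helper: a single-cell stand-in for a HOG/SIFT descriptor.
    """
    patch = np.asarray(patch, dtype=float)
    gy, gx = np.gradient(patch)  # derivatives along rows (y) and columns (x)
    magnitude = np.hypot(gx, gy)
    angle = np.mod(np.arctan2(gy, gx), 2 * np.pi)  # orientation in [0, 2*pi]
    hist, _ = np.histogram(angle, bins=n_bins, range=(0.0, 2 * np.pi),
                           weights=magnitude)
    total = hist.sum()
    # A patch with no gradient carries no orientation information:
    # fall back to a uniform histogram (an assumption of this sketch).
    return hist / total if total > 0 else np.full(n_bins, 1.0 / n_bins)

def aggregate_histograms(patches, n_bins=8):
    """Average per-patch histograms over all samples of the same scene point."""
    hists = np.stack([orientation_histogram(p, n_bins) for p in patches])
    return hists.mean(axis=0)
```

Under these assumptions, an MV-HOG-style descriptor would call `aggregate_histograms` on patches returned by a tracker, while an R-HOG-style descriptor would call it on patches rendered from the reconstructed model. The output has the same length as a single-view histogram, so it can be compared in the same manner.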
Empirical comparison shows that the proposed methods achieve state-of-the-art performance.
- J. Dong, N. Karianakis, D. Davis, J. Hernandez, J. Balzer and S. Soatto. Multi-View Feature Engineering and Learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. [pdf]