CrossLink: Joint Understanding of Image and 3D Model Collections through Shape and Camera Pose Variations

1University College London     2École Polytechnique

SIGGRAPH Asia 2015

We present a scalable analysis framework that jointly investigates class-labeled image and noisy 3D model collections (the ‘couch’ class in this example) to organize them by filtering outliers and factoring out shape and camera pose variations. The input images are consistently re-sorted along extracted view and attribute axes, while the 3D models are automatically reordered and consistently co-aligned.


Abstract

Collections of images and 3D models hide in them many interesting aspects of our surroundings. Significant efforts have been devoted to organizing and exploring such data repositories. Most such efforts, however, process the two data modalities separately, and do not take full advantage of the complementary information that exists in the different domains, which can help to solve difficult problems in one by exploiting the structure in the other. Beyond the obvious difference in data representations, a key difficulty in such joint analysis lies in the significant variability in the structure and inherent properties of the 2D and 3D data collections, which hinders cross-domain analysis and exploration. We introduce CrossLink, a system for joint image-3D model processing that uses the complementary strengths of each data modality to facilitate analysis and exploration. We first show how our system significantly improves the quality of text-based 3D model search by using side information coming from an image database. We then demonstrate how to consistently align the filtered 3D model collections, and then use them to re-sort image collections based on pose and shape attributes. We evaluate our framework, both quantitatively and qualitatively, on 20 object categories of 2D image and 3D model collections, and quantitatively demonstrate how a wide variety of tasks in each data modality can strongly benefit from the complementary information present in the other, paving the way to a richer 2D and 3D processing toolbox.


Bibtex
@article{HuetingEtAl:Crosslink:2015,
  title   = {CrossLink: Joint Understanding of Image and 3D Model Collections through
             Shape and Camera Pose Variations},
  author  = {Moos Hueting and Maks Ovsjanikov and Niloy Mitra},
  year    = {2015},
  journal = {{ACM SIGGRAPH Asia 2015}}
}
        
Acknowledgements

We thank the reviewers for their comments and suggestions for improving the paper. Special thanks to Melinos Averkiou for helping to compare with mesh-based co-alignment algorithms and Ersin Yumer for providing 3D data. The authors are grateful to Leonidas Guibas, Thomas Funkhouser, John Shawe-Taylor, Aron Monszpart, James Hennessey, Clément Godard and Peter Hedman for helpful discussions and encouragement. This work was supported in part by ERC Starting Grant SmartGeometry (StG-2013-335373), Microsoft Research through its PhD Scholarship Programme, Marie Curie CIG, Marie-Curie CIG-334283-HRGP, a CNRS chaire d'excellence, and chaire Jean Marjoulet from École Polytechnique.

Links

Paper (34MB)

Code (8.9MB)

Video (Download)