RAID: A Relation-Augmented Image Descriptor

Paul Guerrero     Niloy J. Mitra     Peter Wonka

KAUST     University College London

SIGGRAPH 2016

We propose a novel descriptor called RAID to describe the spatial relationship between image regions. This descriptor enables retrieval with queries based on complex relationships between regions, such as the ‘riding’ relationship between the orange source and the blue target region. In this example, the user sketched two regions (left) and RAID retrieved the images shown (right).

Video

Abstract

As humans, we regularly interpret scenes based on how objects are related, rather than based on the objects themselves. For example, we see a person riding an object X or a plank bridging two objects. Current methods provide limited support to search for content based on such relations. We present RAID, a relation-augmented image descriptor that supports queries based on inter-region relations. The key idea of our descriptor is to encode region-to-region relations as the spatial distribution of point-to-region relationships between two image regions. RAID allows sketch-based retrieval and requires minimal training data, thus making it suited even for querying uncommon relations. We evaluate the proposed descriptor by querying large image databases and successfully retrieve nontrivial images demonstrating complex inter-region relations, which are easily missed or erroneously classified by existing methods. We assess the robustness of RAID on multiple datasets even when the region segmentation is computed automatically or is very noisy.
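To make the key idea concrete, the following is a minimal sketch (not the paper's exact formulation): it approximates each point-to-region relationship by a directional occupancy histogram of the target region around a sample point inside the source region, and it captures the spatial distribution of these relationships by averaging them over a coarse grid on the source region's bounding box. The function names, the grid resolution, and the histogram approximation are illustrative assumptions, not the published method.

import numpy as np

def point_to_region_relation(x, y, target_mask, n_dirs=8):
    # Crude point-to-region relationship: a directional occupancy
    # histogram of the target region's pixels as seen from point (x, y).
    ty, tx = np.nonzero(target_mask)
    if tx.size == 0:
        return np.zeros(n_dirs)
    ang = np.arctan2(ty - y, tx - x)                  # angles in [-pi, pi]
    bins = ((ang + np.pi) / (2 * np.pi) * n_dirs).astype(int) % n_dirs
    hist = np.bincount(bins, minlength=n_dirs).astype(float)
    return hist / hist.sum()

def raid_like_descriptor(source_mask, target_mask, grid=4, n_dirs=8,
                         samples_per_cell=16, seed=0):
    # Encode the region-to-region relation as the spatial distribution of
    # point-to-region relationships: sample points inside the source
    # region, bin them into a grid x grid layout over the source bounding
    # box, and average the per-point relation histograms in each cell.
    rng = np.random.default_rng(seed)
    sy, sx = np.nonzero(source_mask)
    x0, x1 = sx.min(), sx.max() + 1
    y0, y1 = sy.min(), sy.max() + 1
    desc = np.zeros((grid, grid, n_dirs))
    counts = np.zeros((grid, grid))
    # Subsample source pixels to keep this sketch fast.
    n_samples = min(sx.size, grid * grid * samples_per_cell)
    idx = rng.choice(sx.size, size=n_samples, replace=False)
    for x, y in zip(sx[idx], sy[idx]):
        gx = min(int((x - x0) / (x1 - x0) * grid), grid - 1)
        gy = min(int((y - y0) / (y1 - y0) * grid), grid - 1)
        desc[gy, gx] += point_to_region_relation(x, y, target_mask, n_dirs)
        counts[gy, gx] += 1
    nonempty = counts > 0
    desc[nonempty] /= counts[nonempty][:, None]
    return desc.ravel()    # compare descriptors with, e.g., L2 distance

Given two binary region masks (a sketched or segmented source and target), descriptors computed this way can be compared with an L2 distance; under these simplifying assumptions, pairs exhibiting the same relation (e.g. 'riding') should score closer than unrelated pairs.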


This project has been covered in phys.org and was highlighted in the ACM TechNews.

Bibtex
@article{GuerreroEtAl:RAID:2016,
  title    = {{RAID}: A Relation-Augmented Image Descriptor},
  author   = {Paul Guerrero and Niloy J. Mitra and Peter Wonka},
  year     = {2016},
  journal  = {ACM Trans. Graph.},
  volume   = {35},
  number   = {4},
  issn     = {0730-0301},
  pages    = {46:1--46:12},
  numpages = {12},
  doi      = {10.1145/2897824.2925939},
}
Acknowledgements

We thank the participants of our user study, the anonymous reviewers for their comments and constructive suggestions, and Shuai Zheng for giving us early access to the semantic segmentation code. The research described here was supported by the Office of Sponsored Research (OSR) under Award No. OCRF-2014-CGR3-62140401, the Visual Computing Center at KAUST, ERC Starting Grant SmartGeometry (StG-2013-335373), Marie Curie CIG 303541, and the Open3D Project (EPSRC Grant EP/M013685/1).

Links

Paper (14.4MB)

Low-res Paper (1.95MB)

Supplementary Material (6.99MB)

Data (20.9GB)

Code (Bitbucket)

Slides (165MB)