Automatic Semantic Modeling of Indoor Scenes from Low-quality RGB-D Data using Contextual Information

Kang Chen¹, Yu-Kun Lai², Yu-Xin Wu¹, Ralph Martin², Shi-Min Hu¹
¹Tsinghua University, Beijing, China
²Cardiff University

Abstract: We present a novel solution to automatic semantic modeling of indoor scenes from a sparse set of low-quality RGB-D images. We exploit the knowledge in a scene database containing hundreds of indoor scenes and over 10,000 objects represented as manually segmented and labeled mesh models. Within a few seconds, we output a visually plausible 3D scene, based on models and parts from the database adapted to fit the input scans. Using low-quality RGB-D data is challenging due to noise, low resolution, occlusion and missing depth information. Contextual relationships learned from the 3D database are used to constrain reconstruction, ensuring semantic compatibility between both object models and parts. Small objects and objects with incomplete depth information are difficult to recover reliably. We address this with a two-stage approach: major objects are recognized first to provide a known scene structure, and 2D contour-based model retrieval is then applied to recover the smaller objects. An evaluation on our own data and two public RGB-D datasets shows that our approach can model typical real-world indoor scenes efficiently and robustly.

Paper: [PDF 36M].
Supplemental Document: [PDF 57M].
Presentation: [PPTX 6M].
Test RGB-D Data: [RAR 77M].

BibTeX:
@article{Chen:2014:ASM:2661229.2661239,
author = {Chen, Kang and Lai, Yu-Kun and Wu, Yu-Xin and Martin, Ralph and Hu, Shi-Min},
title = {Automatic Semantic Modeling of Indoor Scenes from Low-quality RGB-D Data Using Contextual Information},
journal = {ACM Trans. Graph.},
issue_date = {November 2014},
volume = {33},
number = {6},
month = nov,
year = {2014},
issn = {0730-0301},
pages = {208:1--208:12},
articleno = {208},
numpages = {12},
url = {},
doi = {10.1145/2661229.2661239},
acmid = {2661239},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {3D scenes, indoor scenes, model retrieval, part assembly, semantic modeling},