Learning Rich Features from RGB-D Images for Object Detection and Segmentation

Saurabh Gupta; Ross Girshick; Pablo Arbeláez; Jitendra Malik

doi:10.48550/arxiv.1407.5736

dc.contributor.author	Saurabh Gupta
dc.contributor.author	Ross Girshick
dc.contributor.author	Pablo Arbeláez
dc.contributor.author	Jitendra Malik
dc.coverage.spatial	Bolivia
dc.date.accessioned	2026-03-22T20:43:17Z
dc.date.available	2026-03-22T20:43:17Z
dc.date.issued	2014
dc.description	Citations: 4
dc.description.abstract	In this paper we study the problem of object detection for RGB-D images using semantically rich image and depth features. We propose a new geocentric embedding for depth images that encodes height above ground and angle with gravity for each pixel in addition to the horizontal disparity. We demonstrate that this geocentric embedding works better than using raw depth images for learning feature representations with convolutional neural networks. Our final object detection system achieves an average precision of 37.3%, which is a 56% relative improvement over existing methods. We then focus on the task of instance segmentation where we label pixels belonging to object instances found by our detector. For this task, we propose a decision forest approach that classifies pixels in the detection window as foreground or background using a family of unary and binary tests that query shape and geocentric pose features. Finally, we use the output from our object detectors in an existing superpixel classification framework for semantic scene segmentation and achieve a 24% relative improvement over current state-of-the-art for the object categories that we study. We believe advances such as those represented in this paper will facilitate the use of perception in fields like robotics.
dc.identifier.doi	10.48550/arxiv.1407.5736
dc.identifier.uri	https://doi.org/10.48550/arxiv.1407.5736
dc.identifier.uri	https://andeanlibrary.org/handle/123456789/83680
dc.language.iso	en
dc.publisher	Cornell University
dc.relation.ispartof	arXiv (Cornell University)
dc.source	University of California System
dc.subject	Artificial intelligence
dc.subject	Computer vision
dc.subject	Computer science
dc.subject	Object detection
dc.subject	Pixel
dc.subject	Segmentation
dc.subject	Convolutional neural network
dc.subject	Embedding
dc.subject	Object (grammar)
dc.subject	Pattern recognition (psychology)
dc.title	Learning Rich Features from RGB-D Images for Object Detection and Segmentation
dc.type	preprint

Collections

Scientific Article (Preprint)