
RGB-D: Techniques and usages for Kinect-style depth cameras
An RGB-D Project

The RGB-D project is a joint research effort between Intel Labs Seattle and the University of Washington Department of Computer Science & Engineering. The goal of this project is to develop techniques that enable future use cases of depth cameras. Using the PrimeSense* depth cameras underlying the Kinect* technology, we've been working on areas ranging from 3D modeling of indoor environments and interactive projection systems to object recognition and robotic manipulation and interaction.

Below you will find a list of videos illustrating our work. More detailed technical background can be found in our research areas and at the UW Robotics and State Estimation Lab. Enjoy!

3D Modeling of Indoor Environments

Depth cameras provide a stream of color images along with per-pixel depth. They can be used to generate 3D maps of indoor environments. Here we show some 3D maps built in our lab. What you see is not the raw data collected by the camera, but a walk through the model generated by our mapping technique. The maps are not complete; they were generated simply by carrying a depth camera through the lab and aligning the data into a globally consistent model using statistical estimation techniques.
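The frame-to-frame alignment step can be sketched with a classic iterative closest point (ICP) loop. This is only a minimal illustration of the idea, not the project's actual estimator, which combines alignment with statistical estimation over color and depth; all function names here are our own.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rigid transform (Kabsch/SVD) mapping src points onto dst."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:      # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = c_dst - R @ c_src
    return R, t

def icp(src, dst, iters=30):
    """Align point cloud src to dst by alternating matching and alignment."""
    cur = src.copy()
    R_total, t_total = np.eye(3), np.zeros(3)
    for _ in range(iters):
        # nearest-neighbor correspondences (brute force, fine for a sketch)
        d = np.linalg.norm(cur[:, None, :] - dst[None, :, :], axis=2)
        matches = dst[d.argmin(axis=1)]
        R, t = best_rigid_transform(cur, matches)
        cur = cur @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total
```

Running this on consecutive depth frames yields the camera motion between them; chaining those motions produces the trajectory that the mapping system then refines globally.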

3D indoor models could be used to automatically generate architectural drawings, to allow virtual flythroughs for real estate, or to support remodeling and furniture shopping by inserting 3D furniture models into the map.

Interactive Flythrough

Here we show interactive navigation through a 3D model. The visualization can be done in stereoscopic 3D using shutter glasses (just like in the movie Avatar). The system uses a depth camera to control the navigation.

3D Mapping

This video demonstrates the mapping process. Shown is a top view of the 3D map generated by walking with the depth camera through the lab. The system automatically estimates the motion of the camera and detects loop closures, which help it to globally align the camera frames. No external information or sensor is used.
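The role of loop closures in global alignment can be sketched in one dimension: each detected closure is simply an extra relative constraint between two camera poses, and solving all constraints jointly spreads the accumulated drift over the trajectory. This toy version is translation-only and linear; the real system optimizes full 6-DoF poses, and the function name is illustrative.

```python
import numpy as np

def optimize_pose_graph(n_poses, edges):
    """Globally align 1-D poses from relative constraints (linear least squares).

    edges: (i, j, offset) constraints meaning pose_j - pose_i ~ offset.
    Loop closures are just extra edges; pose 0 is softly anchored at 0.
    """
    rows, rhs = [], []
    for i, j, offset in edges:
        r = np.zeros(n_poses)
        r[j] += 1.0
        r[i] -= 1.0
        rows.append(r)
        rhs.append(offset)
    anchor = np.zeros(n_poses)   # pin the first pose so the solution is unique
    anchor[0] = 1.0
    rows.append(anchor)
    rhs.append(0.0)
    x, *_ = np.linalg.lstsq(np.vstack(rows), np.array(rhs), rcond=None)
    return x
```

For example, four odometry edges of length 1.0 plus a loop closure saying the last pose coincides with the first produce poses [0, 0.2, 0.4, 0.6, 0.8]: the drift is distributed evenly rather than dumped at the end of the loop.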

Interactive Mapping

We want to enable novice users to build 3D maps with depth cameras. Here you see our interactive mapping system. The system processes the depth camera data in real time and warns the user when the collected data is not suitable for a map. The approach also suggests areas that have not yet been modeled appropriately.

Interactive Projection Systems (OASIS: Object Aware Situated Interaction System)

Depth cameras can be combined with micro-projectors to generate smart, interactive surfaces in many locations. OASIS is a software architecture that enables us to prototype applications that use depth cameras and underlying computer vision algorithms to recognize and track objects and gestures, combined with interactive projection. Here we show OASIS in a kitchen and Lego playing scenario.

Object recognition underlying OASIS

Depth cameras can also be used to enable computers to recognize objects. Our approach combines depth and color information to recognize different objects. Novel objects can be learned on the fly and recognized afterwards.
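The combine-and-compare idea can be sketched with a nearest-neighbor recognizer over concatenated color and depth features. This is a deliberately simple stand-in: the actual system uses much richer descriptors and learned distance functions (see the sparse distance learning paper below), and the class and feature choices here are hypothetical.

```python
import numpy as np

class RGBDRecognizer:
    """Nearest-neighbor recognition over concatenated color + depth features."""

    def __init__(self):
        self.features = []
        self.labels = []

    @staticmethod
    def describe(rgb, depth):
        # per-channel color histograms plus a depth histogram, concatenated
        color = [np.histogram(rgb[..., c], bins=8, range=(0, 256))[0]
                 for c in range(3)]
        shape = np.histogram(depth, bins=8, range=(0.0, 4.0))[0]
        f = np.concatenate(color + [shape]).astype(float)
        return f / (np.linalg.norm(f) + 1e-9)

    def train(self, rgb, depth, label):
        """Add a new object on the fly -- no retraining pass needed."""
        self.features.append(self.describe(rgb, depth))
        self.labels.append(label)

    def recognize(self, rgb, depth):
        f = self.describe(rgb, depth)
        dists = [np.linalg.norm(f - g) for g in self.features]
        return self.labels[int(np.argmin(dists))]
```

Because training just appends an exemplar, new objects become recognizable immediately, which mirrors the on-the-fly learning shown in the video.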

Robotic Manipulation and Interaction

Depth cameras have numerous applications in robotics, including 3D mapping, navigation, human-robot interaction, and object manipulation. So far, we've been focusing on using depth cameras for object modeling and manipulation. The 3D mapping techniques described above could be readily used on a mobile robot to generate maps of indoor environments.

Chess Playing Robot Gambit

Gambit is a small-scale manipulator that interacts with people; here, it challenges them to a game of chess. Gambit uses a depth camera to detect moves performed by a person and automatically executes and announces its own moves. It uses a camera in its gripper to learn and recognize individual chess pieces.

3D Object Modeling

Our robot Marvin grasps an unknown object and builds a 3D model of it. It automatically decides when and how to re-grasp the object in order to get a complete model. The goal of this research is to enable robots to autonomously explore objects in their environments and learn models of them.

This video shows how Marvin generates a 3D model of an object it is moving in its manipulator. From left to right: Color image of the depth camera, tracking of the manipulator, 3D model of object.
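Once the manipulator pose is tracked, each depth frame can be registered into a common object frame and fused. A toy sketch of that fusion step is an occupancy grid that accumulates registered views; the real system builds proper surface models, and the function name, voxel size, and grid extent here are all illustrative.

```python
import numpy as np

def fuse_into_voxels(clouds, voxel=0.01, size=64):
    """Fuse several registered point clouds into one occupancy-count grid.

    clouds: iterable of (N, 3) arrays already expressed in the object frame.
    """
    grid = np.zeros((size, size, size), dtype=np.int32)
    origin = -size * voxel / 2.0          # center the grid on the object
    for cloud in clouds:
        idx = np.floor((cloud - origin) / voxel).astype(int)
        inside = np.all((idx >= 0) & (idx < size), axis=1)
        for i, j, k in idx[inside]:
            grid[i, j, k] += 1            # count observations per voxel
    return grid
```

Voxels seen from several grasps accumulate evidence, while a re-grasp fills in the faces that were occluded by the gripper in earlier views.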

We hope you enjoyed some of these examples. Please come back; this page will be updated frequently.

Your RGB-D team.


Publications

"A Large-Scale Hierarchical Multi-View RGB-D Object Dataset," Kevin Lai, Liefeng Bo, Xiaofeng Ren, and Dieter Fox. IEEE International Conference on Robotics and Automation, 2011. [PDF]

"Sparse Distance Learning for Object Recognition Combining RGB and Depth Information," Kevin Lai, Liefeng Bo, Xiaofeng Ren, and Dieter Fox. IEEE International Conference on Robotics and Automation, 2011. [PDF]

"Toward Object Discovery and Modeling via 3-D Scene Comparison," Evan Herbst, Xiaofeng Ren, and Dieter Fox. IEEE International Conference on Robotics and Automation, 2011. [PDF]