3D Semantic Label Transfer in Human-Robot Collaboration

20 July 2021

We tackle two practical problems in robotic scene understanding: first, the computational requirements of current semantic segmentation algorithms are prohibitive for typical robots; and second, the viewpoints of ground robots differ substantially from the human viewpoints found in typical training datasets, so objects seen from a robot's perspective are often misclassified. We present a modular 3D semantic reconstruction pipeline for multiple agents. We first create a sparse point cloud map for localizing and aligning agents into the same global coordinate system. Next, we create a 3D dense semantic map of the space from human viewpoints. Finally, by re-rendering semantic labels (or depth maps) from the ground robots' own estimated viewpoints and sharing them over the network, we can give semantic understanding (or depth perception) to simple monocular agents.
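As a rough illustration of the final step, the sketch below shows how semantic labels stored in a shared 3D map could be re-rendered into a monocular agent's view: the labelled points are transformed into the robot's estimated camera frame, projected through a pinhole model, and resolved per pixel with a painter's-algorithm depth ordering. This is only a minimal sketch under assumed inputs (a labelled point cloud, a 4x4 world-to-camera pose, and intrinsics `K`); all names are hypothetical and it does not reproduce the paper's actual implementation.

```python
import numpy as np

def render_label_image(points_world, labels, T_cam_world, K, width, height):
    """Re-render semantic labels from a labelled 3D map into a new viewpoint.

    points_world : (N, 3) 3D points in the shared global map frame
    labels       : (N,)   integer semantic label per point
    T_cam_world  : (4, 4) rigid transform from world to the agent's camera frame
    K            : (3, 3) pinhole intrinsics of the receiving (monocular) agent
    Returns a (height, width) label image; -1 marks pixels with no projection.
    """
    # Transform map points into the requesting agent's camera frame.
    pts_h = np.hstack([points_world, np.ones((len(points_world), 1))])
    pts_cam = (T_cam_world @ pts_h.T).T[:, :3]

    # Keep only points in front of the camera.
    in_front = pts_cam[:, 2] > 0.1
    pts_cam, lbls = pts_cam[in_front], labels[in_front]

    # Pinhole projection to pixel coordinates.
    proj = (K @ pts_cam.T).T
    u = (proj[:, 0] / proj[:, 2]).astype(int)
    v = (proj[:, 1] / proj[:, 2]).astype(int)
    z = pts_cam[:, 2]

    # Discard points that fall outside the image.
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, z, lbls = u[inside], v[inside], z[inside], lbls[inside]

    # Painter's algorithm: write far points first so nearer points overwrite them.
    label_img = np.full((height, width), -1, dtype=np.int32)
    order = np.argsort(-z)
    label_img[v[order], u[order]] = lbls[order]
    return label_img
```

In this sketch the heavy work (building and labelling the map) is assumed to happen on a better-equipped agent; the monocular robot only needs to send its estimated pose and receive the rendered label image over the network.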