Accelerating Vision-based Indoor Localization by Distributing Image Processing over Space and Time

11 November 2014

In a vision-based 3D indoor localization system, localizing the user's device at a high frame rate is important for supporting real-time augmented reality applications. However, vision-based 3D localization typically involves 2D keypoint detection and 2D-3D matching, processes that are in general too computationally intensive to run at a high frame rate (e.g., 30 fps) on commodity hardware such as laptops or smartphones. To reduce the per-frame computation time for 3D localization, we present a method that distributes the required computation over space and time: it splits the video frame region into multiple sub-blocks and processes one sub-block per frame in a rotating sequence. The proposed method is general enough to be applied to any keypoint detection and 2D-3D matching scheme. We apply the method in a prototype 3D indoor localization system and evaluate its performance in a 100-meter-long indoor building environment, using 5,200 video frames of 640×480 (VGA) size and a laptop with a 2.6 GHz CPU. With SIFT-based keypoint detection, our method reduces the average and maximum computation time per frame by factors of 10 and 7, respectively, with only a marginal increase in positioning error (e.g., 0.17 meters), raising the frame processing rate from 3.2 fps to 23.3 fps.
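The abstract does not give implementation details, but the rotating sub-block idea can be illustrated with a minimal sketch. The snippet below assumes an OpenCV SIFT detector and a 2×2 grid; the grid size, function name, and detector choice are illustrative assumptions, not the authors' actual implementation.

```python
import cv2

# Assumed grid size; the paper does not specify how many sub-blocks are used.
ROWS, COLS = 2, 2
sift = cv2.SIFT_create()  # any keypoint detector could be substituted here

def detect_subblock(frame, frame_idx):
    """Run keypoint detection on only the sub-block scheduled for this frame.

    The schedule rotates through the grid, so the full frame is covered
    once every ROWS * COLS frames, spreading the detection cost over time.
    """
    h, w = frame.shape[:2]
    idx = frame_idx % (ROWS * COLS)   # rotating sub-block schedule
    r, c = divmod(idx, COLS)
    y0, y1 = r * h // ROWS, (r + 1) * h // ROWS
    x0, x1 = c * w // COLS, (c + 1) * w // COLS
    roi = frame[y0:y1, x0:x1]
    keypoints, descriptors = sift.detectAndCompute(roi, None)
    # Shift keypoint coordinates back into full-frame coordinates so the
    # downstream 2D-3D matching stage sees a consistent image frame.
    for kp in keypoints:
        kp.pt = (kp.pt[0] + x0, kp.pt[1] + y0)
    return keypoints, descriptors
```

In such a scheme, the 2D-3D matching stage would combine the keypoints accumulated over the last ROWS * COLS frames; keypoints from older sub-blocks are slightly stale, which is consistent with the small positioning-error increase reported in the abstract.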