Collaborative Visual SLAM using Compressed Feature Exchange

IEEE Robotics and Automation Letters
Preprint, LMT, Munich, Germany, January 2019

Abstract:

In the field of robotics, collaborative Simultaneous Localization and Mapping (SLAM) is still a challenging problem. The exploration of large-scale unknown environments benefits from sharing the work among multiple agents that may be equipped with different abilities, such as aerial or ground-based vehicles. In this letter, we specifically address data efficiency for the exchange of visual information in a collaborative visual SLAM setup. For efficient data exchange, we extend a compression scheme for local binary features by two additional modes, providing support for local features with additional depth information and an inter-view coding mode that exploits the spatial relations between the views of a stereo camera system. To demonstrate the coding framework, we use a centralized system architecture based on ORB-SLAM2, in which energy-constrained agents extract local binary features and send a compressed version over a network to a more powerful agent that is capable of running several visual SLAM instances in parallel. We exploit the information from other agents by detecting overlap between already mapped areas and subsequently merging the maps. From then on, the participants contribute to a joint representation and benefit from the shared map information. We show a data-rate reduction of 70.8 % for the feature compression and a reduction of the absolute trajectory error by 53.7 % for the collaborative mapping strategy on the well-known KITTI dataset. For the benefit of the community, we provide a public version of the source code.

Scenario: heterogeneous robot team

We propose to use the information from nearby agents by detecting overlap between already mapped areas and subsequently merging the maps. From then on, the participants contribute to a joint global map representation and benefit from the shared map information. To this end, we extend a compression scheme for local binary features by exploiting the spatial relations between the views of a stereo camera system and implement a collaborative mapping scheme. We demonstrate the effectiveness of both the feature compression approach and the collaborative mapping strategy on the well-known KITTI dataset.
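To make the centralized data flow concrete, the following C++ sketch outlines how a server-side process could dispatch compressed frames to per-agent SLAM instances and merge maps once overlap is detected. All names (CompressedFrame, FeatureDecoder, SlamInstance, ProcessFrame) and interfaces are illustrative placeholders and are not taken from the released ORB-SLAM2-based implementation.

#include <cstdint>
#include <map>
#include <memory>
#include <vector>

// One compressed frame as received from an agent over the network.
struct CompressedFrame {
    int                       agent_id;   // which robot sent the frame
    double                    timestamp;  // capture time in seconds
    std::vector<std::uint8_t> bitstream;  // encoded binary features (incl. depth / stereo modes)
};

// Placeholder for a decoded feature set (keypoints, 256-bit descriptors, depth values).
struct DecodedFeatures {};

// Hypothetical decoder and SLAM interfaces standing in for the real components.
class FeatureDecoder {
public:
    DecodedFeatures Decode(const std::vector<std::uint8_t>& bitstream) {
        (void)bitstream;            // entropy decoding of the feature bitstream goes here
        return DecodedFeatures{};
    }
};

class SlamInstance {
public:
    void TrackFrame(const DecodedFeatures&, double) {}                    // tracking + local mapping
    bool DetectsOverlapWith(const SlamInstance&) const { return false; }  // place recognition
    void MergeInto(SlamInstance&) {}                                      // fuse keyframes and map points
};

// Server loop body: one SLAM instance per agent, maps merged when overlap is found.
void ProcessFrame(const CompressedFrame& msg,
                  std::map<int, FeatureDecoder>& decoders,
                  std::map<int, std::unique_ptr<SlamInstance>>& slam) {
    if (slam.count(msg.agent_id) == 0) {               // first frame of a new agent
        decoders.emplace(msg.agent_id, FeatureDecoder{});
        slam.emplace(msg.agent_id, std::make_unique<SlamInstance>());
    }
    const DecodedFeatures features = decoders.at(msg.agent_id).Decode(msg.bitstream);
    slam.at(msg.agent_id)->TrackFrame(features, msg.timestamp);

    // Check the other agents' maps for overlap and merge into a joint representation.
    for (auto& entry : slam) {
        if (entry.first != msg.agent_id &&
            slam.at(msg.agent_id)->DetectsOverlapWith(*entry.second)) {
            entry.second->MergeInto(*slam.at(msg.agent_id));
        }
    }
}

In a real system, overlap detection would typically rely on place recognition (e.g., bag-of-words matching) against the keyframes of the other agents' maps.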

Approach:

The approach is summarized in the following:

Our system is based on ORB-SLAM2 [2], and we evaluate our approach on the KITTI [3] and EuRoC [4] datasets. Our feature coding framework is based on [1] and adds support for depth values as well as a stereo (inter-view) coding mode.
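The configurations evaluated in the supplementary material below combine several per-feature coding modes (denoted I, M, P, S, and D in the figures and tables). As a rough illustration of how a per-feature mode decision can be implemented in such a codec, consider the following C++ sketch; the mode set, the toy cost model, and all names are assumptions made for illustration and do not reproduce the exact scheme of [1] or our extension.

#include <algorithm>
#include <array>
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <utility>

using Descriptor = std::bitset<256>;   // one 256-bit ORB descriptor

enum class Mode { Intra, InterTemporal, InterView };

// Toy cost model: number of bits needed to signal one descriptor in a given mode.
// A real codec uses entropy coding; the constants here are made up for illustration.
std::size_t CodingCost(Mode m, const Descriptor& d, const Descriptor* reference) {
    switch (m) {
        case Mode::Intra:
            return 256;                                // send the raw descriptor bits
        case Mode::InterTemporal:                      // predict from a temporal reference frame
        case Mode::InterView:                          // predict from the other stereo view
            if (reference == nullptr) return SIZE_MAX; // mode unavailable without a reference
            return (d ^ *reference).count() * 9 + 16;  // cost grows with the prediction residual
    }
    return SIZE_MAX;
}

// Pick the cheapest available mode for one feature.
Mode SelectMode(const Descriptor& d,
                const Descriptor* temporal_ref,
                const Descriptor* stereo_ref) {
    const std::array<std::pair<Mode, std::size_t>, 3> costs = {{
        {Mode::Intra,         CodingCost(Mode::Intra,         d, nullptr)},
        {Mode::InterTemporal, CodingCost(Mode::InterTemporal, d, temporal_ref)},
        {Mode::InterView,     CodingCost(Mode::InterView,     d, stereo_ref)},
    }};
    return std::min_element(costs.begin(), costs.end(),
                            [](const auto& a, const auto& b) { return a.second < b.second; })
        ->first;
}

The intuition is that a predictive mode only needs to transmit a residual with respect to a reference descriptor, which is why the inter modes reduce the average bits per feature in the tables below.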

Overview: proposed improved binary feature coding framework

Video:

Supplementary Material:

Here, we show additional material that is not included in the paper. First, we show the resulting trajectories of our experiment in which KITTI 00 is split into three disjoint parts. The ground truth is shown alongside the trajectory estimated by our collaborative approach; a short sketch of how the trajectory error can be computed from such trajectory pairs follows after the figures.

KITTI 00-1
KITTI 00-2
KITTI 00-3
Collaborative map
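For reference, trajectory accuracy is commonly reported as the root-mean-square absolute trajectory error (ATE) after aligning the estimate to the ground truth. The following Eigen-based C++ sketch shows one way to compute it; it is not the evaluation script used for the paper.

#include <Eigen/Dense>
#include <Eigen/Geometry>
#include <cmath>

// RMSE of the absolute trajectory error (ATE) between an estimated trajectory and the
// ground truth, both given as 3xN matrices of already-associated positions.
double RmseAte(const Eigen::Matrix3Xd& estimate, const Eigen::Matrix3Xd& ground_truth) {
    // Rigidly align the estimate to the ground truth (Umeyama / Horn alignment).
    const Eigen::Matrix4d T = Eigen::umeyama(estimate, ground_truth, /*with_scaling=*/false);
    const Eigen::Matrix3Xd aligned =
        (T.topLeftCorner<3, 3>() * estimate).colwise() + T.topRightCorner<3, 1>();
    // Root mean square of the per-pose translational differences.
    return std::sqrt((aligned - ground_truth).colwise().squaredNorm().mean());
}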

Next, we show the evaluation of our feature coding framework on the EuRoC V101 sequence. First, we evaluate the sequence with the I+S+M+P configuration and one reference frame for the P mode. For the left view, the cost of coding the depth information is included, as used for the monocular version.

a) Bits per coding mode for the left view
b) Bits per coding mode for the right view
c) Percentage of used coding modes for the left view
d) Percentage of used coding modes for the right view

Next, we evaluate the sequence with the I+S+M+P configuration and four reference frames for the P mode. Again, the depth values are additionally included for the left view.

a) Bits per coding mode for the left view
b) Bits per coding mode for the right view
c) Percentage of used coding modes for the left view
d) Percentage of used coding modes for the right view

Next, timings are provided for the EuRoC V101 sequence:

Stereo             I        I+M      I+P1+S   I+P1+S+M   I+P4+S+M
ORB [ms]           14.5     (feature extraction, independent of the coding configuration)
encoding [ms]      11.5     12.8     15.6     16.4       26.3
decoding [ms]      12.6     13.3     13.8     13.9       13.5
bits/feature       224.5    208.9    170.4    165.3      151.5
#features          2x1.2k   2x1.2k   2x1.2k   2x1.2k     2x1.2k

Next, we present the results for the Mono+Depth configuration:

Mono+Depth         I+D      I+P1+S+D   I+P4+S+D
encoding [ms]      7.7      9.8        13.4
decoding [ms]      8.6      8.9        8.9
bits/feature       230.1    175.8      160.1
#features          1.2k     1.2k       1.2k

When changing from a single reference frame to four reference frames, the inter-coding mode is used more frequently (compare figures c and d), and the encoding time increases from 16.4 ms to 26.3 ms. The measurements were performed on an Intel i7-7700 CPU at 3.60 GHz with 1200 features per frame (ORB-SLAM2 default settings).
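To relate the bits-per-feature numbers to an actual transmission rate, a small back-of-the-envelope computation can be used, as in the C++ snippet below. Note that the 20 Hz camera rate of the EuRoC sequences is an assumption that is not part of the tables above.

#include <cstdio>

int main() {
    // Numbers taken from the stereo table above (I+P4+S+M configuration). The 20 Hz
    // camera rate of the EuRoC sequences is an assumption, not part of the table.
    const double bits_per_feature   = 151.5;
    const double features_per_frame = 2 * 1200;   // 2 x 1.2k features (left + right view)
    const double frames_per_second  = 20.0;

    const double kbit_per_second =
        bits_per_feature * features_per_frame * frames_per_second / 1000.0;
    std::printf("approx. %.0f kbit/s\n", kbit_per_second);   // prints approx. 7272 kbit/s
    return 0;
}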

Citation:

If you refer to our collaborative visual SLAM or feature compression approach in an academic work, please cite:

@article{VanOpdenbosch2019,
author = {{Van Opdenbosch}, Dominik and Steinbach, Eckehard},
journal = {IEEE Robotics and Automation Letters (RAL)},
number = {1},
pages = {57--64},
title = {{Collaborative Visual SLAM using Compressed Feature Exchange}},
volume = {4},
year = {2019}
}

The preprint is available online from IEEE.

Source Code:

The source code for the collaborative visual SLAM system can be found here.
The binary feature compression pipeline can be found here.

Please contact us via dominik (dot) van-opdenbosch [at] tum (dot) de using “RAL2018” as subject.

Known Issues:

In the published paper, the two images in Figure 5 are swapped.

References:

  1. D. Van Opdenbosch, M. Oelsch, A. Garcea, T. Aykut, E. Steinbach, Selection and Compression of Local Binary Features for Remote Visual SLAM, IEEE International Conference on Robotics and Automation (ICRA), 2018
  2. R. Mur-Artal and J. D. Tardós, ORB-SLAM2: an Open-Source SLAM System for Monocular, Stereo and RGB-D Cameras, IEEE Transactions on Robotics, vol. 33, no. 5, pp. 1255-1262, 2017
  3. A. Geiger, P. Lenz, R. Urtasun, Are we ready for autonomous driving? The KITTI vision benchmark suite, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3354-3361, 2012
  4. M. Burri, J. Nikolic, P. Gohl, T. Schneider, J. Rehder, S. Omari, M. Achtelik and R. Siegwart, The EuRoC micro aerial vehicle datasets, The International Journal of Robotics Research, vol. 35, no. 10, pp. 1157-1163, 2016
