Semantic Mapping and Visual Navigation for Smart Robots
Research Project, 2016–2021
The goal of this project is to develop an integrated framework for the next generation of autonomous vehicles capable of seeing, navigating and mapping, based on computer vision and optimal control techniques. Examining today’s best-performing systems for visual processing reveals that problems such as object recognition and 3D scene reconstruction have largely been studied independently, and that there is no single, integrated modelling framework. One example can be found in multiple view geometry, where a major success story in recent years has been the ability to automatically reconstruct large-scale 3D models from collections of 2D images. However, the approach is based on purely geometric concepts, is mostly passive, and makes no use of semantic scene understanding. The limitations are apparent: certain scene elements cannot be reconstructed because the geometry is under-constrained, mid-level gestalts and category-specific priors cannot easily be leveraged, and the model ultimately provides a point cloud and a texture map rather than a semantic representation that enables effective navigation or interaction. A virtuous modelling circle between reconstruction and recognition, enforced and enhanced by active exploration, is not in place. We recognise that, in order to reach further in both theory and practice, it is necessary to develop perceptual systems based on sound mathematical models that can integrate the different facets of perception and action in a collaborative manner. In this truly interdisciplinary project, we rely on expertise from computer vision, machine learning, automatic control and optimization to take the current state of the art in autonomous systems to the next level of perception, cognition and navigation, and towards the key capabilities robots need to act effectively in the real world.
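To make the contrast concrete, the following is a minimal, illustrative sketch (not project code) of the purely geometric pipeline described above: given matched image points from two views and assumed pinhole intrinsics, OpenCV estimates the relative pose and triangulates a 3D point cloud, and the output carries no labels or semantics.

```python
# Minimal sketch of purely geometric two-view reconstruction.
# The intrinsics K and the point matches are placeholders; in practice
# they come from calibration and feature matching on real images.
import numpy as np
import cv2

K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])          # assumed pinhole intrinsics

def two_view_point_cloud(pts1, pts2):
    """Triangulate matched pixel coordinates (Nx2 float arrays) from two views."""
    # Robustly estimate the essential matrix and recover the relative pose.
    E, _ = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)

    # Projection matrices, with the first camera at the origin.
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])

    # Linear triangulation; the result is homogeneous, so divide by the 4th row.
    X_h = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
    X = (X_h[:3] / X_h[3]).T             # Nx3 point cloud

    # The output is geometry only -- no object labels, no semantics.
    return X
```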
The project focuses on models and algorithms for smart systems and for systems of systems. The system-of-systems approach will integrate different aspects of scene understanding, ensure compliance and consistency between components through end-to-end refinement, and achieve a high level of robustness and recovery from failures. By integrating components from computer vision, machine learning and optimal control, we will be able to develop perceptual robotic systems that can semantically map, navigate and interact in an unknown environment (see Fig. 1).
For demonstration, we will develop an autonomous system for the visual inventory inspection of a supermarket using small-scale, low-cost quadcopters. The system will provide a complete solution for visual navigation and 3D mapping in which not only the scene geometry is modelled but semantic constraints are also integrated. Scientific challenges include recognising many object classes by learning deep structured models with geometry and semantics, as well as intelligent obstacle avoidance within a general navigation paradigm in which tasks such as uncertainty reduction and semantic visual localization are handled jointly within model predictive control frameworks for trajectory optimization. The demonstrator is interesting in its own right, but such a system is only a low-cost test bed for verifying the research methodology developed in the project. In turn, the advances made in the project will be generally applicable to perceptual robotic systems for autonomous vehicles, humanoid robots or flexible inspection.
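As a hedged illustration of how model predictive control can be used for trajectory optimization with obstacle avoidance, the sketch below replaces the quadcopter with a planar point mass; the horizon, weights, goal and obstacle are made-up example values, not the project’s actual formulation.

```python
# Toy receding-horizon (MPC) trajectory optimization for a planar point mass.
# All constants below are illustrative placeholders.
import numpy as np
from scipy.optimize import minimize

DT, HORIZON = 0.2, 15                          # time step [s] and MPC horizon
GOAL = np.array([5.0, 5.0])                    # target position (assumed)
OBSTACLE, SAFE_R = np.array([2.5, 2.5]), 0.8   # obstacle centre and safety radius

def rollout(state, controls):
    """Integrate double-integrator dynamics over the horizon."""
    pos, vel = state[:2].copy(), state[2:].copy()
    positions = []
    for a in controls.reshape(HORIZON, 2):
        vel = vel + DT * a
        pos = pos + DT * vel
        positions.append(pos.copy())
    return np.array(positions)

def cost(controls, state):
    positions = rollout(state, controls)
    goal_cost = np.sum((positions - GOAL) ** 2)           # reach the goal
    effort = 0.1 * np.sum(controls ** 2)                  # keep controls smooth
    dists = np.linalg.norm(positions - OBSTACLE, axis=1)
    obstacle_cost = 100.0 * np.sum(np.maximum(0.0, SAFE_R - dists) ** 2)
    return goal_cost + effort + obstacle_cost

def mpc_step(state, warm_start=None):
    """Solve one receding-horizon problem and return the first control action."""
    u0 = warm_start if warm_start is not None else np.zeros(2 * HORIZON)
    res = minimize(cost, u0, args=(state,), method="L-BFGS-B")
    u = res.x.reshape(HORIZON, 2)
    return u[0], res.x                          # apply only the first action

# Example: one closed-loop step starting from rest at the origin.
state = np.array([0.0, 0.0, 0.0, 0.0])          # [x, y, vx, vy]
u_first, plan = mpc_step(state)
```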
Participants
Fredrik Kahl (contact)
Imaging and Image Analysis
Carl Olsson
Imaging and Image Analysis
Carl Toft
Imaging and Image Analysis
Funding
Swedish Foundation for Strategic Research (SSF)
Project ID: RIT15-0038
Funding Chalmers’ participation during 2016–2021