Interpretability in Contact-Rich Manipulation via Kinodynamic Images
Paper in proceeding, 2021

Deep Neural Networks (NNs) have been widely utilized in contact-rich manipulation tasks to model the complicated contact dynamics. However, NN-based models are often difficult to decipher, which can lead to seemingly inexplicable behaviors and unidentifiable failure cases. In this work, we address the interpretability of NN-based models by introducing kinodynamic images. We propose a methodology that creates images from the kinematic and dynamic data of contact-rich manipulation tasks. By using images as the state representation, we enable the application of interpretability modules that were previously limited to vision-based tasks. We use this representation to train a Convolutional Neural Network (CNN) and extract interpretations with Grad-CAM to produce visual explanations. Our method is versatile and can be applied to any classification problem in manipulation tasks to visually interpret which parts of the input drive the model's decisions and to distinguish its failure modes, regardless of the features used. Our experiments demonstrate that our method enables detailed visual inspection of sequences in a task, as well as high-level evaluation of a model's behavior.
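
The abstract does not specify how the kinodynamic images are assembled, so the following is only a minimal sketch under stated assumptions: each kinematic or dynamic signal becomes one image row, each time step one column, and every row is min-max scaled to [0, 1]. The function names (`minmax_scale`, `kinodynamic_image`, `grad_cam`) and the sample values are hypothetical and are not taken from the paper. The `grad_cam` helper applies the standard Grad-CAM combination (channel weights from spatially averaged gradients, followed by a ReLU) to activations and gradients that are assumed to have been exported from a trained CNN.

```python
from typing import Dict, List, Sequence

Matrix = List[List[float]]


def minmax_scale(signal: Sequence[float]) -> List[float]:
    """Scale one signal to [0, 1]; a constant signal maps to all zeros."""
    lo, hi = min(signal), max(signal)
    span = hi - lo
    if span == 0:
        return [0.0] * len(signal)
    return [(value - lo) / span for value in signal]


def kinodynamic_image(channels: Dict[str, Sequence[float]]) -> Matrix:
    """Stack equally long signals into an image (ASSUMED layout:
    one row per signal, one column per time step)."""
    if len({len(signal) for signal in channels.values()}) != 1:
        raise ValueError("all signals must have the same number of samples")
    return [minmax_scale(signal) for signal in channels.values()]


def grad_cam(activations: List[Matrix], gradients: List[Matrix]) -> Matrix:
    """Standard Grad-CAM map from K feature maps and the gradients of the
    class score with respect to them (both shaped K x H x W)."""
    height, width = len(activations[0]), len(activations[0][0])
    heatmap = [[0.0] * width for _ in range(height)]
    for feature_map, gradient in zip(activations, gradients):
        # Channel weight: spatial average of this channel's gradient.
        weight = sum(sum(row) for row in gradient) / (height * width)
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += weight * feature_map[i][j]
    # ReLU keeps only the regions that support the predicted class.
    return [[max(0.0, value) for value in row] for row in heatmap]


# Hypothetical signals: one kinematic and one dynamic channel.
image = kinodynamic_image({
    "position": [0.0, 1.0, 2.0, 4.0],
    "force": [5.0, 5.0, 9.0, 7.0],
})

# Hypothetical activations and gradients for a 2-channel, 2x2 feature map.
heatmap = grad_cam(
    activations=[[[1.0, 2.0], [3.0, 4.0]], [[1.0, 0.0], [0.0, 1.0]]],
    gradients=[[[1.0, 1.0], [1.0, 1.0]], [[-4.0, -4.0], [-4.0, -4.0]]],
)
```

In practice the heatmap would be upsampled to the size of the input image and overlaid on it, so that high values can be read against specific signals (rows) and time steps (columns).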

Authors

Ioanna Mitsioni

Royal Institute of Technology (KTH)

Joonatan Mänttäri

Royal Institute of Technology (KTH)

Yiannis Karayiannidis

Chalmers, Electrical Engineering, Systems and control

John Folkesson

Royal Institute of Technology (KTH)

Danica Kragic

Royal Institute of Technology (KTH)

Proceedings - IEEE International Conference on Robotics and Automation

10504729 (ISSN)

Vol. 2021-May, pp. 10175-10181
9781728190778 (ISBN)

2021 IEEE International Conference on Robotics and Automation, ICRA 2021
Xi'an, China

Subject Categories (SSIF 2011)

Other Computer and Information Science

Computer Science

Computer Vision and Robotics (Autonomous Systems)

DOI

10.1109/ICRA48506.2021.9560920

More information

Latest update

3/14/2022