Decision-Making in Autonomous Driving using Reinforcement Learning
Doctoral thesis, 2021

The main topic of this thesis is tactical decision-making for autonomous driving. An autonomous vehicle must be able to handle a diverse set of environments and traffic situations, which makes it hard to manually specify a suitable behavior for every possible scenario. Therefore, learning-based strategies are considered in this thesis, which introduces different approaches based on reinforcement learning (RL).

A general decision-making agent, derived from the Deep Q-Network (DQN) algorithm, is proposed. With few modifications, this method can be applied to different driving environments, which is demonstrated for various simulated highway and intersection scenarios. A more sample-efficient agent can be obtained by incorporating more domain knowledge, which is explored by combining planning and learning in the form of Monte Carlo tree search and RL. In different highway scenarios, the combined method outperforms either a planning-based or a learning-based strategy used separately, while requiring an order of magnitude fewer training samples than the DQN method.
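As a rough illustration of the learning rule underlying DQN-style agents, the sketch below computes the standard one-step Q-learning target and picks a greedy action. It is not taken from the thesis: the function names, the discount factor, and the three example Q-values are hypothetical, and a real agent would obtain the Q-values from a neural network.

```python
def td_target(reward, next_q_values, discount, terminal):
    """One-step Q-learning target: r + discount * max_a' Q(s', a').

    For a terminal transition there is no next state, so the target
    is just the immediate reward.
    """
    if terminal:
        return reward
    return reward + discount * max(next_q_values)


def greedy_action(q_values):
    """Index of the action with the highest estimated Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Hypothetical Q-values for three tactical actions in the next state.
example_next_q = [0.5, 2.0, 1.2]
example_target = td_target(reward=1.0, next_q_values=example_next_q,
                           discount=0.9, terminal=False)
example_action = greedy_action(example_next_q)
```

During training, the network's estimate for the action taken is regressed toward this target.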

A drawback of many learning-based approaches is that they create black-box solutions, which do not indicate the confidence of the agent's decisions. Therefore, the Ensemble Quantile Networks (EQN) method is introduced, which combines distributional RL with an ensemble approach, to provide an estimate of both the aleatoric and the epistemic uncertainty of each decision. The results show that the EQN method can balance risk and time efficiency in different occluded intersection scenarios, while also identifying situations that the agent has not been trained for. Thereby, the agent can avoid making unfounded, potentially dangerous, decisions outside of the training distribution.
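One way to picture the two uncertainty estimates is sketched below, assuming each ensemble member outputs a set of quantiles of the return for every action. The spread across quantiles is used as a proxy for aleatoric uncertainty and the disagreement between ensemble members as a proxy for epistemic uncertainty. The function names, the variance-based measures, the thresholds, and the fallback action are illustrative assumptions, not the exact criteria used in the thesis.

```python
from statistics import fmean, pvariance


def uncertainty_estimates(quantiles):
    """Return (q_values, aleatoric, epistemic), one entry per action.

    quantiles[k][i][a] is the i-th return quantile that ensemble
    member k predicts for action a (hypothetical data layout).
    """
    n_members = len(quantiles)
    n_quantiles = len(quantiles[0])
    n_actions = len(quantiles[0][0])
    q_values, aleatoric, epistemic = [], [], []
    for a in range(n_actions):
        # Expected return according to each ensemble member.
        member_means = [
            fmean(quantiles[k][i][a] for i in range(n_quantiles))
            for k in range(n_members)
        ]
        # Ensemble-averaged value of each quantile.
        quantile_means = [
            fmean(quantiles[k][i][a] for k in range(n_members))
            for i in range(n_quantiles)
        ]
        q_values.append(fmean(member_means))
        # Spread of the return distribution: aleatoric proxy.
        aleatoric.append(pvariance(quantile_means))
        # Disagreement between members: epistemic proxy.
        epistemic.append(pvariance(member_means))
    return q_values, aleatoric, epistemic


def choose_action(quantiles, max_aleatoric, max_epistemic, fallback_action):
    """Best action among those below both thresholds, else the fallback."""
    q_values, aleatoric, epistemic = uncertainty_estimates(quantiles)
    allowed = [
        a for a in range(len(q_values))
        if aleatoric[a] <= max_aleatoric and epistemic[a] <= max_epistemic
    ]
    if not allowed:
        return fallback_action
    return max(allowed, key=lambda a: q_values[a])


# Two members, two quantiles, two actions. The members agree on
# action 0 but disagree strongly on action 1.
example_quantiles = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[1.0, 10.0], [1.0, 10.0]],
]
```

With a tight epistemic threshold, `choose_action` prefers action 0 even though action 1 has the higher mean estimate, because the members disagree about action 1.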

Finally, this thesis introduces a neural network architecture that is invariant to permutations of the order in which surrounding vehicles are listed. This architecture improves the sample efficiency of the agent by a factor equal to the factorial of the number of surrounding vehicles.
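The idea of permutation invariance can be illustrated with a generic construction: apply the same encoder to every surrounding vehicle and pool the results with an order-independent operation such as an element-wise maximum. The sketch below is an assumption-laden toy, not the architecture from the thesis; the weights, the two-feature vehicle description (relative position and relative speed), and the function names are all hypothetical.

```python
from itertools import permutations
from math import factorial


def encode_vehicle(vehicle, weights):
    """Shared per-vehicle encoder: a single ReLU layer with fixed weights."""
    return [max(0.0, sum(w * x for w, x in zip(row, vehicle)))
            for row in weights]


def encode_scene(vehicles, weights):
    """Element-wise max over per-vehicle features.

    The max does not depend on the order of its inputs, so the scene
    encoding is the same for every ordering of the vehicle list.
    """
    features = [encode_vehicle(v, weights) for v in vehicles]
    return [max(column) for column in zip(*features)]


# Hypothetical weights (3 features) and vehicles (position, speed).
WEIGHTS = [[0.5, -1.0], [1.5, 0.25], [-0.75, 2.0]]
VEHICLES = [(10.0, 2.0), (-5.0, 1.0), (30.0, -3.0)]

# Encode the scene under every possible ordering of the vehicles.
outputs = {tuple(encode_scene(list(p), WEIGHTS))
           for p in permutations(VEHICLES)}
n_orderings = factorial(len(VEHICLES))
```

All `n_orderings` orderings of the three vehicles collapse to a single encoding in `outputs`, which is the intuition behind the factorial gain: a network without this property would have to learn each ordering separately.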

Monte Carlo tree search

reinforcement learning

autonomous driving

tactical decision-making

aleatoric uncertainty

neural networks

epistemic uncertainty

Lecture room FB, Fysikgården 4, Chalmers
Opponent: Professor Ville Kyrki, Department of Electrical Engineering and Automation, Aalto University, Finland

Author

Carl-Johan E Hoel

Chalmers, Mechanics and Maritime Sciences, Vehicle Engineering and Autonomous Systems

Hoel, C. J., Wolff, K., Laine, L. Ensemble Quantile Networks: Uncertainty-Aware Reinforcement Learning with Applications in Autonomous Driving

Tactical Decision-Making in Autonomous Driving by Reinforcement Learning with Uncertainty Estimation

IEEE Intelligent Vehicles Symposium, Proceedings (2020), p. 1563-1569

Paper in proceedings

Combining Planning and Deep Reinforcement Learning in Tactical Decision Making for Autonomous Driving

IEEE Transactions on Intelligent Vehicles, Vol. 5 (2019), p. 294-305

Journal article

Reinforcement Learning with Uncertainty Estimation for Tactical Decision-Making in Intersections

IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC (2020)

Paper in proceedings

Automated Speed and Lane Change Decision Making using Deep Reinforcement Learning

IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC (2018), p. 2148-2155

Paper in proceedings

An Evolutionary Approach to General-Purpose Automated Speed and Lane Change Behavior

Proceedings of the 16th IEEE International Conference on Machine Learning and Applications (ICMLA) (2017)

Paper in proceedings

Every year, over one million people are killed in traffic-related accidents. Traffic safety could be significantly improved by using artificial intelligence to replace human drivers. Furthermore, the productivity of commercial vehicles could be increased when fewer human drivers are required. An artificial driver must be able to handle a diverse set of environments and traffic situations, which makes it hard to manually specify a suitable behavior for every possible scenario. Therefore, this thesis considers an approach called reinforcement learning (RL), in which the artificial driver instead learns how to behave by exploring the world through trial and error. By rewarding desired behaviors and punishing undesired behaviors, the artificial driver can learn how to make correct decisions.

The results of the thesis show that the introduced RL-based methods can be used to teach an artificial driver how to behave in different simulated highway and intersection scenarios. The results also show that if the driver is provided with a simple model of the traffic scenario, it can learn a suitable behavior faster. Furthermore, a method that allows the driver to estimate how confident it is about its decisions is introduced. If the artificial driver encounters a situation that it has not seen before, such as a wild animal on the road, the driver can identify that it is uncertain about what to do and instead act in a precautionary way, to minimize the risk of an accident.

Areas of Advance

Transport

Subject categories

Computer Vision and Robotics (Autonomous Systems)

ISBN

978-91-7905-584-4

Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 5051

Publisher

Chalmers tekniska högskola

More information

Last updated

2021-10-15