Robust AI

Towards developing trustworthy behavior for autonomous vehicles using reinforcement learning

The project explored the impediments encountered when developing a behavior planner using reinforcement learning and deploying it on the fortiss autonomous research vehicle. It also examined the applicability of the Deloitte Trustworthy AI Framework to reinforcement learning and identified necessary future enhancements.

Project description

Deloitte’s Trustworthy AI Framework describes criteria that AI algorithms must satisfy to gain human trust. Guaranteeing trustworthiness is especially relevant in safety-critical applications such as autonomous driving and remains an open challenge. The framework structures the essential aspects of developing trustworthy AI algorithms using supervised or unsupervised learning paradigms.

Reinforcement learning is a well-known learning paradigm that has received significant attention in robotics and autonomous driving. It enables complex driving behavior to be learned offline in simulation, which reduces the computational demands of the motion planning component. However, the trustworthiness criteria of other learning paradigms cannot be straightforwardly transferred to applications of reinforcement learning. In the context of the Robuste KI project, Deloitte collaborated with fortiss to investigate the usability and completeness of its Trustworthy AI Framework for reinforcement learning.

Research contribution

Overview of the development process of a behavior planner using reinforcement learning.

When deploying to a physical system, the ideal input data of the simulated environment is no longer available. The reinforcement learning based motion planning module therefore needs to be robust to degraded input data and system performance, such as observation uncertainty and random delays, while still maintaining useful performance.
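The two impediments named above can be emulated in simulation with a small helper. The following is a minimal sketch in plain Python, not the project's implementation: the class name `DegradedObservationChannel`, its parameters, the Gaussian noise model, and the uniformly sampled delay are all assumptions made for illustration.

```python
import random
from collections import deque


class DegradedObservationChannel:
    """Hypothetical helper: delivers noisy, possibly stale observations."""

    def __init__(self, noise_std, max_delay_steps, seed=None):
        self.noise_std = noise_std              # assumed Gaussian noise level
        self.max_delay_steps = max_delay_steps  # assumed upper bound on delay
        self.rng = random.Random(seed)
        # Keep only the newest max_delay_steps + 1 clean observations.
        self.history = deque(maxlen=max_delay_steps + 1)

    def reset(self):
        """Forget all stored observations, e.g. at the start of an episode."""
        self.history.clear()

    def observe(self, clean_observation):
        """Store the newest clean observation and return a degraded one."""
        self.history.append(list(clean_observation))
        # Sample how many steps old the delivered observation is,
        # bounded by how much history exists so far.
        delay = self.rng.randint(0, min(self.max_delay_steps, len(self.history) - 1))
        stale = self.history[-1 - delay]
        # Add zero-mean Gaussian noise to every component.
        return [x + self.rng.gauss(0.0, self.noise_std) for x in stale]
```

In a training loop, each simulator observation would pass through `observe` before reaching the policy, so the agent only ever sees degraded inputs of the kind expected on the vehicle.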

The project investigated reward function design derived from the requirements of trustworthiness and performance. Reward shaping was applied to accelerate the learning process. In the context of data-centric AI, we implemented meaningful scenario generation approaches to cover the ODD specification and explored whether AI robustness against these impediments can be improved by introducing them already during the generation of training scenarios. The results have been merged into the fortiss open-source software BARK and the autonomous driving stack Apollo.
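The text does not say which form of reward shaping was used. As an illustration only, the sketch below shows potential-based shaping, a common variant that adds the term gamma * Phi(s') - Phi(s) to the base reward. The function names, the distance-based potential, and the `scale` and `gamma` values are assumptions, not taken from the project.

```python
def distance_potential(distance_to_goal_m, scale=0.5):
    """Hypothetical potential: larger (less negative) closer to the goal."""
    return -scale * distance_to_goal_m


def shaped_reward(base_reward, potential_s, potential_next, gamma=0.99):
    """Potential-based shaping: r + gamma * Phi(s') - Phi(s)."""
    return base_reward + gamma * potential_next - potential_s


def shaped_episode_return(base_rewards, distances_m, gamma=1.0):
    """Undiscounted sum of shaped rewards along one trajectory.

    distances_m holds the distance to goal in every visited state,
    so it has one more entry than base_rewards.
    """
    total = 0.0
    for step, reward in enumerate(base_rewards):
        phi_s = distance_potential(distances_m[step])
        phi_next = distance_potential(distances_m[step + 1])
        total += shaped_reward(reward, phi_s, phi_next, gamma)
    return total
```

With `gamma=1.0` the shaping terms telescope: the shaped return differs from the unshaped one only by the potential of the final state minus that of the initial state. The agent therefore gets denser feedback at every step while the ranking of trajectories with the same start and end states is unchanged.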


Supported by Deloitte GmbH

Project duration

15.01.2022 - 30.05.2022

Your contact

Xiangzhong Liu

+49 89 3603522 182
