Beyond vision: A unified transformer with bidirectional attention for predicting driver perceived risk from multi-modal data
Journal article, 2025

Modeling driver perceived risk (or subjective risk) is critical to improving driving safety: different drivers often perceive different levels of risk under identical conditions and adjust their driving behavior accordingly. Driving is a complex activity that involves multiple cognitive and perceptual processes and draws on visual information, driver feedback, vehicle dynamics, and traffic and environmental conditions. However, existing models of subjective risk perception do not yet fully integrate multi-modal data. To address this gap, we present a Transformer-based model that processes multi-modal inputs in a unified manner to improve the prediction of subjective risk perception. Unlike existing methodologies that extract features specific to each modality, it employs embedding layers to transform images and unstructured and structured fields into visual and text tokens. Subsequently, bidirectional multi-modal attention blocks with inter-modal and intra-modal attention mechanisms capture comprehensive representations of traffic scene images, unstructured traffic scene descriptions, structured traffic data, environmental statistics, and demographics. Experimental results show that the proposed unified model achieves superior predictive performance over existing benchmarks while maintaining reasonable interpretability. Furthermore, the model is generalizable and can be applied to other multi-modal prediction tasks across different transportation contexts.
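The attention pattern described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`attend`, `bidirectional_block`), the toy token values, and the 4-dimensional embeddings are all assumptions made for illustration, and the sketch omits learned projections, multiple heads, layer normalization, and the embedding layers themselves. It only shows intra-modal attention (each modality attends to itself) followed by inter-modal attention in both directions (visual tokens attend to text tokens and text tokens attend to visual tokens).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys):
    """Scaled dot-product attention without a causal mask.

    Assumption: identity projections, so the keys also serve as values.
    Every query sees every key.
    """
    dim = len(queries[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * k[j] for w, k in zip(weights, keys)) for j in range(dim)]
        )
    return outputs


def add_residual(tokens, updates):
    """Element-wise residual connection."""
    return [[a + b for a, b in zip(t, u)] for t, u in zip(tokens, updates)]


def bidirectional_block(visual, text):
    """One hypothetical block: intra-modal, then inter-modal attention."""
    # Intra-modal: each modality attends to its own tokens.
    visual = add_residual(visual, attend(visual, visual))
    text = add_residual(text, attend(text, text))
    # Inter-modal, both directions, computed from the same inputs.
    new_visual = add_residual(visual, attend(visual, text))
    new_text = add_residual(text, attend(text, visual))
    return new_visual, new_text


def mean_pool(tokens):
    """Average all tokens into one fused vector."""
    dim = len(tokens[0])
    return [sum(t[j] for t in tokens) / len(tokens) for j in range(dim)]


# Toy inputs (made-up values): three image-patch tokens and two text tokens.
VISUAL_TOKENS = [[0.2, 0.1, 0.0, 0.5], [0.9, 0.3, 0.4, 0.1], [0.0, 0.7, 0.2, 0.3]]
TEXT_TOKENS = [[0.5, 0.5, 0.1, 0.0], [0.1, 0.0, 0.8, 0.6]]

if __name__ == "__main__":
    v, t = bidirectional_block(VISUAL_TOKENS, TEXT_TOKENS)
    fused = mean_pool(v + t)  # a prediction head would consume this vector
    print(len(v), len(t), len(fused))
```

In a full model, the fused vector would feed a prediction head for the perceived-risk label, and the attention weights computed inside `attend` are the kind of quantity that could be inspected for interpretability.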

Traffic safety

Subjective risk perception

Driving behavior modeling

Multi-modal data fusion

Authors

Dongjie Liu

Henan University of Chinese Medicine

Southeast University

Dawei Li

Southeast University

Hongliang Ding

Southwest Jiaotong University

Yang Cao

Southwest Jiaotong University

Kun Gao

Chalmers, Architecture and Civil Engineering, Geology and Geotechnics

Transportation Research, Part C: Emerging Technologies

0968-090X (ISSN)

Vol. 179 105270

Facilitating sustainable development of sharing micro-mobility and transit multi-modal transport systems (eFAST)

Swedish Energy Agency (P2022-00414), 2022-11-01 -- 2024-12-31.

Chalmers Area of Advance Transport, 2022-01-01 -- 2023-12-31.

Subject Categories (SSIF 2025)

Computer graphics and computer vision

Computer Sciences

DOI

10.1016/j.trc.2025.105270

More information

Latest update

7/17/2025