VT-EncNet: A Multi-Modal convolutional autoencoder with attention for vessel trajectory representation and similarity computation

Xinyu Wang; Zhao Liu; Yang Chen; Mingyang Zhang; Wengang Mao

doi:10.1016/j.tre.2026.104857

Journal article, 2026

Vessel trajectory similarity computation is central to maritime data mining, traffic safety control, and intelligent shipping management. However, real-world Automatic Identification System (AIS) data are affected by noise, irregular sampling, and highly diverse navigational behaviors, which significantly degrade the performance of traditional point-matching and geometry-based similarity measures. Furthermore, existing deep learning approaches often rely on single-modal representations, treating trajectories either as spatial images or as temporal sequences, and therefore fail to explicitly capture the complementary coupling between global spatial topology and local kinematic dynamics. To address these limitations, this paper proposes the Vessel Trajectory Encoding Network (VT-EncNet), a multi-modal convolutional representation learning framework for robust trajectory similarity computation. The method uses a grid-based discretization strategy to construct multi-channel inputs that integrate position, speed, and course. A dual-branch architecture built on 2D and 1D Convolutional Neural Networks (CNNs) independently extracts spatial structural features and dynamic motion attributes. A cross-attention mechanism then explicitly fuses the spatial and kinematic modalities, and an autoencoder compresses them into discriminative low-dimensional latent vectors. Extensive experiments on real-world passenger vessel trajectories from the Qiongzhou Strait demonstrate the superiority of VT-EncNet. Quantitatively, it achieves a state-of-the-art Adjusted Rand Index (ARI) of 0.7955 in unsupervised clustering, successfully disentangling complex, spatially overlapping traffic flows into 12 fine-grained behavioral patterns. The framework also scales well, delivering significant computational acceleration over traditional exhaustive distance metrics. These findings highlight the model's robust pattern recognition capability and its practical value for large-scale intelligent maritime traffic management.
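
The grid-based discretization step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the grid resolution, the exact channel definitions, or any normalization, so everything below is an assumption made for illustration: the function name `rasterize_trajectory`, the 32 x 32 default grid, the bounding-box argument, and the choice of occupancy, mean speed, and circular-mean course as the three channels are hypothetical and are not taken from the paper.

```python
import math


def rasterize_trajectory(points, bounds, grid_size=32):
    """Rasterise AIS points into three grid channels (illustrative only).

    points: iterable of (lon, lat, speed_knots, course_deg) tuples
    bounds: (lon_min, lon_max, lat_min, lat_max), an assumed study-area box
    Returns [occupancy, speed, course], each a grid_size x grid_size
    nested list of floats.
    """
    lon_min, lon_max, lat_min, lat_max = bounds
    n = grid_size
    count = [[0] * n for _ in range(n)]
    speed_sum = [[0.0] * n for _ in range(n)]
    sin_sum = [[0.0] * n for _ in range(n)]
    cos_sum = [[0.0] * n for _ in range(n)]

    for lon, lat, speed, course in points:
        if not (lon_min <= lon <= lon_max and lat_min <= lat <= lat_max):
            continue  # ignore points outside the assumed study area
        # Map coordinates to cell indices; clamp the upper edge into the last cell.
        col = min(int((lon - lon_min) / (lon_max - lon_min) * n), n - 1)
        row = min(int((lat - lat_min) / (lat_max - lat_min) * n), n - 1)
        count[row][col] += 1
        speed_sum[row][col] += speed
        # Accumulate unit vectors so that courses average circularly
        # (350 deg and 10 deg average to about 0 deg, not 180 deg).
        rad = math.radians(course)
        sin_sum[row][col] += math.sin(rad)
        cos_sum[row][col] += math.cos(rad)

    occupancy = [[1.0 if c else 0.0 for c in row] for row in count]
    speed = [
        [speed_sum[i][j] / count[i][j] if count[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    course = [
        [
            math.degrees(math.atan2(sin_sum[i][j], cos_sum[i][j])) % 360.0
            if count[i][j]
            else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]
    return [occupancy, speed, course]
```

In a pipeline of the kind the abstract describes, a grid like this would plausibly feed the 2D convolutional branch, while the raw speed and course sequences would feed the 1D branch. That split is an inference from the abstract, not a detail it confirms.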

Vessel trajectory similarity

Maritime logistics and transportation

AIS data

Multi-modal representation learning

Convolutional autoencoder

Author

Xinyu Wang

Wuhan University of Technology

Zhao Liu

Wuhan University of Technology

Yang Chen

Wuhan University of Technology

Mingyang Zhang

Shanghai Jiao Tong University

Wengang Mao

Chalmers, Mechanics and Maritime Sciences (M2), Marine Technology


Transportation Research Part E: Logistics and Transportation Review

1366-5545 (ISSN)

Vol. 211 104857

Subject Categories (SSIF 2025)

Robotics and automation

Transport Systems and Logistics

Computer Sciences

DOI

10.1016/j.tre.2026.104857


More information

Latest update

4/28/2026
