A Multi-task Disentanglement Framework Guided by Pedestrian Attributes for Video-Based Clothes-Changing Person Re-Identification in Internet of Things
Journal article, 2026
Person re-identification (ReID), a crucial technology for intelligent surveillance in Internet of Things (IoT) systems, aims to retrieve a target person across non-overlapping surveillance cameras. Video-based clothes-changing person re-identification (VCC-ReID) has become essential because videos carry rich information and the task has broad applications. Because clothes are attached to the human body, clothing and pedestrian features are highly coupled during feature extraction, which makes VCC-ReID challenging. To address this challenge, we propose a Multi-Task Disentanglement Framework guided by Pedestrian Attributes (MTDF-PAttr), whose core is a cross-domain attribute distillation decoupling mechanism. MTDF-PAttr uses pedestrian attribute recognition (PAR) as an auxiliary task to guide feature decoupling, thereby improving performance on the main task, VCC-ReID. Since the existing VCC-ReID dataset lacks PAR annotations, we train the auxiliary task by knowledge distillation, with a pre-trained video-based PAR network as the teacher. To give the PAR teacher network higher accuracy, stronger generalization, and the ability to recognize more attributes, we propose a Multi-Dataset Fusion Framework for Pedestrian Attribute Recognition (MDFF-PAttr), whose core is a multi-teacher collaborative self-distillation mechanism. MDFF-PAttr can train on multiple datasets simultaneously and provides a powerful teacher model from which MTDF-PAttr distills its auxiliary task. Experimental results demonstrate that MTDF-PAttr achieves state-of-the-art performance on the VCC-ReID task, providing an effective method for intelligent surveillance systems in the IoT. Additionally, MDFF-PAttr effectively improves the accuracy and generalization of the PAR network.
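The abstract does not give the loss used to distill the auxiliary PAR task, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes that attributes are multi-label with sigmoid outputs, that the teacher's soft predictions serve as targets for a binary cross-entropy term, and that this term is added to the ReID loss with a fixed weight. The names `attribute_distillation_loss`, `multitask_loss`, `temperature`, and `aux_weight` are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def attribute_distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """Mean binary cross-entropy between student predictions and teacher soft labels.

    Assumption: one logit per pedestrian attribute, attributes treated independently.
    """
    if not student_logits or len(student_logits) != len(teacher_logits):
        raise ValueError("logit lists must be non-empty and of equal length")
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for s, t in zip(student_logits, teacher_logits):
        p_teacher = sigmoid(t / temperature)  # soft target from the frozen teacher
        p_student = sigmoid(s / temperature)  # auxiliary-head prediction
        total -= (p_teacher * math.log(p_student + eps)
                  + (1.0 - p_teacher) * math.log(1.0 - p_student + eps))
    return total / len(student_logits)


def multitask_loss(reid_loss, student_logits, teacher_logits, aux_weight=0.5):
    """Hypothetical combined objective: main ReID loss plus weighted auxiliary loss."""
    return reid_loss + aux_weight * attribute_distillation_loss(
        student_logits, teacher_logits
    )
```

For a fixed teacher output, this distillation term is smallest when the student reproduces the teacher's attribute probabilities, which is what lets a dataset without PAR annotations still supervise the auxiliary head.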
knowledge distillation
Internet of Things
pedestrian attribute recognition
feature decoupling
person re-identification
multi-task learning