Efficient Tensor Compression and Reconstruction in Split DNNs for Edge-Based Object Detection
Journal article, 2026
Computer Vision (CV) tasks are among the most pivotal, yet challenging, operations for Uncrewed Aerial Vehicles (UAVs), especially in mission-critical applications. They require processing complex image data through Deep Neural Networks (DNNs), which demand computational resources far beyond UAVs’ capacity. To address this limitation, Split DNNs offer a promising solution by partitioning the model into: (i) a lightweight Head, deployed on the UAV to produce rapid, albeit less precise, initial image representations, and (ii) a more complex Tail, executed at the network edge to produce refined, higher-accuracy results. However, this solution requires transmitting large tensors from the UAV to the edge server, which consumes significant bandwidth. We tackle this challenge by introducing a goal-oriented framework named Compressed Tensor-based DNN Split (CoTeD). Our framework integrates an application- and system-aware optimization model that orchestrates computing and transmission resources in real time. At the UAV, CoTeD dynamically selects relevant tensor information and optimally trades off DNN detection quality against bandwidth consumption, guided by application requirements and system operational conditions. At the edge server, CoTeD reconstructs the tensor, enabling efficient inference by the Tail model. This approach effectively balances bandwidth usage against the quality of the CV task output. Experimental results, obtained through our hardware-software testbed using datasets of different sizes and characteristics, show that CoTeD can reduce data transmission over the radio link by up to 90% without noticeable loss in object detection quality, and can reduce inference latency by up to 70% compared to local DNN deployment onboard the UAV. CoTeD also yields an inference request success rate of at least 90%, an increase of 20%-80% compared to direct DNN splitting, static JPEG compression, and DNN model quantization.
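The select-transmit-reconstruct flow described above can be illustrated with a minimal sketch. It is not the CoTeD algorithm: the abstract does not specify how tensor information is selected or how the trade-off is optimized, so the sketch substitutes a simple stand-in (keeping the channels with the largest mean absolute activation, and sizing the kept fraction from a bandwidth and deadline budget). All function names, parameters, and the zero-filling reconstruction are assumptions made for illustration.

```python
import numpy as np


def pick_keep_ratio(bandwidth_bps, deadline_s, tensor_bytes):
    """Hypothetical policy: fraction of the tensor that fits the link budget."""
    budget_bytes = bandwidth_bps / 8 * deadline_s
    return float(min(1.0, max(0.0, budget_bytes / tensor_bytes)))


def compress(tensor, keep_ratio):
    """UAV side (assumed): keep the channels with the highest mean |activation|.

    tensor has shape (channels, height, width). At least one channel is kept.
    """
    channels = tensor.shape[0]
    k = max(1, int(round(channels * keep_ratio)))
    energy = np.abs(tensor).mean(axis=(1, 2))
    kept = np.sort(np.argsort(energy)[-k:])  # indices of the k strongest channels
    return kept, tensor[kept], tensor.shape


def reconstruct(kept, values, shape):
    """Edge side (assumed): zero-fill the channels that were not transmitted."""
    out = np.zeros(shape, dtype=values.dtype)
    out[kept] = values
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    head_output = rng.standard_normal((8, 4, 4)).astype(np.float32)
    ratio = pick_keep_ratio(bandwidth_bps=8e6, deadline_s=0.1, tensor_bytes=400_000)
    kept, values, shape = compress(head_output, ratio)
    restored = reconstruct(kept, values, shape)
    print(f"kept {len(kept)} of {shape[0]} channels, restored shape {restored.shape}")
```

In this sketch only the kept channel indices and their values would cross the radio link, so the payload shrinks roughly in proportion to the kept fraction. A real system would also have to account for how the Tail's detection quality degrades as channels are dropped, which is the part the optimization model in the abstract addresses.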
Bandwidth utilization
Edge computing
UAVs
Orchestration