Rewarding Change beyond State: Directional VLM Rewards for Sample-Efficient Robot Reinforcement Learning
Paper in proceedings, 2026

Sparse rewards are a persistent bottleneck for robotic manipulation with Reinforcement Learning (RL), primarily because RL agents must discover long-horizon, multi-step behaviors while receiving infrequent and weakly informative feedback. Recent work uses pre-trained Vision-Language Models (VLMs) to provide dense per-step rewards, yet most approaches score only a single image against a goal text, ignoring whether the recent change actually moves the system toward success. We argue that this omission impairs exploration (e.g., goal-like detours, wrong-way progress, action aliasing) and propose to make time explicit in VLM rewards by adding a directional signal that evaluates short-horizon change. Concretely, we pair visual change over a few steps with a text description of the desired change, and fine-tune lightweight heads with RL; the resulting directional signal is combined with a standard positional signal into a single shaping reward. We evaluate our approach on six MetaWorld manipulation tasks with fixed goals. This directional shaping improves running average success at a fixed budget to 78.2%, versus 63.8% for the best-tuned positional baseline, with improvements observed in five of six tasks. Ablations identify the design choices that make the proposed directional term effective and show its synergy with the positional term when supplying dense VLM rewards, demonstrating improved exploration and sample efficiency.
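The abstract does not give the reward formula, so the following is only a minimal sketch of how a positional and a directional term might be combined, under stated assumptions: embeddings are plain vectors already produced by some VLM encoder, both terms use cosine similarity, the directional term compares the embedding difference over the last k steps with a text embedding of the desired change, and the two terms are summed with fixed weights. All function names, the window `k`, and the weights `w_pos` and `w_dir` are hypothetical, and the fine-tuned lightweight heads mentioned in the abstract are omitted.

```python
import numpy as np


def cosine(a, b, eps=1e-8):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def positional_reward(z_t, goal_text):
    """Assumed positional term: how goal-like the current frame embedding is."""
    return cosine(z_t, goal_text)


def directional_reward(z_prev, z_t, change_text):
    """Assumed directional term: does the recent change point the desired way?"""
    delta = z_t - z_prev
    if np.linalg.norm(delta) < 1e-6:
        return 0.0  # no visible change, so no directional evidence
    return cosine(delta, change_text)


def shaping_reward(z_hist, goal_text, change_text, k=3, w_pos=1.0, w_dir=1.0):
    """Weighted sum of the two terms (the weights are illustrative, not from the paper)."""
    z_t = z_hist[-1]
    z_prev = z_hist[max(0, len(z_hist) - 1 - k)]  # embedding from up to k steps back
    return (w_pos * positional_reward(z_t, goal_text)
            + w_dir * directional_reward(z_prev, z_t, change_text))


# Toy 2-D "embeddings": both trajectories end in the same state,
# but one approached the goal direction and the other retreated from it.
goal = np.array([1.0, 0.0])
change = np.array([1.0, 0.0])
toward = [np.array([0.0, 1.0]), np.array([0.5, 1.0]), np.array([1.0, 1.0])]
away = [np.array([2.0, 1.0]), np.array([1.5, 1.0]), np.array([1.0, 1.0])]

print(shaping_reward(toward, goal, change) > shaping_reward(away, goal, change))
```

In this toy case a purely positional reward cannot tell the two trajectories apart because their final embeddings are identical, whereas the directional term separates them, which is the kind of wrong-way progress the abstract describes.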

Authors

Linus Lundgren

Student at Chalmers

Wenhao Lu

Chalmers, Electrical Engineering, Systems and Control

Zhitao Liang

Chalmers, Electrical Engineering, Systems and Control

Ze Zhang

University of Gothenburg

Chalmers, Computer Science and Engineering, Computer and Network Systems

Karinne Ramirez-Amaro

Chalmers, Electrical Engineering, Systems and Control

Emmanuel Dean

Chalmers, Electrical Engineering, Systems and Control

2026 IEEE/SICE International Symposium on System Integration, SII 2026

722-728
9781665457842 (ISBN)

2026 IEEE/SICE International Symposium on System Integration, SII 2026
Cancun, Mexico

Subject categories (SSIF 2025)

Robotics and automation

Computer science

DOI

10.1109/SII64115.2026.11404492

More information

Last updated

2026-04-23