BlueprintSymVL: A discriminative benchmark for VLM symbol recognition in engineering blueprints
Journal article, 2025

The application of Vision Language Models (VLMs) to industrial automation, specifically engineering blueprint analysis, is severely hampered by the absence of domain-specific evaluation tools. Existing benchmarks fail to replicate the critical visual challenges of this domain, such as high symbol density, occlusion, and visual similarity. Furthermore, they assume reliable pre-trained knowledge or standardized symbology, which rarely hold in real-world industrial settings. To address these critical gaps, we introduce BlueprintSymVL, the first benchmark explicitly designed to evaluate VLM symbol recognition in engineering blueprints. BlueprintSymVL is engineered as a strong discriminator, with test cases that systematically introduce challenges to differentiate model capabilities. A key innovation is our robust evaluation method, centered on a one-shot visual in-context querying strategy. At query time, the model is provided with a visual exemplar of a symbol. This approach eliminates reliance on unreliable pre-existing knowledge and is paired with a strict evaluation criterion demanding correctness on both symbol counts and their labels, setting a rigorous standard for quality assurance in high-stakes applications. We conducted a comprehensive benchmark of four leading VLMs (GPT-4o, Gemini 2.5 Pro, InternVL 2.5 78B, and Qwen 2.5 VL 72B). Our analysis provides the first baseline on their readiness, revealing that BlueprintSymVL is highly discriminative. We pinpoint specific failure modes, including a notable degradation in cluttered environments, confusion when faced with visually similar distractors, and a concerning propensity to hallucinate symbols. These insights demonstrate that current VLMs are not yet suitable for autonomous deployment in blueprint analysis and are best integrated into human-in-the-loop workflows.
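The strict criterion mentioned in the abstract (a response is correct only if both the symbol labels and their counts match) can be illustrated with a minimal sketch. The function name, the dict-based representation, and the symbol labels below are hypothetical and not taken from the paper; they only show one way such an all-or-nothing check could be expressed.

```python
def strict_match(predicted: dict[str, int], expected: dict[str, int]) -> bool:
    """Return True only if every label and its count agree exactly.

    Hypothetical helper: the paper's actual scoring code is not shown here.
    """
    # Drop zero counts so that {"pump": 0} and {} are treated the same.
    pred = {label: n for label, n in predicted.items() if n > 0}
    gold = {label: n for label, n in expected.items() if n > 0}
    return pred == gold


# Illustrative ground truth for one blueprint (made-up labels and counts).
expected = {"gate valve": 3, "pump": 1}

exact = strict_match({"gate valve": 3, "pump": 1}, expected)
miscount = strict_match({"gate valve": 2, "pump": 1}, expected)
hallucinated = strict_match({"gate valve": 3, "pump": 1, "check valve": 1}, expected)
```

Under this sketch, a miscount and a hallucinated extra symbol both fail the case outright, which is what makes an all-or-nothing criterion stricter than per-symbol partial credit.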

Vision Language Models (VLMs)

Benchmark

Visual in-context learning

Engineering blueprints

Symbol recognition

Authors

Vasil Shteriyanov

McDermott

Technische Universiteit Eindhoven

Rimman Dzhusupova

Technische Universiteit Eindhoven

McDermott

Jan Bosch

Chalmers, Computer Science and Engineering, Interaction Design and Software Engineering

Technische Universiteit Eindhoven

Göteborgs universitet

Helena Holmström Olsson

Malmö universitet

Results in Engineering

2590-1230 (eISSN)

Vol. 28, article 108171

Subject categories (SSIF 2025)

Software Engineering

Computer Science

DOI

10.1016/j.rineng.2025.108171

Related datasets

Benchmark Dataset for VLM Symbol Recognition in Engineering Blueprints [dataset]

DOI: https://doi.org/10.5281/zenodo.17250377

More information

Last updated

2025-11-24