Enhancing OCR-based Engineering Diagram Analysis by Integrating Diverse External Legends with VLMs
Journal article, 2025
diagrams
information extraction
vision language models
optical character recognition
multimodal prompt engineering
legends
Authors
Vasil Shteriyanov
Eindhoven University of Technology
McDermott
Rimman Dzhusupova
Eindhoven University of Technology
McDermott
Jan Bosch
University of Gothenburg
Eindhoven University of Technology
Chalmers, Computer Science and Engineering (Chalmers), Interaction Design and Software Engineering
Helena Holmström Olsson
Malmö University
Journal of Software: Evolution and Process
2047-7481 (eISSN)
Vol. 37 12 e70072
Subject Categories (SSIF 2025)
Computer graphics and computer vision
DOI
10.1002/smr.70072