When IC meets text: Towards a rich annotated integrated circuit text dataset
Journal article, 2024

Automated Optical Inspection (AOI) is a process that uses cameras to autonomously scan printed circuit boards for quality control. Text is often printed on chip components, and it is crucial that this text is correctly recognized during AOI, as it contains valuable information. In this paper, we introduce ICText, the largest dataset for text detection and recognition on integrated circuits. Uniquely, it includes labels for character quality attributes such as low contrast, blurry, and broken. While loss-reweighting and Curriculum Learning (CL) have been proposed to improve object detector performance by balancing positive and negative samples and gradually training the model from easy to hard samples, these methods have had limited success with one-stage object detectors commonly used in industry. To address this, we propose Attribute-Guided Curriculum Learning (AGCL), which leverages the labeled character quality attributes in ICText. Our extensive experiments demonstrate that AGCL can be applied to different detectors in a plug-and-play fashion to achieve higher Average Precision (AP), significantly outperforming existing methods on ICText without any additional computational overhead during inference. Furthermore, we show that AGCL is also effective on the generic object detection dataset Pascal VOC. Our code and dataset will be publicly available at https://github.com/chunchet-ng/ICText-AGCL.

Text detection

Text recognition

Integrated circuit text dataset

Attribute-guided curriculum learning

Optical character recognition

Authors

Chun Chet Ng

Universiti Malaya

Che-Tsung Lin

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering

Zhi Qin Tan

Universiti Malaya

Xinyu Wang

The University of Adelaide

Jie Long Kew

Universiti Malaya

Chee Seng Chan

Universiti Malaya

Christopher Zach

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering

Pattern Recognition

0031-3203 (ISSN)

Vol. 147 110124

Areas of Advance

Information and Communication Technology

Subject Categories

Language Technology (Computational Linguistics)

Electrical Engineering, Electronic Engineering, Information Engineering

DOI

10.1016/j.patcog.2023.110124

More information

Latest update

12/8/2023