When IC meets text: Towards a rich annotated integrated circuit text dataset
Journal article, 2024

Automated Optical Inspection (AOI) is a process that uses cameras to autonomously scan printed circuit boards for quality control. Text is often printed on chip components, and it is crucial that this text is correctly recognized during AOI, as it contains valuable information. In this paper, we introduce ICText, the largest dataset for text detection and recognition on integrated circuits. Uniquely, it includes labels for character quality attributes such as low contrast, blurry, and broken. While loss-reweighting and Curriculum Learning (CL) have been proposed to improve object detector performance by balancing positive and negative samples and gradually training the model from easy to hard samples, these methods have had limited success with one-stage object detectors commonly used in industry. To address this, we propose Attribute-Guided Curriculum Learning (AGCL), which leverages the labeled character quality attributes in ICText. Our extensive experiments demonstrate that AGCL can be applied to different detectors in a plug-and-play fashion to achieve higher Average Precision (AP), significantly outperforming existing methods on ICText without any additional computational overhead during inference. Furthermore, we show that AGCL is also effective on the generic object detection dataset Pascal VOC. Our code and dataset will be publicly available at https://github.com/chunchet-ng/ICText-AGCL.

Text detection

Text recognition

Integrated circuit text dataset

Attribute-guided curriculum learning

Optical character recognition

Authors

Chun Chet Ng

Universiti Malaya

Che-Tsung Lin

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering

Zhi Qin Tan

Universiti Malaya

Xinyu Wang

The University of Adelaide

Jie Long Kew

Universiti Malaya

Chee Seng Chan

Universiti Malaya

Christopher Zach

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering

Pattern Recognition

0031-3203 (ISSN)

Vol. 147 110124

Areas of Advance

Information and Communication Technology

Subject Categories

Language Technology (Computational Linguistics)

Electrical Engineering and Electronics

DOI

10.1016/j.patcog.2023.110124

More information

Last updated

2023-12-08