A review of machine learning for analysing accident reports in the construction industry
Review article, 2025

Research interest in applying machine learning (ML) to safety analysis in the construction industry has grown recently. This interest is part of a search for better prevention of occupational accidents, with a focus on text analysis and natural language processing (NLP). However, ML-based approaches have been adopted less than their perceived benefits would suggest, owing to implementation barriers and the challenges of analysing safety records in the construction sector. The current literature has also been criticized for a lack of clarity in describing methodologies, interpretation, and the context of application. This work therefore reviews the latest developments in research applying ML to accident report analysis in construction. The published literature on ML-based analysis of construction accident reports was reviewed and organized by data pre-processing, algorithms, testing, and implementation, and further by data structure. The review found limitations related to data availability; in addition, the manual structuring of data and the limited use of unsupervised learning reflect the complexity of handling textual accident data. Moreover, accident types occur with unequal frequencies and need careful handling beyond the basic assumptions of data pre-processing, in addition to the general need for comparative studies of data pre-processing and for automated pipelines. The review also showed that data mining (DM) and unsupervised learning were used less, especially with semi-structured and unstructured datasets, which may reflect inefficient NLP application with these types of learning.
Among the reviewed articles, only four out of six prototypes were externally validated in a construction environment. We therefore propose that future efforts would benefit from a standardized development method that also makes explicit how ML safety recommendations inform decision making. Future research should experiment with and assess different choices in the pre-processing stage, validate the performance of ML models, and examine their implementation in construction practice. Finally, more advanced NLP methods, such as relation extraction, could be applied if domain-specific repositories were available, and further advances, including large language models (LLMs), could be explored.

Construction industry

Accident reports

Natural language processing

Safety

Machine learning

Authors

May Shayboun

Högskolan i Halmstad

Dimosthenis Kifokeris

Chalmers, Arkitektur och samhällsbyggnadsteknik, Byggnadsdesign

Christian Koch

Högskolan i Halmstad

Syddansk Universitet

Journal of Information Technology in Construction

1874-4753 (ISSN)

Vol. 30, pp. 439-460

Olycksförebyggande genom maskininlärning hos en byggentreprenör (Accident prevention through machine learning at a construction contractor)

Svenska Byggbranschens Utvecklingsfond (SBUF) (14159), 2022-10-01 -- 2025-04-01.

Subject categories (SSIF 2025)

Construction Management

DOI

10.36680/j.itcon.2025.019

More information

Last updated

2025-04-14