LLMSecCode: Evaluating Large Language Models for Secure Coding
Paper in proceedings, 2025

The rapid deployment of Large Language Models (LLMs) requires careful consideration of their effect on cybersecurity. Our work aims to improve the selection process of LLMs suitable for facilitating secure coding (SC). This raises challenging research questions, such as: (RQ1) Which functionality can streamline LLM evaluation? (RQ2) What should the evaluation measure? (RQ3) How can we attest that the evaluation process is impartial? To address these questions, we introduce LLMSecCode, an open-source evaluation framework designed to assess LLM SC capabilities objectively. We validate the LLMSecCode implementation through experiments, finding a 10% and 9% difference in performance when varying parameters and prompts, respectively. We also compare some of our results with those of reliable external actors, observing a 5% difference. We strive to ensure the ease of use of our open-source framework and encourage further development by external actors. With LLMSecCode, we hope to encourage the standardization and benchmarking of LLMs’ capabilities in security-oriented code and tasks.

Large Language Models

Evaluation

Secure Coding

Authors

Anton Rydén

Student at Chalmers

Erik Näslund

Student at Chalmers

Elad Schiller

Networks and Systems

Magnus Almgren

Networks and Systems

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

0302-9743 (ISSN), 1611-3349 (eISSN)

Vol. 15349 LNCS, pp. 100-118
978-3-031-76933-7 (ISBN)

8th International Symposium on Cyber Security, Cryptology, and Machine Learning, CSCML 2024
Be'er Sheva, Israel

Subject categories (SSIF 2011)

Software Engineering

DOI

10.1007/978-3-031-76934-4_7

More information

Last updated

2025-01-13