A synthetic fraud data generation methodology
Paper in proceedings, 2002

In many cases, synthetic data is more suitable than authentic data for testing and training fraud detection systems. At the same time, synthetic data has drawbacks that stem from its being synthetic: it may lack the realism of authentic data. To counter this disadvantage, we have developed a method for generating synthetic data that is derived from authentic data. We identify the important characteristics of the authentic data and of the frauds we want to detect, and generate synthetic data with these properties.

data generation methodology

fraud detection

system simulation

user simulation

synthetic test data


Emilie Lundin

Chalmers, Department of Computer Engineering

Håkan Kvarnström

Chalmers, Department of Computer Engineering

Erland Jonsson

Chalmers, Department of Computer Engineering

Lecture Notes in Computer Science

0302-9743 (ISSN)

Vol. 2513, pp. 265-277