Using Active Learning in Intrusion Detection
Paper in proceedings, 2004
Intrusion Detection Systems (IDSs) have become an important part of operational computer security. They are the last line of defense against malicious hackers and help detect ongoing attacks as well as mitigate their damage. However, intrusion detection systems are not turnkey solutions; they depend heavily on expensive and scarce security experts for successful operation. Emphasizing self-learning algorithms reduces the dependence on the domain expert, but such algorithms instead require massive amounts of labeled training data, another scarce resource in intrusion detection. In this paper we investigate whether an active learning algorithm can perform on a par with a traditional self-learning algorithm in terms of detection accuracy while using significantly less labeled data. Our preliminary findings indicate that the active learning algorithm generally performs better than the traditional learning algorithm given the same amount of training data. Moreover, the amount of labeled data needed can be reduced by as much as a factor of 80, as shown by comparing an active learner with a traditional learner of similar detection accuracy. Thus, active learning algorithms seem promising in that they can reduce the dependence on security experts in the development of new detection rules by better leveraging the knowledge and time of the expert.
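The abstract does not say which active learning strategy or base learner the paper uses, so the following is only a minimal sketch of the general idea, under stated assumptions. It uses pool-based uncertainty sampling, where the learner asks the expert to label the event it is least sure about, and compares it with labeling events picked at random. Everything here is hypothetical and chosen for illustration: the one-dimensional "anomaly score", the `TRUE_BOUNDARY` value, the `oracle` function standing in for the security expert, and the threshold classifier. None of it is taken from the paper.

```python
import random

TRUE_BOUNDARY = 0.7  # hypothetical: scores at or above this are attacks


def oracle(score):
    """Stand-in for the security expert: returns 1 (attack) or 0 (benign)."""
    return 1 if score >= TRUE_BOUNDARY else 0


def fit_threshold(labeled):
    """Midpoint between the highest benign and the lowest attack score seen."""
    highest_benign = max(x for x, y in labeled if y == 0)
    lowest_attack = min(x for x, y in labeled if y == 1)
    return (highest_benign + lowest_attack) / 2.0


def accuracy(threshold, pool):
    """Fraction of pool events the threshold classifies like the oracle."""
    hits = sum(1 for x in pool if (1 if x > threshold else 0) == oracle(x))
    return hits / len(pool)


def learn(pool, budget, choose):
    """Spend `budget` expert queries, letting `choose` pick each event."""
    # Seed with the two extreme events so both classes are represented
    # (assumes the pool spans both sides of the boundary).
    lo, hi = min(pool), max(pool)
    labeled = [(lo, oracle(lo)), (hi, oracle(hi))]
    unlabeled = [x for x in pool if x not in (lo, hi)]
    for _ in range(budget):
        threshold = fit_threshold(labeled)
        x = choose(unlabeled, threshold)
        unlabeled.remove(x)
        labeled.append((x, oracle(x)))  # one query to the expert
    return fit_threshold(labeled)


def most_uncertain(unlabeled, threshold):
    """Uncertainty sampling: the event closest to the current boundary."""
    return min(unlabeled, key=lambda x: abs(x - threshold))


def compare(budget=15, seed=0):
    """Return (active_accuracy, random_accuracy) for the same label budget."""
    rng = random.Random(seed)
    pool = [rng.random() for _ in range(1000)]  # synthetic anomaly scores

    def at_random(unlabeled, threshold):
        return rng.choice(unlabeled)

    active = accuracy(learn(pool, budget, most_uncertain), pool)
    passive = accuracy(learn(pool, budget, at_random), pool)
    return active, passive
```

Calling `compare()` returns the pool accuracy of both learners after the same number of expert queries. In this toy setting, uncertainty sampling behaves much like a bisection search for the boundary, which is one intuition for why an active learner can need far fewer labels; real intrusion data is high-dimensional and noisy, so the size of any saving there has to be measured rather than assumed.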