Abstract
Machine-learning-based intrusion detection systems monitor network data streams for cyber attacks. Challenges in this space include detecting unknown attacks, adapting to changes in the data stream such as changes in underlying behavior, the human cost of labeling data to retrain the machine learning model, and the processing and memory constraints of a real-time data stream. Failure to manage these factors could result in missed attacks, degraded detection performance, unnecessary expense or delayed detection times. This research proposes a new semi-supervised network data stream anomaly detection method, Split Active Learning Anomaly Detector (SALAD), which combines our novel Adaptive Anomaly Threshold and Stochastic Anomaly Threshold with Fading Factor methods. A novel Reconstruction Error based Distance from Threshold strategy is proposed and evaluated as part of an active stream framework to demonstrate a reduction in labeling costs. The proposed methods are evaluated with the KDD Cup 1999 and UNSW-NB15 data sets, using the scikit-multiflow framework. Results demonstrated that the proposed SALAD method offered performance equivalent to fully labeled and alternative Naïve Bayes (NB) and Hoeffding Adaptive Tree (HAT) methods with a labeling budget of just 20%, significantly reducing the human expertise required to annotate the network data. Processing times of the SALAD method were demonstrated to be significantly lower than those of the NB and HAT methods, allowing for greatly improved responsiveness to attacks occurring in real time.
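The abstract names three ideas without giving formulas: an anomaly threshold that adapts over time, a fading factor, and a labeling strategy driven by how far a reconstruction error lies from the threshold. The sketch below is only an illustration of how such pieces could fit together. It is not the SALAD algorithm: the threshold rule (faded mean plus `k` standard deviations), the `margin` test, the `warmup` period, and every class, function and parameter name are assumptions made for this example.

```python
import math


class FadingThreshold:
    """Faded mean/std threshold over reconstruction errors (illustrative only)."""

    def __init__(self, fading=0.99, k=3.0, warmup=10):
        self.fading = fading  # assumed fading factor in (0, 1]
        self.k = k            # assumed number of standard deviations
        self.warmup = warmup  # assumed number of updates before thresholding
        self.seen = 0         # plain count of updates
        self.n = 0.0          # faded count
        self.s = 0.0          # faded sum of errors
        self.ss = 0.0         # faded sum of squared errors

    def update(self, error):
        # Older errors are down-weighted by the fading factor at every step.
        self.seen += 1
        self.n = self.fading * self.n + 1.0
        self.s = self.fading * self.s + error
        self.ss = self.fading * self.ss + error * error

    def value(self):
        # No finite threshold until enough errors have been observed.
        if self.seen < self.warmup:
            return math.inf
        mean = self.s / self.n
        var = max(self.ss / self.n - mean * mean, 0.0)
        return mean + self.k * math.sqrt(var)


def run_stream(errors, budget=0.2, margin=0.5):
    """Return per-instance anomaly flags and the indices chosen for labeling."""
    thr = FadingThreshold()
    flags, queried = [], []
    for i, err in enumerate(errors):
        t = thr.value()
        is_anomaly = err > t
        flags.append(is_anomaly)
        # Ask for a label only when the error is close to the threshold
        # (an uncertain case) and the labeling budget has not been spent.
        near = math.isfinite(t) and abs(err - t) <= margin * t
        if near and len(queried) < budget * (i + 1):
            queried.append(i)
        # Adapt the threshold using errors presumed to be normal.
        if not is_anomaly:
            thr.update(err)
    return flags, queried
```

In this sketch `errors` stands in for the per-instance reconstruction errors of some model (for example an autoencoder), and `budget=0.2` mirrors the 20% labeling budget reported in the abstract. How the published method actually defines its thresholds and query strategy is described in the article itself.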
More Information
Divisions: | School of Built Environment, Engineering and Computing |
---|---|
Identification Number: | https://doi.org/10.1016/j.eswa.2024.123439 |
Status: | Published |
Refereed: | Yes |
Publisher: | Elsevier |
Additional Information: | © 2024 Elsevier Ltd |
Uncontrolled Keywords: | 01 Mathematical Sciences; 08 Information and Computing Sciences; 09 Engineering; Artificial Intelligence & Image Processing |
SWORD Depositor: | Symplectic |
Depositing User (symplectic): | Deposited by Mann, Elizabeth |
Date Deposited: | 22 Mar 2024 15:51 |
Last Modified: | 23 Jul 2024 14:41 |
Item Type: | Article |
Download
Due to copyright restrictions, this file is not available for public download. For more information please email openaccess@leedsbeckett.ac.uk.
Explore Further
Read more research from the author(s):
- C Nixon ORCID: 0000-0002-3896-3027
- M Sedky ORCID: 0000-0002-9169-2449
- J Champion
- M Hassan