Performance Evaluation of Intrusion Detection System using Selected Features and Machine Learning Classifiers

Raja Azlina Raja Mahmood
AmirHossien Abdi
Masnida Hussin

Abstract

Some of the main challenges in developing an effective network-based intrusion detection system (IDS) include analyzing large volumes of network traffic and identifying the decision boundaries between normal and abnormal behaviors. Deploying feature selection together with efficient classifiers in the detection system can overcome these problems. Feature selection finds the most relevant features, thereby reducing the dimensionality and complexity of network traffic analysis. Moreover, building the predictive model from only the most relevant features reduces the complexity of the model, which shortens the classifier model building time and consequently improves detection performance. In this study, two different sets of selected features were used to train four machine-learning-based classifiers. The two sets of selected features are based on the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) approaches, respectively. These evolutionary algorithms are known to be effective in solving optimization problems. The classifiers used in this study are Naïve Bayes, k-Nearest Neighbor, Decision Tree, and Support Vector Machine, all of which were trained and tested using the NSL-KDD dataset. The performance of these classifiers was evaluated with each set of selected features. The experimental results indicate that detection accuracy is approximately 1.55% higher with the PSO-based selected features than with the GA-based selected features. The Decision Tree classifier trained with the PSO-based selected features outperformed the other classifiers, with accuracy, precision, recall, and f-score of 99.38%, 99.36%, 99.32%, and 99.34%, respectively. The results show that coupling optimal features with a good classifier in a detection system can reduce the classifier model building time, reduce the computational burden of analyzing data, and consequently attain a high detection rate.
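The four evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes the standard confusion-matrix definitions and that "f-score" means the F1 score (the harmonic mean of precision and recall); the abstract does not state either explicitly. The function names and the toy counts are hypothetical and are not taken from the study.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1), assumed here to be the paper's f-score."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F-score from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives, with "attack" treated as the positive class (an assumption).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score(precision, recall),
    }


# Toy counts, purely illustrative (not results from the study).
toy = classification_metrics(tp=8, fp=2, fn=2, tn=8)

# Consistency check on the figures reported in the abstract:
# precision 99.36% and recall 99.32% give an F1 of about 99.34%.
reported = f_score(0.9936, 0.9932)
print(round(reported * 100, 2))  # -> 99.34
```

Under the F1 assumption, the reported precision and recall for the Decision Tree classifier reproduce the reported f-score of 99.34%, which suggests the abstract's figures are internally consistent.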

Article Details

How to Cite
Mahmood RAR, Abdi A, Hussin M. Performance Evaluation of Intrusion Detection System using Selected Features and Machine Learning Classifiers. Baghdad Sci.J [Internet]. 2021 Jun. 20 [cited 2021 Aug. 3];18(2(Suppl.)):0884. Available from: https://bsj.uobaghdad.edu.iq/index.php/BSJ/article/view/6210