Comparing PCA-Based Machine Learning Algorithms for COVID-19 Classification Using Chest X-ray Images
DOI:
https://doi.org/10.21123/bsj.2024.9422
Keywords:
Chest X-ray (CXR), COVID-19, Decision Tree, Gaussian Naïve Bayes, Machine Learning, Stochastic Gradient Descent
Abstract
The rapid spread of the COVID-19 pandemic has strained global healthcare systems, necessitating efficient diagnostic methods. While Polymerase Chain Reaction (PCR) and antigen tests are common, they have limitations in speed and precision. Enhancing the accuracy of imaging techniques, especially Chest X-rays (CXR) and Computerized Tomography (CT) scans, is crucial for detecting COVID-19-related lung abnormalities. CXR, being cost-effective and accessible, is preferred over CT scans, but accurate diagnosis often requires technological support. To address this, an extensive dataset of CXR images categorized into five classes is available on Kaggle. The images are preprocessed through grayscale conversion, image intensity adjustment, and resizing, followed by feature extraction using Principal Component Analysis (PCA). Machine Learning (ML) techniques, including Decision Tree (DT), Random Forest (RF), Stochastic Gradient Descent (SGD), Logistic Regression (LR), Gaussian Naive Bayes (GNB), and K-Nearest Neighbors (KNN), are employed for image classification. DT shows the highest accuracy at 88%, outperforming GNB (77%), LR (74%), KNN (71%), SGD (70%), and RF (45%). It also scores best on the other assessment metrics, namely F1-score, sensitivity, and precision, with the highest weighted average of 88%. However, the optimal ML model depends on factors such as dataset characteristics and implementation specifics, so these factors need careful consideration when choosing an ML model for COVID-19 diagnosis via CXR image classification.
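The pipeline described in the abstract (grayscale conversion, intensity adjustment, resizing, PCA feature extraction, then six classifiers) can be sketched roughly as follows. This is a minimal illustration using scikit-learn on synthetic five-class images, not the authors' implementation: the image size, the number of principal components, the min-max intensity stretch, the train/test split, and all classifier hyperparameters are assumptions, since the abstract does not state them. The helper names `make_synthetic`, `preprocess`, and `evaluate` are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def make_synthetic(n_per_class=40, n_classes=5, side=64, seed=0):
    """Synthetic stand-in for the CXR dataset: RGB images in [0, 1], five classes."""
    rng = np.random.default_rng(seed)
    images, labels = [], []
    band = side // n_classes
    for c in range(n_classes):
        batch = rng.random((n_per_class, side, side, 3)) * 0.5
        # Brighten a class-specific horizontal band so the classes are separable.
        batch[:, c * band:(c + 1) * band] += 0.5
        images.append(batch)
        labels.append(np.full(n_per_class, c))
    return np.concatenate(images), np.concatenate(labels)


def preprocess(images, size=32):
    """Grayscale, intensity stretch, resize, flatten: (n, H, W, 3) -> (n, size*size)."""
    gray = images @ np.array([0.299, 0.587, 0.114])  # luminance-weighted grayscale
    # Per-image min-max stretch (assumed stand-in for intensity adjustment).
    lo = gray.min(axis=(1, 2), keepdims=True)
    hi = gray.max(axis=(1, 2), keepdims=True)
    gray = (gray - lo) / np.maximum(hi - lo, 1e-8)
    # Nearest-neighbour resize by sampling evenly spaced rows and columns.
    rows = np.linspace(0, gray.shape[1] - 1, size).astype(int)
    cols = np.linspace(0, gray.shape[2] - 1, size).astype(int)
    small = gray[:, rows][:, :, cols]
    return small.reshape(len(small), -1)


def evaluate(n_components=20):
    """Fit each classifier on PCA features; return {name: (accuracy, weighted F1)}."""
    models = {
        "DT": DecisionTreeClassifier(random_state=0),
        "RF": RandomForestClassifier(n_estimators=100, random_state=0),
        "SGD": SGDClassifier(random_state=0),
        "LR": LogisticRegression(max_iter=1000),
        "GNB": GaussianNB(),
        "KNN": KNeighborsClassifier(n_neighbors=5),
    }
    images, y = make_synthetic()
    X = preprocess(images)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=0
    )
    results = {}
    for name, clf in models.items():
        # PCA is fitted on the training split only, inside the pipeline.
        pipe = make_pipeline(
            StandardScaler(), PCA(n_components=n_components, random_state=0), clf
        )
        pipe.fit(X_tr, y_tr)
        pred = pipe.predict(X_te)
        results[name] = (
            accuracy_score(y_te, pred),
            f1_score(y_te, pred, average="weighted"),
        )
    return results


if __name__ == "__main__":
    for name, (acc, f1w) in evaluate().items():
        print(f"{name}: accuracy={acc:.2f}, weighted F1={f1w:.2f}")
```

Scores on this synthetic data say nothing about the reported 88% for DT; the sketch only shows how a weighted-average F1 and accuracy could be computed per model once PCA features are available. Placing PCA inside the pipeline keeps the projection from being fitted on test images.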
Received 12/09/2023
Revised 19/04/2024
Accepted 21/04/2024
Published Online First 20/08/2024
License
Copyright (c) 2024 Hussein Ahmed Ali, Walid Hariri, Nadia Smaoui Zghal, Dalenda Ben Aissa
This work is licensed under a Creative Commons Attribution 4.0 International License.