Comparing PCA-Based Machine Learning Algorithms for COVID-19 Classification Using Chest X-ray Images

Authors

  • Hussein Ahmed Ali, Microwave Electronics Research Laboratory, Faculty of Sciences of Tunis, University Tunis El-Manar, Tunis El-Manar, Tunisia & College of Computer Science and Information Technology, University of Kirkuk, Kirkuk, Iraq. https://orcid.org/0009-0002-0780-2658
  • Walid Hariri, Labged Laboratory, Department of Computer Science, Badji Mokhtar Annaba University, Annaba, Algeria.
  • Nadia Smaoui Zghal, Control and Energy Management Laboratory (CEM Lab), ENIS, University of Sfax, Sfax, Tunisia.
  • Dalenda Ben Aissa, Microwave Electronics Research Laboratory, Faculty of Sciences of Tunis, University Tunis El-Manar, Tunis El-Manar, Tunisia.

DOI:

https://doi.org/10.21123/bsj.2024.9422

Keywords:

Chest X-ray (CXR), COVID-19, Decision Tree, Gaussian Naïve Bayes, Machine Learning, Stochastic Gradient Descent

Abstract

The rapid spread of the COVID-19 pandemic has strained global healthcare systems, creating a need for efficient diagnostic methods. Polymerase Chain Reaction (PCR) and antigen tests are common, but they are limited in speed and precision. Improving the accuracy of imaging techniques, especially Chest X-rays (CXR) and Computerized Tomography (CT) scans, is crucial for detecting COVID-19-related lung abnormalities. CXR is cost-effective and accessible and is therefore preferred over CT scans, but accurate diagnosis often requires technological support. To address this, an extensive dataset of CXR images categorized into five classes is available on Kaggle. Processing such data involves grayscale conversion, image intensity adjustment, resizing, and feature extraction using Principal Component Analysis (PCA). Six Machine Learning (ML) techniques are employed for image classification: Decision Tree (DT), Random Forest (RF), Stochastic Gradient Descent (SGD), Logistic Regression (LR), Gaussian Naïve Bayes (GNB), and K-Nearest Neighbors (KNN). DT achieves the highest accuracy at 88%, outperforming GNB (77%), LR (74%), KNN (71%), SGD (70%), and RF (45%). It is also the strongest model on the other assessment metrics (F1-score, sensitivity, and precision), with a best weighted average of 88%. However, the optimal ML model depends on factors such as dataset characteristics and implementation specifics, so these factors need careful consideration when choosing an ML model for COVID-19 diagnosis via CXR image classification.
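The pipeline summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn and NumPy, uses synthetic five-class images in place of the Kaggle CXR dataset, and picks hypothetical settings (a 32×32 target size, min-max intensity rescaling, 20 principal components, a 75/25 split, default classifier hyperparameters) that the abstract does not specify. The helper names `preprocess`, `make_synthetic_images`, and `compare_models` are invented for this example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def preprocess(image, size=32):
    """Grayscale conversion, min-max intensity rescaling, nearest-neighbour resize, flatten."""
    gray = image.mean(axis=2) if image.ndim == 3 else image.astype(float)
    lo, hi = gray.min(), gray.max()
    gray = (gray - lo) / (hi - lo) if hi > lo else np.zeros_like(gray, dtype=float)
    rows = np.arange(size) * gray.shape[0] // size  # source row for each target row
    cols = np.arange(size) * gray.shape[1] // size  # source column for each target column
    return gray[np.ix_(rows, cols)].ravel()


def make_synthetic_images(n_per_class=40, n_classes=5, shape=(64, 64, 3), seed=0):
    """Stand-in data (assumption): noisy images with one brighter band per class."""
    rng = np.random.default_rng(seed)
    images, labels = [], []
    for label in range(n_classes):
        for _ in range(n_per_class):
            img = rng.normal(loc=100, scale=20, size=shape)
            img[label * 12:(label + 1) * 12, :, :] += 60
            images.append(np.clip(img, 0, 255).astype(np.uint8))
            labels.append(label)
    return images, np.array(labels)


def compare_models(X, y, n_components=20, seed=0):
    """Fit scaler + PCA + classifier on the training split; score on the held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    models = {
        "DT": DecisionTreeClassifier(random_state=seed),
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
        "SGD": SGDClassifier(random_state=seed),
        "LR": LogisticRegression(max_iter=1000),
        "GNB": GaussianNB(),
        "KNN": KNeighborsClassifier(n_neighbors=5),
    }
    results = {}
    for name, clf in models.items():
        pipe = make_pipeline(
            StandardScaler(), PCA(n_components=n_components, random_state=seed), clf
        )
        pipe.fit(X_train, y_train)
        y_pred = pipe.predict(X_test)
        # Weighted averages of precision, sensitivity (recall), and F1 across classes.
        precision, recall, f1, _ = precision_recall_fscore_support(
            y_test, y_pred, average="weighted", zero_division=0
        )
        results[name] = {
            "accuracy": accuracy_score(y_test, y_pred),
            "precision": precision,
            "sensitivity": recall,
            "f1": f1,
        }
    return results


images, y = make_synthetic_images()
X = np.stack([preprocess(img) for img in images])
results = compare_models(X, y)
for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Because the data here are synthetic, the printed scores say nothing about the 88% figure reported for DT; the sketch only shows how the preprocessing, PCA, and classifier stages fit together. Placing PCA inside the pipeline means the components are learned from the training split only, which avoids leaking test information into the features.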

How to Cite

1. Comparing PCA-Based Machine Learning Algorithms for COVID-19 Classification Using Chest X-ray Images. Baghdad Sci.J [Internet]. [cited 2024 Sep. 19];22(3). Available from: https://bsj.uobaghdad.edu.iq/index.php/BSJ/article/view/9422