COVID-19 Diagnosis System using SimpNet Deep Model

Abstract: After its outbreak, COVID-19 quickly escalated from an epidemic to a pandemic. Radiologic images from CT and X-ray have been widely used to detect COVID-19 by observing infrahilar opacity in the lungs. Deep learning has gained popularity in diagnosing many diseases, including COVID-19, and the rapid spread of the disease necessitates the adoption of deep learning for identifying COVID-19 cases. In this study, a deep learning model based on the design principles of SimpNet is proposed for the automatic detection of COVID-19 from X-ray images. The SimpNet architecture was adopted and trained on X-ray images. The model was evaluated on both binary (COVID-19 and No-findings) and multi-class (COVID-19, No-findings, and Pneumonia) classification tasks. It achieved an accuracy of 98.4% for binary and 93.8% for multi-class classification. The model has 11 million parameters, fewer than some state-of-the-art methods, while achieving higher accuracy.

can influence the severity of pneumonia, such as old age and chronic diseases like asthma and bronchitis. Immunocompromised people are also more susceptible to pneumonia. Pneumonia treatment depends on the organism that caused the infection, but antibiotics, cough medicines, antipyretics, and analgesics are commonly used 4. Depending on their symptoms and health condition, patients are hospitalized; in extreme cases, they are admitted to an intensive care unit (ICU) and placed on a mechanical ventilator to help them breathe 3. The high transmissibility of the coronavirus among both symptomatic and asymptomatic patients makes the disease spread quickly and the number of infected people rise. Moreover, health-sector resources, such as the health workforce and the number of mechanical ventilators, are limited. Therefore, early diagnosis of the virus is a critical step in slowing the spread of COVID-19 4. The most typical symptoms of SARS-CoV-2 infection include dry cough, fever, dyspnea, myalgia, and severe headache; less common symptoms include hemoptysis and diarrhea, and severe cases lead to pneumonia. Additionally, some patients in poor health are admitted to the ICU 3.
One method of diagnosing COVID-19 is the reverse transcription-polymerase chain reaction (RT-PCR) technique, which detects SARS-CoV-2 ribonucleic acid (RNA) in nasopharyngeal or oropharyngeal specimens. Although RT-PCR is considered the gold standard for diagnosing COVID-19, it has limitations, such as low sensitivity in detecting the virus and a high false-negative rate 5.
Machine learning techniques, including deep learning, are widely used in many health-related problems 6. Several deep learning-based systems have been presented to detect COVID-19 from X-ray images. In the current work, the design principles provided by the authors of SimpNet 7 are followed to identify COVID-19 in both binary and ternary settings. SimpNet has already surpassed some state-of-the-art approaches such as DenseNet, VGGNet, and ResNet in image classification problems. It also has 2 to 50 times fewer parameters than deeper architectures while attaining superior accuracy on de facto benchmark datasets. The intuition behind choosing SimpNet is to avoid model complexity while improving accuracy, and to group the layers of the convolutional neural network (CNN) differently from previous models. Hasanpour et al. 7 presented basic principles for designing an efficient CNN architecture and conducted extensive experiments for each principle they claimed to be essential. They also introduced a new layer called SAF pooling, which is a max-pooling layer placed before the dropout layer. One of their base architectures was selected and trained from scratch, as our dataset is entirely different from theirs. The principles followed in the proposed work include stacking similar layers to capture discriminative features, using larger kernels rather than smaller ones in the early layers, and changing training policies instead of changing the CNN architecture. In addition, dropout is applied after each Conv layer to improve accuracy and generalization, and Simple Adaptive Feature (SAF) pooling is applied before the dropout operation.
The main contribution of this study is two-fold: using fewer parameters than state-of-the-art deep learning models, and applying a set of design principles to build a CNN model that requires less computational power to train.

Literature Review:
Because of the pitfalls of RT-PCR in detecting COVID-19, radiological images are preferable for slowing the spread of this pandemic. The faster, standard imaging tests for identifying pneumonia are chest X-ray (CXR) and computed tomography (CT) scans. Compared to CT scans, CXR is less precise and has a higher misidentification rate. Despite that, CXR is still beneficial, as it is faster and exposes the patient to less radiation. Moreover, some studies suggest the use of chest radiography for COVID-19 screening 5, 8. Hence, imaging techniques, which offer higher sensitivity, are considered an alternative to the PCR method for confirming the presence of the novel coronavirus. Nevertheless, confirming COVID-19 pneumonia through visual inspection is a difficult task for radiologists, as they have to look for ground-glass opacity (GGO) in the lungs 8. Therefore, deep learning has been suggested as an automated tool for radiologic imaging of human organ systems such as the chest, brain, and musculoskeletal system 9, 8. One advantage of leveraging radiologic imaging for COVID-19 screening is the fast triaging of suspected patients 5. The aim of this study is to develop a deep learning model that requires less computational power and achieves higher accuracy than existing approaches.
After the outbreak of COVID-19, numerous studies proposed customized convolutional neural network (CNN) architectures, either designed from scratch or built on existing ones through transfer learning. Apostolopoulos et al. 10 evaluated the performance of several state-of-the-art CNN architectures through transfer learning. The evaluated architectures include VGG19 11, MobileNet 12, Inception 13, Xception 14, and Inception ResNet v2 15. The best performance was obtained with VGG19, with accuracies of 98.75% and 93.48% for binary and multi-class classification, respectively. Inspired by the architecture proposed in 16 for image recognition, Wang et al. 5 designed a novel CNN architecture for COVID-19 detection from X-ray images. Because the available data was not sufficient to train the CNN, they applied the augmentation techniques of rotation, translation, flipping, and shifting. They also produced a new dataset, COVIDx, which consists of 13,800 X-ray images. The method was evaluated in terms of sensitivity and positive predictive value.
Similarly, Alom et al. 17 created a customized CNN architecture for classifying COVID-19, Pneumonia, and Normal images. The approach builds on the work of Alom et al. 15, which was designed for object recognition. With some modifications to the original architecture, the achieved accuracies are 84.67% and 98.78% for X-ray and CT images, respectively. Ozturk et al. 18 presented a new CNN architecture motivated by the DarkNet CNN structure of Redmon and Farhadi 19. The method achieved an accuracy of 87.02% for multi-class and 98.08% for binary classification on X-ray images. Motivated by the Xception architecture 14, Khan et al. 20 proposed the CoroNet CNN model for COVID-19 detection. The model was validated with 4-fold cross-validation and evaluated in terms of precision, recall, specificity, F-measure, and accuracy. The accuracy for 4-class classification (bacterial pneumonia, viral pneumonia, COVID-19, and Normal) is 89.6%; the classification accuracy is 95% for 3-class and 99% for binary classification. Toraman et al. 21, on the other hand, used a capsule network to classify COVID-19, No-findings, and Pneumonia. They claim that the capsule network performed the classification effectively with a limited number of images in the dataset. The accuracies achieved with the capsule network are 97.24% and 84.22% for binary and multi-class classification, respectively. Similarly, Khobahi et al. 22 presented an approach that works well with limited data. They used semi-supervised learning and autoencoders for COVID-19 identification. The suggested architecture has 11.8 million parameters and achieved an accuracy of 93.5% for multi-class classification.

Materials and Methods:
Hasanpour et al. 7 presented a set of principles for designing an efficient CNN architecture and showed the results empirically on different baseline datasets. They introduced a new layer called "SAF-pooling" to improve generalization and boost the discrimination power of the model. SAF-pooling is a max-pooling layer placed before the dropout operation. The principles they provide are gradually expanding the network, grouping homogeneous layers rather than heterogeneous ones, avoiding 1x1 kernels in early layers, avoiding excessive shrinking of the feature-map size in the final layers, and avoiding downsampling in early layers. They also recommend changing the training policies, such as regularization and learning rate, rather than making the network more complex. In addition, a small amount of dropout can be applied after each Conv layer to improve accuracy and generalization. Figs. 1 and 2 illustrate a typical SimpNet architecture and the block diagram of the proposed system, respectively.

The adopted architecture, with some modifications, consists of 10 stacked Conv layers, each followed by batch normalization, a ReLU layer, and dropout, together with max-pooling layers. The proposed model is trained from scratch, as our dataset differs from the dataset used by the SimpNet authors. The weights are updated with the Adam optimizer, using a learning rate of 0.01 and a categorical cross-entropy loss function. The dropout layers use a small ratio of 0.2. A batch size of 10 and a patience of 10 are used. Figure 3 shows the graphical representation of the proposed model.

Dataset:
The dataset used to evaluate our model is publicly available and was originally collected from two different sources. The first part was obtained by Cohen et al. 23 and contains X-ray images of COVID-19; it originally comprised 127 X-ray images and is continuously updated. Not all patient information is given in this dataset; only the gender and age of the patients are provided. The second part was developed by Wang et al. 24 and contains X-ray images of both normal and pneumonia cases. Ozturk et al. 18 randomly chose 500 images from each of the normal and pneumonia sets to create a new dataset for researchers around the world. At present, the dataset of Ozturk et al. 18 used for our model has 500 images in each of the COVID-19, Normal, and Pneumonia classes. Fig. 4 shows three sample images, one each of COVID-19, No-findings (Normal), and Pneumonia.
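Since the parameter count is central to the comparison with other models, the following minimal sketch shows how the trainable parameters of a stack of Conv and batch-normalization layers, as described in this section, can be tallied. The channel widths, the 3x3 kernel size, and the three-channel input are hypothetical values chosen only for illustration; they are not the configuration of the proposed model, and the final classification layer is not counted.

```python
from dataclasses import dataclass


@dataclass
class ConvBlock:
    """One Conv -> BatchNorm -> ReLU -> Dropout block.

    ReLU and dropout have no trainable weights, so only the
    convolution and the batch normalization contribute parameters.
    """
    in_channels: int
    out_channels: int
    kernel_size: int = 3  # assumption: square k x k kernels


def conv_params(block: ConvBlock) -> int:
    # Each filter has k*k*in_channels weights plus one bias.
    k = block.kernel_size
    return k * k * block.in_channels * block.out_channels + block.out_channels


def batchnorm_params(block: ConvBlock) -> int:
    # One learnable scale and one learnable shift per output channel.
    return 2 * block.out_channels


def count_trainable_params(widths, in_channels=3, kernel_size=3) -> int:
    """Sum the trainable parameters of a plain stack of ConvBlocks."""
    total = 0
    previous = in_channels
    for width in widths:
        block = ConvBlock(previous, width, kernel_size)
        total += conv_params(block) + batchnorm_params(block)
        previous = width
    return total


# HYPOTHETICAL widths for a 10-layer stack; not taken from the paper.
HYPOTHETICAL_WIDTHS = [64, 128, 128, 128, 128, 256, 256, 256, 512, 512]

if __name__ == "__main__":
    print(count_trainable_params(HYPOTHETICAL_WIDTHS))
```

Because the convolution term grows with the product of input and output channels, the widest layers dominate the total, which is why keeping the later layers narrow has the largest effect on model size.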

Preprocessing and Data Augmentation
The images in the dataset do not all have the same dimensions. Although some information might be lost, the images were resized to 128x128 pixels to speed up training. CNNs require a large amount of data to achieve good results and to avoid overfitting during training, so the images were augmented with several data augmentation strategies. The utilized data

Implementation Environment
Training was done on an NVIDIA GeForce Titan X (GTX) 1050 GPU, on a machine with 16 GB of RAM and an Intel Core i7+ processor. For high-performance GPU acceleration, cuDNN was used; it accelerates many deep learning frameworks, including Caffe and TensorFlow.

Experimental Results
The experiments were conducted on X-ray images to detect COVID-19 in two scenarios. In the first scenario, the model was trained to classify the images into two classes: COVID-19 and Normal. In the second scenario, the model was trained on a three-class classification task: COVID-19, Normal, and Pneumonia.

Performance Evaluation
The proposed approach was evaluated with 5-fold cross-validation, meaning that the experiments were conducted five times. The dataset was divided into five parts: four for training the model and one for validation. In other words, 80% of the data was used for training and 20% for validation; in each fold, the training and validation parts change, as shown in Fig. 6. The training and validation loss for the first fold of the multi-class task, together with the validation accuracy, is shown in Fig. 7. The average training time per fold was approximately 7 hours and 48 minutes for the three-way classification task and around 4 hours and 16 minutes for binary classification. The trained model takes 36 seconds to load and classify a given X-ray image.

The performance of the model was evaluated for both the binary and multi-class tasks in each fold. Fig. 8 depicts the performance of the model for the 3-class task as a confusion matrix (CM) for each fold. The average classification accuracy is 93.8% for COVID-19, Normal, and Pneumonia classification. Tab. 1 shows the precision, recall, and F1-score for each fold of the multi-class task. It can be noted from Tab. 1 that the model obtained its best precision, sensitivity, F1-score, and accuracy on fold 1, with values of 95%, 94.66%, 94.66%, and 95%, respectively. Tab. 2 presents the average precision, recall, and F1-score for each of the COVID-19, Normal, and Pneumonia classes. It can be seen from Tab. 2 that the performance for the COVID-19 class is comparatively higher than for the Normal and Pneumonia classes. The performance metrics of precision, recall (sensitivity), F1-score, and accuracy are calculated as follows, where TP, FP, TN, and FN denote the numbers of true positives, false positives, true negatives, and false negatives:

\[ \text{Precision} = \frac{TP}{TP + FP} \]
\[ \text{Recall} = \frac{TP}{TP + FN} \]
\[ \text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \]
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \]

Fig. 9 depicts the accuracy and loss on the training and validation images over 100 epochs for the first fold of the binary task.
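The 5-fold protocol described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the text does not say whether the folds were stratified by class, so the stratification here is an assumption, and the function name `stratified_k_fold` and the fixed seed are hypothetical. The label list simply mirrors the 500 images per class reported for the dataset.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_indices, val_indices) for each of k folds.

    Assumption: every class is spread evenly across the folds,
    so each validation part keeps the class balance of the dataset.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    # Shuffle each class and deal its samples round-robin into the folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)

    # Each fold serves once as validation; the other k-1 form the training set.
    for i in range(k):
        val = sorted(folds[i])
        train = sorted(
            index
            for j, fold in enumerate(folds)
            if j != i
            for index in fold
        )
        yield train, val


# 500 images per class, as reported for the dataset.
labels = ["COVID-19"] * 500 + ["Normal"] * 500 + ["Pneumonia"] * 500
splits = list(stratified_k_fold(labels, k=5))

if __name__ == "__main__":
    for fold_number, (train, val) in enumerate(splits, start=1):
        print(fold_number, len(train), len(val))
```

With 1,500 images and five folds, each split contains 1,200 training and 300 validation images, which matches the 80%/20% division stated in the text.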
In Fig. 9, it can be observed from the right-hand panel that the gap between training and validation accuracy is small and reaches its minimum at the last epoch. This indicates the absence of overfitting, which is attributed to the dropout layer after each Conv layer that helps the model generalize well. Fig. 10 shows the confusion matrices for the 5-fold cross-validation of the binary classification. The precision, recall, F1-score, and accuracy of the binary classification are shown in Tab. 3. The results for the first and last folds are the same, with 99% for precision, recall, F1-score, and accuracy. Tab. 4 lists the precision, recall, F1-score, and accuracy for each of the COVID-19 and Normal classes; the COVID-19 class achieved higher accuracy than the Normal class.

Discussion:
The proposed model achieved an accuracy of 93.8% and 98.4% for the multi-class and binary classification tasks, respectively. The achieved results compare favorably with other state-of-the-art studies. Tab. 5 shows the performance of other studies in the literature compared to our approach. The underlying reasons for the achieved results include grouping similar layers, using dropout after each Conv layer, and using the SAF pooling layer in the proposed model. Applying dropout after all Conv layers prevents overfitting, a problem that most CNN models suffer from, and allows the model to generalize well to unseen data. Another reason for the superiority of the proposed method is the symmetric stacking of layers, as each group of similar layers is responsible for a specific task. Large kernels are also used in the early layers of the model to capture more valuable information. The proposed method outperforms the pre-trained deep learning models listed below in terms of both the number of parameters and the obtained accuracy.
Unlike other CNN models, which group heterogeneous layers when designing the architecture, the proposed model groups homogeneous layers. Additionally, SAF pooling is used instead of plain max-pooling or average-pooling to provide better performance, and larger kernels are employed in the early layers to extract valuable features. Moreover, downsampling in the early layers is avoided to preserve the discriminative power of the model. The model has 11.12 million parameters, which is modest compared with other models in the literature, while achieving higher accuracy and requiring less computational power.

Table 5. Comparison of the proposed method with other studies in the literature.

Method        Ref.  Parameters  Binary accuracy (%)  Multi-class accuracy (%)  Image type
[...]         21    11.57 M     97.24                84.22                     X-ray
COVID_MTNet   17    34 M        -                    84.67                     X-ray
COVID-Net     5     11.74 M     -                    83.5                      X-ray
CoroNet       20    33.96 M     99                   89.6                      X-ray
CoroNet2      22    11.8 M      -                    93.5                      X-ray
Our method          11.12 M     98.4                 93.8                      X-ray
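The per-class precision, recall, and F1-score and the overall accuracy reported in the evaluation are all derived from a confusion matrix. The sketch below shows one common way to compute them in a one-vs-rest fashion for a multi-class matrix. The function name `per_class_metrics` is hypothetical, and the example matrix contains made-up toy values for illustration only; it is not one of the confusion matrices from this study.

```python
def per_class_metrics(cm):
    """Compute per-class precision, recall, F1 and overall accuracy.

    cm[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for c in range(n):
        tp = cm[c][c]
        # Predicted as c but belonging to another class.
        fp = sum(cm[r][c] for r in range(n)) - tp
        # Belonging to c but predicted as another class.
        fn = sum(cm[c]) - tp
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        results.append({"precision": precision, "recall": recall, "f1": f1})
    # Overall accuracy is the share of samples on the diagonal.
    accuracy = sum(cm[c][c] for c in range(n)) / total
    return results, accuracy


# TOY confusion matrix (rows: true class, columns: predicted class).
# These numbers are invented for illustration, not taken from the paper.
toy_cm = [
    [98, 1, 1],   # COVID-19
    [2, 90, 8],   # Normal
    [0, 6, 94],   # Pneumonia
]

metrics, accuracy = per_class_metrics(toy_cm)

if __name__ == "__main__":
    for name, m in zip(["COVID-19", "Normal", "Pneumonia"], metrics):
        print(name, round(m["precision"], 4), round(m["recall"], 4), round(m["f1"], 4))
    print("accuracy", round(accuracy, 4))
```

Averaging the per-class values over the classes gives macro-averaged scores; whether the reported per-fold figures are macro-averaged or weighted is not stated in the text.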

Conclusion:
In this study, we proposed a CNN-based model built on the SimpNet architecture to perform binary and 3-class identification of X-ray images for COVID-19 detection. Our model has fewer parameters than some state-of-the-art methods, which makes it less computationally expensive. Furthermore, the proposed model achieved an accuracy of 93.8% for the multi-class and 98.4% for the binary classification task. The model also outperforms other customized deep learning models, as shown by the cross-validation results and the confusion matrices for each fold. To improve the performance of the proposed model, various techniques could be used to avoid the loss of information that occurs when the images are resized before being given to the model.