Abstract

Lung diseases have become more prevalent in recent years, with lung cancer the most serious among them, as a result of the widespread use of electronic cigarettes and the high demand for them among youth. Misdiagnosis of this disease has contributed to a high death rate worldwide, which further complicates matters, and researchers have therefore concentrated on studying it. After extracting the features crucial for classifying this cancer, several studies applied artificial intelligence algorithms with different techniques; nonetheless, the primary difference between those studies remained the classification accuracy. In this paper, data pre-processing methods were selected, the critical features were extracted, and the strength of the associations between the data variables was evaluated using the correlation coefficient method. The K-Nearest Neighbor (KNN) algorithm, enhanced using the Random Forest (RF) algorithm, was then implemented and tested for classifying lung cancer. Experiments on a lung cancer data set collected from the UCI Machine Learning Repository, used to diagnose three types of cancer, yielded a competitive classification accuracy of higher than 97.4%. This competitive accuracy motivates us and other researchers to use the proposed system to diagnose other types of diseases.

Keywords

Artificial intelligence, Classification, Enhanced k-nearest neighbor classifier, Lung cancer, Outliers

Subject Area

Computer Science

Article Type

Article

First Page

3503

Last Page

3512

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.
