Exploring Important Factors in Predicting Heart Disease Based on Ensemble-Extra Feature Selection Approach



Abstract
Heart disease is a significant and impactful health condition that ranks as the leading cause of death in many countries. To aid physicians in diagnosing cardiovascular diseases, clinical datasets are available for reference. However, with the rise of big data and medical datasets, it has become increasingly challenging for medical practitioners to accurately predict heart disease because of the abundance of unrelated and redundant features, which increase computational complexity and reduce accuracy. This study therefore aims to identify the most discriminative features within high-dimensional datasets while minimizing complexity and improving accuracy through an Extra Tree-based feature selection technique. The study assesses the efficacy of several classification algorithms on four reputable datasets, using both the full feature set and the reduced feature subset selected through the proposed method. The results show that the feature selection technique achieves outstanding classification accuracy, precision, and recall, with an impressive 97% accuracy when used with the Extra Tree classifier algorithm. The research reveals the promising potential of the feature selection method for improving classifier accuracy by focusing on the most informative features while simultaneously decreasing the computational burden.

Introduction
The Electronic Health Record (EHR) technique has revolutionized the healthcare industry, generating an abundance of clinical data [1][2][3]. However, this wealth of information poses a significant challenge for disease prediction, as the sheer volume of data and associated features can be overwhelming 4. This is particularly true for time-sensitive tasks, such as predicting mortality. The healthcare domain is home to numerous medical databases storing vast clinical records, but not all of these are relevant to the predictive task at hand. With the rise of big data and the proliferation of medical records, decision-making based on multiple features has become increasingly complex, with redundant and irrelevant attributes complicating matters [5][6][7][8]. These extraneous features introduce noise, leading to inaccuracies in predictions and increased computational overhead 9,10.
Cardiovascular disease datasets are of significant concern, given the high mortality rates associated with heart disease worldwide 8,9,11. According to the World Health Organization (WHO), the number of deaths from cardiovascular disease is estimated to approach 30 million by the year 2040, as indicated in studies 12 and 13. However, diagnosing the disease using extensive datasets is increasingly difficult due to the vast number of features and data samples. Data mining methods offer an efficient means to identify individuals at higher risk of heart disease early on and to enhance prediction accuracy. In particular, feature selection methods hold great promise in extracting essential features and discerning patterns from complex healthcare datasets 14,15. Additionally, feature selection improves model performance by reducing complexity and increasing prediction accuracy, which is critical in medical diagnosis 16. Various feature selection methods, such as filter, wrapper, and embedded approaches, have emerged to address the dimensionality challenge [17][18][19].
The study conducted by 19 employed a variety of feature selection techniques, including principal component analysis, the Chi-squared (χ2) statistical test, relief, and symmetrical, to identify the best feature subsets for predicting heart disease. However, the experiment was conducted using only one dataset. Furthermore, the chi-square method evaluates features based solely on their association with the class label, which may result in reduced effectiveness when interpreting classifier models. These limitations could potentially cause delays in the learning process and lead to a decline in accuracy. Thus, this study utilizes the ensemble-based extra tree feature selection method to identify the most influential features in heart disease prediction. This research holds the potential to improve diagnosis and pave the way for better treatment strategies 20. The proposed method uses an extra tree classifier's built-in feature importance functionality to determine the relevance of features to the primary classifier. This functionality evaluates the individual significance of each feature in making predictions within the model. The assessment is made by calculating the Gini Index as a measure of feature separation in the data. Features are then ranked based on their relative importance with respect to the class label. Features with higher scores are considered more critical, and hence more effective and informative, whereas those with lower scores are less influential in relation to the class target 20. The significant contributions of this study are outlined as follows:
1. To execute an optimal feature selection model using an ensemble-based extra tree method. This method extracts the attributes that possess the highest level of information in order to enhance the accuracy of predictions and decrease the cost associated with complexity overhead.
2. To build classification models based on a 10-fold cross-validation approach with the selected features and the full features of the datasets.
3. To investigate the impact of the proposed method on classification performance by comparing the performance metrics of the classification models built with the selected and the full features of the heart disease datasets used in this study.
The remainder of the paper is organized as follows: Section 2 provides background information, while Section 3 outlines the research methodology. Section 4 presents the research results, followed by Section 5, which includes the discussion. Finally, Section 6 concludes with a summary of findings and future research directions. The focus is on creating an optimal feature selection model that leverages an ensemble of extra trees to extract the most informative attributes, with the aim of enhancing prediction accuracy while minimizing the complexity overhead cost. Heart attacks and strokes are the leading causes of death from heart disease worldwide 8.

Related Work
Numerous studies have employed machine learning techniques to diagnose heart diseases with promising results 21. One such study 21 utilized an ensemble of classifiers to predict heart disease using the Cleveland heart dataset from the UCI machine learning repository. Their experiment demonstrated a significant increase in prediction accuracy.
The author of 22 investigated the benefit of using machine learning in predicting heart disease by implementing an improved logistic regression classification model to predict whether or not a patient has heart disease.
The study conducted by 23
The explosion of medical datasets has yielded high-dimensional data, posing complexities for effective analysis and interpretation, which can cause processing delays and a decrease in the model's classification performance. This is largely due to the presence of irrelevant and redundant features 8. As a result, many researchers have integrated feature selection methods into the classification process to identify the most impactful features within datasets that influence disease outcomes. By selecting these features, computational overhead can be reduced and accuracy can be enhanced.
For example, in the study of 18, the researchers developed a predictive system for diagnosing heart disease that combined machine learning and artificial intelligence. This system incorporated seven different classifier algorithms (logistic regression, K-NN, ANN, SVM, NB, DT, and random forest) along with three feature selection techniques: Relief, mRMR, and LASSO. These techniques were used to identify the most important features for predicting heart disease. The results of their study showed that using the Relief algorithm significantly improved the performance of the logistic regression classifier, which achieved an accuracy rate of 89%. However, the researchers used only one dataset to assess the benefits of feature selection.
Notably, another investigation, conducted by 19, used only a single dataset to examine the impact of feature selection on classification performance.
In a study investigating heart stroke risk assessment, the author of 24 proposed a novel feature selection method named "weighting-and-ranking-based hybrid feature selection" (WRHFS). This technique integrates multiple filter-based approaches, such as Information Gain, Fisher score, and standard deviation, to score and rank features. Leveraging prior knowledge, WRHFS identified 9 crucial features out of 28 for predicting stroke risk. Nevertheless, information gain (IG) chooses features by considering their relevance to the target class without involving a classifier model, which potentially renders these features less effective in interpreting classifier models, as highlighted by the study of 25.
The work done by 26 selected features related to the class target for diagnosing heart disease. The result of their experiment shows that the accuracy rate increased from 85.29% to 89.7% with the proposed method and the SVM classifier algorithm. However, the chi-square method chooses features by considering their association with the class label, without involving a classifier model. Consequently, these features are not necessarily the most effective for interpreting classifier models 25.
The study of 8 evaluated the impact of feature selection on machine learning models for heart disease prediction. The ANOVA F-test was employed to select the most critical features from the dataset, with the goal of enhancing prediction accuracy. The experimental results revealed that employing feature selection techniques led to improved performance in machine learning models compared to models utilizing the entire feature set. This not only reduced computational complexity but also enhanced the accuracy of the prediction models.
In a related study 7 , an ensemble classification model based on a feature selection approach was developed to identify the most relevant features related to the target class.The proposed model achieved an impressive accuracy rate of 97.57% on the datasets they considered.Their findings demonstrated that the utilization of feature selection approaches notably enhanced the performance of the classification model.
These findings underscore the value of feature selection techniques in improving the performance of different classification algorithms for diagnosing heart disease. They address issues related to noisy features and dependencies within heart disease datasets that can influence the diagnostic process.
In this paper, we propose an ensemble extra tree-based feature selection algorithm to select a subset of features for training machine learning algorithms on the datasets to diagnose patients with heart disease. Selecting features with decision tree-based methods is notably quicker and more straightforward than techniques such as Fisher's score and F-score. A significant drawback of Fisher's score and F-score is that they calculate each feature's score independently, without considering mutual information among features 27.
In contrast, the Extra Trees classifier evaluates all features collectively when categorizing data. This approach acknowledges the potential for certain feature combinations to outperform high-scoring individual features 28,29, which is why the Extra Trees classifier is used as a feature selector in this study.

Experimental Design
In this study, datasets related to heart disease were obtained from various sources through the Google dataset tool. The acquired datasets underwent two key pre-processing steps: (i) data cleaning, addressing missing and erroneous values; and (ii) data normalization, facilitating optimal performance of machine learning models. The extra tree algorithm was employed for feature selection, extracting the most crucial features relevant to the class target. Subsequently, machine learning models were trained using the selected feature subsets alongside the complete feature dataset.
All classifier models were evaluated with several evaluation metrics through a 10-fold cross-validation approach, in which the dataset is divided into 10 groups. The model is trained using nine of these groups, with the remaining group serving for performance evaluation. This process is iterated 10 times, each time using a different group as the test set. All experiments were implemented using Python and the scikit-learn library. Fig. 1 provides an illustration of the experiment phases, detailed in subsequent sections.
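The fold rotation described above can be sketched with scikit-learn's `KFold`. This is only an illustration under stated assumptions: the 303-row index array stands in for a dataset of the size of the Heart-disease dataset, and the `shuffle` and `random_state` settings are not taken from the paper.

```python
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical stand-in for a dataset with 303 samples; only the row
# indices matter for showing how the folds rotate.
n_samples = 303
indices = np.arange(n_samples)

# shuffle and random_state are assumptions; the text does not state them.
kfold = KFold(n_splits=10, shuffle=True, random_state=0)

fold_sizes = []
times_in_test = np.zeros(n_samples, dtype=int)
for train_idx, test_idx in kfold.split(indices):
    # Nine groups are used for training, the remaining group for testing.
    fold_sizes.append((len(train_idx), len(test_idx)))
    times_in_test[test_idx] += 1

print(fold_sizes)
```

Each sample lands in the test group exactly once across the 10 iterations, which is what allows the per-fold metrics to be averaged into a single estimate.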

Datasets and Features
Four datasets were collected randomly from the Google dataset website. Table 1 presents a description of the datasets used in this study. The first dataset is called Heart-disease and contains 303 samples and 14 attributes (age, sex, cp, trestbps, chol, fbs, restecg, thalach, exang, oldpeak, slope, ca, thal, num). The last attribute (num, the class label or 'target') represents the output of the prediction, in which the presence of heart disease is indicated by 1 and the absence of heart disease is denoted by 0. The dataset was used in the study of 30 and is available on the Kaggle website. The original dataset contains 76 attributes, but most previous studies used a subset of 14 features. The features of all datasets used in this study are explained in detail in Table 2 below 18. Some samples of this dataset and its attributes are displayed in Fig. 2 below.

Troponin: The level of protein found in the muscles of the heart (Numerical).

All the datasets were checked for missing values and duplicated samples during preprocessing. However, none of the datasets contained any missing values or duplicated samples.

The Feature Selection Approach with an Extra Tree Classifier
In this study, an ensemble-based Extra Tree Classifier feature selection method was applied to all datasets to identify the most discriminative features associated with the target class. Subsequently, new datasets were generated based on the subset of features obtained through the Extra Tree (EX) algorithm. Seven different classifier algorithms were employed to build classifier models using these feature subsets. Additionally, classifier models were constructed using the datasets containing all available features, that is, without applying the proposed approach. The study then compared the feature subset dataset selected using the EX method with the dataset containing all features to identify the impact of the proposed method on classification performance.
To evaluate the performance of these classifiers, a 10-fold cross-validation approach was utilized (k = 10) to limit overfitting when evaluating the effectiveness of the classification models. The data is initially separated into 10 parts, of which 9 folds are used to train the model and the remaining fold is used for testing. Each classifier was therefore executed 10 times to ensure that every portion of the split dataset was seen. The specific classification algorithms used are discussed in detail in the following section.
Extra Trees is short for extremely randomized trees. The Extra Tree Classifier is a type of ensemble learning technique that combines the outcomes of multiple decision trees, each of which is constructed from the original training data without subsampling or replacement. This differs from the Random Forest classifier, which employs bootstrap replicas and subsamples the input data with replacement during tree construction. The method uses the "feature importance" of an Extra Tree Classifier to select the relevant features 31. It calculates impurity-based feature importance, enabling the selection of relevant features while discarding irrelevant ones. Each feature in the dataset is assigned a score between 0 and 1 through feature importance. Higher scores indicate greater relevance to the output variable. This score facilitates the identification of the most important features for model development. Feature importance is built into tree-based classifiers and employs a meta-estimator that fits randomized decision trees (also known as extra trees) on various data subsamples.
Averaging is used to enhance predictive accuracy and control overfitting while computing feature importance, which is subsequently employed for feature selection. Each decision tree examines its candidate features and picks the most impactful feature within the subset. This feature then guides the data split, using the Gini Index to ensure the most informative separation. Averaging over the trees is then used to determine the important features, which reduces the variance of the estimates, and these features are employed in feature selection. Features are ranked in descending order based on their Gini Importance scores, with higher scores signifying greater importance. To select a specific number of top features (e.g., the top 5 features), the "largest (n)" function is employed, where "n" is set to the desired number of features. The extra tree algorithm is used for implementing feature selection in this study for the following reasons:
• Lowering variance: Extra-Tree has a lower variance compared to other algorithms due to the averaging of ensemble trees 31.
• Reducing computational complexity and improving scalability: The algorithm is able to deal with very large-scale data mining applications with a huge number of features and training samples. The algorithm gains speed by using the same procedure repeatedly but sacrifices accuracy by choosing random, non-optimal split points 32,33.
• Robustness towards non-relevant and redundant inputs: The trees in the extra tree algorithm are built to be robust towards non-relevant attributes as long as the number of randomized trees is sufficiently large relative to the number of samples [32][33][34].
https://doi.org/10.21123/bsj.2024.9711 P-ISSN: 2078-8665 - E-ISSN: 2411-7986 Baghdad Science Journal
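A minimal sketch of the ranking step described above, using scikit-learn's `ExtraTreesClassifier` and its `feature_importances_` attribute. Several things here are assumptions rather than details from the paper: the data is synthetic, the column names `f0` to `f12` are placeholders, `n_estimators` and `random_state` are arbitrary, and pandas' `Series.nlargest` is used on the guess that it is what the text's "largest (n)" function refers to.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier

# Synthetic stand-in for a heart disease table: 303 rows, 13 features.
# The column names f0..f12 are hypothetical placeholders.
X_arr, y = make_classification(
    n_samples=303, n_features=13, n_informative=5, n_redundant=2,
    random_state=0,
)
X = pd.DataFrame(X_arr, columns=[f"f{i}" for i in range(13)])

# n_estimators and random_state are assumptions; criterion="gini" matches
# the Gini-based importance described in the text.
selector = ExtraTreesClassifier(n_estimators=100, criterion="gini", random_state=0)
selector.fit(X, y)

# Impurity-based importances, averaged over the trees and normalised so
# that the scores sum to 1.
importances = pd.Series(selector.feature_importances_, index=X.columns)

# Keep the n highest-scoring features, in descending order of importance.
top_features = importances.nlargest(5)
X_selected = X[top_features.index]

print(top_features)
```

The reduced table `X_selected` is what would then be passed to the downstream classifiers in place of the full feature set.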

The Significant Feature Subsets Selected
As previously mentioned, the selection of features was based on the feature importance property. An extra tree classifier was used to assign scores to the input features by means of a predictive model that computed the Gini Index. This determined which features were most relevant to the class label: those with the highest scores were deemed the most important, while those with the lowest values were considered the least important.
The figure below (Fig. 6) highlights the features that hold the most weight in determining the presence or absence of heart disease based on the available dataset (Heart-disease dataset).

Figure 6. Features subset selected for the Heart-disease dataset
As displayed in Fig. 6 above, the number of major vessels colored by fluoroscopy (ca) is the most prominent feature related to the class label, with a score of 0.1258. The chest pain type trestbps attribute comes second with a weight of 0.1204, followed by the exang and thal features. Oldpeak, which represents ST depression induced by exercise relative to rest, is the least important of the selected features, as it is ranked last. The study in 9 reported that the chest pain feature is the most significant feature in predicting heart disease.
The significant features selected for the second dataset are demonstrated in Fig. 7 below.

Figure 7. Features subset selected for the second dataset (Heart Disease Prediction)
As illustrated in Fig. 7 above, Thalium has the highest correlation with the possibility of heart disease.
As reported by the study of 35, people with a Thalium value of 2 are more likely to have heart disease. The number of vessels (CA) is another factor in heart disease; people with the lowest value of CA are more likely to have heart disease. Chest pain comes third and, as explained by 35,36, a serious chest pain condition is considered a significant symptom of heart disease. MaxHR, which represents the maximum heart rate achieved (values between 60 and 202), comes fourth.
According to Fig. 6 and Fig. 7, Oldpeak (ST depression induced by exercise relative to rest) ranked last among the top selected features. The Oldpeak attribute was also ranked last among the significant features selected by the study of 9. Stress or ST depression has a negative effect on a person's heart health, which can lead to high blood pressure, arterial damage, irregular heart rhythms, and a weakened immune system 35.

Figure 8. Features subset selected for the third dataset (Heart Disease)
Comparing the selected features of the first dataset and the Heart Disease dataset, shown in Fig. 6 and Fig. 8 above, some of the optimal selected features are the same, since both datasets have the same features with different sample sizes. As observed in Fig. 8 above, chest pain (cp) is the most important factor in diagnosing patients with heart disease. As reported by 35, chest pain is considered one of the most common symptoms of a heart attack.
Figure 9 shows the results of applying the suggested method to the Medical dataset. From the initial nine features, the method identified troponin, kcm, age, glucose, and high pressure as the most significant ones. The troponin feature (troponin test) measures the level in the bloodstream of troponin, a type of protein primarily found in the heart muscles. Under normal circumstances, troponin is not typically detected in the blood. However, when there is damage to the heart muscles, troponin is released into the bloodstream. The extent of heart damage corresponds to the amount of troponin released. As per the findings of 37, troponin emerged as an independent predictor of cardiovascular-related mortality, myocardial infarction, or stroke in patients with both type 2 diabetes and stable ischemic heart disease. Additionally, the study in 38 demonstrated experimentally that troponin was one of the predictor variables associated with heart disease, in alignment with clinical practice and existing literature.

Machine Learning Algorithms
To assess the effectiveness of the features extracted through the suggested technique, diverse classifier models were constructed. These models encompass Support Vector Machine (SVM), k-Neighbors Classifier (KNN), Extra Tree (EX), Naïve Bayes (NB), Linear Discriminant Analysis (LDA), Multilayer Perceptron (MLP), and Logistic Regression (LR). These classifier algorithms have been used in several studies to diagnose patients with heart disease [39][40][41]. The evaluation process employed 10-fold cross-validation, where each classifier was executed 10 times to ensure comprehensive coverage of all sections within the split dataset, as outlined by 42. The performance of each classifier is then compared across two settings: the best features selected using the proposed method and the full feature set. To implement the classification models, the following parameters are used:
For SVM:
• kernel: specifies the type of kernel to be employed in the algorithm. A variety of kernel types exist, including, but not limited to, "linear", "poly", "rbf", "sigmoid", and "precomputed". For the purposes of this study, the linear kernel has been selected.
For KNN:
• n_neighbors: the number of neighbors to use for k-neighbors queries. In this experiment, a value of 7 is used.
For NB:
• There are different types of NB; in this study, Gaussian NB is used with default parameters.
For MLP:
• hidden_layer_sizes: the number of neurons in the ith hidden layer. In this study, a value of 10 is used.
For the other classifier algorithms, the default parameters are used.
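The parameter settings listed above could be expressed in scikit-learn roughly as follows. This is a sketch, not the authors' script: the data is synthetic, `MinMaxScaler` is an assumed choice for the normalization step (the paper does not name one), and `max_iter=1000` and `random_state=0` are added only to keep the example stable and reproducible.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic stand-in for one of the heart disease datasets.
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=5, random_state=0
)

# Parameters named in the text: linear kernel, 7 neighbours, Gaussian NB,
# one hidden layer of 10 neurons. max_iter and random_state are assumptions.
models = {
    "SVM": SVC(kernel="linear"),
    "KNN": KNeighborsClassifier(n_neighbors=7),
    "EX": ExtraTreesClassifier(random_state=0),
    "NB": GaussianNB(),
    "LDA": LinearDiscriminantAnalysis(),
    "MLP": MLPClassifier(hidden_layer_sizes=(10,), max_iter=1000, random_state=0),
    "LR": LogisticRegression(max_iter=1000),
}

scoring = ["accuracy", "precision", "recall"]
results = {}
for name, model in models.items():
    # Scaling inside the pipeline is refit on each training fold.
    pipeline = make_pipeline(MinMaxScaler(), model)
    # cv=10 uses stratified 10-fold splitting for classifiers.
    scores = cross_validate(pipeline, X, y, cv=10, scoring=scoring)
    results[name] = {m: scores[f"test_{m}"].mean() for m in scoring}

for name, metrics in results.items():
    print(name, {m: round(v, 3) for m, v in metrics.items()})
```

Running the same loop once on the full feature matrix and once on the reduced feature subset gives the two sets of scores that the comparison relies on.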

Evaluation Metric
To evaluate the performance of the ML models, the following evaluation criteria are used:
• Accuracy: The success rate of a classification model, measured as the ratio of correct predictions (true positives and true negatives) to the total number of predictions.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (1)$$

A true positive (TP) in this study is a patient with heart disease who is correctly classified. A false positive (FP) is a patient without heart disease who is wrongly classified as a patient with heart disease. A true negative (TN) is a patient without heart disease who is correctly classified. A false negative (FN) is a patient with heart disease who is wrongly classified as a patient without heart disease.
• Precision: The proportion of correctly identified positive instances relative to all instances that the classifier has identified as positive. It evaluates the accuracy of a classifier's positive predictions; a high precision score indicates that the classifier has a low rate of false positives.

$$\text{Precision} = \frac{TP}{TP + FP} \quad (2)$$

• Recall: The ratio of true positive instances to the total number of actual positive instances, that is, the proportion of positive cases that are correctly identified by the classifier out of all the instances that are actually positive. This metric is particularly important in domains where false negatives can have serious consequences, so it should be evaluated alongside other performance metrics, such as precision and F1 score, to obtain a comprehensive understanding of a classifier's performance.

$$\text{Recall} = \frac{TP}{TP + FN} \quad (3)$$

All the performance metrics are averaged, since the experiment was carried out using the 10-fold cross-validation method.
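As a worked illustration of Equations 1 to 3 and of the averaging over folds, the short sketch below computes the three metrics from confusion-matrix counts. The counts are hypothetical and are not results from this study; the helper name `classification_metrics` is likewise introduced only for this example.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Guard against division by zero when a fold has no positive predictions
    # or no actual positives.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts for two test folds (illustrative only).
folds = [
    {"tp": 40, "tn": 45, "fp": 5, "fn": 10},
    {"tp": 42, "tn": 43, "fp": 7, "fn": 8},
]

per_fold = [classification_metrics(**counts) for counts in folds]

# Average each metric over the folds, as done for cross-validation results.
mean_accuracy = sum(m[0] for m in per_fold) / len(per_fold)
mean_precision = sum(m[1] for m in per_fold) / len(per_fold)
mean_recall = sum(m[2] for m in per_fold) / len(per_fold)

print(mean_accuracy, mean_precision, mean_recall)
```

For the first hypothetical fold, accuracy is 85/100, precision is 40/45, and recall is 40/50, which follows directly from the three formulas.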

Results and Discussion
The Classification Results
This section provides a detailed explanation of the experiment outcomes. The tables below present the performance results of the machine learning algorithms tested on the final feature subsets obtained through the proposed extra tree feature selection approach. The tables also display the performance results of the tested ML models on the datasets with the full feature sets, in order to evaluate the effect of the proposed feature selection method on classification performance. The classification performance results for the Heart-disease dataset can be found in Table 3 below.
According to the data in Table 3, the MLP classifier performed best when utilizing the proposed feature selection method, achieving 85% for each of precision, recall, and accuracy. This represents a significant improvement compared to not using the feature selection technique, where all three metrics were at 74%. The KNN classifier also showed enhanced results with the feature selection approach, reaching 83% for each of precision, recall, and accuracy, compared to 67% with the full feature set. These findings demonstrate the effectiveness of feature selection algorithms in accurately classifying heart disease with fewer, more relevant features, ultimately improving the overall classification performance. Nonetheless, the other classifiers demonstrated comparable or near-comparable classification performance with the chosen features in comparison to the complete feature set of the datasets employed. These results confirm that utilizing a reduced number of crucial features is preferable to utilizing all features in order to minimize complexity overhead.
The analysis indicates that, even with a limited number of features, machine learning models outperform models that use the entire feature set, underscoring how this feature selection approach enhances prediction model accuracy while reducing the dimension of the feature space. However, the other classifiers showed similar or nearly identical classification performance when using the selected features in comparison to the full feature set of the datasets. These outcomes validate the preference for using a smaller set of essential features to reduce complexity and computational overhead.
Table 5 shows the results of the experiment on the Heart Disease dataset. The KNN classifier displayed superior performance in precision, recall, and accuracy, achieving 81% for each metric when utilizing the reduced set of features (chest pain (cp), the number of vessels colored by fluoroscopy (CA), Thal, thalach, and exang) recommended by our method. In contrast, when using the full set of features, the KNN classifier exhibited the lowest performance metrics, with precision, recall, and accuracy all at 72%. These results suggest that the proposed feature selection approach effectively enhances disease prediction accuracy.
However, the NB classifier produced performance metrics of 79% when using the selected features, compared with 82% when using all features. Additionally, the EX and MLP classifiers produced lower metric values, 97% and 81% respectively, when using the reduced feature set, compared to 100% and 83% respectively when using all features. These results indicate that using fewer discriminative features achieves similar or close classification performance compared to utilizing all features. In conclusion, reducing the feature size by using fewer features is a better option than using all features in order to minimize complexity overhead costs [43][44][45].
Table 6 below displays the performance metrics obtained from the experiment conducted on the Medical dataset for both the selected features and the full features. The EX and LR classifiers produced better results when using the reduced features than the corresponding classifiers that utilized all features. For instance, the EX classifier recorded the highest precision, recall, and accuracy scores among all classifiers, 97% for each, when using the selected features. The LR classifier also achieved better precision, recall, and accuracy scores of 81%, 80%, and 80% respectively when using the selected features as compared to the full features. Furthermore, feature selection improved each metric for the MLP classifier, increasing it from 73% to 75% with the reduced features. The results indicate that using a smaller number of significant features is useful in classification, resulting in improved results and reduced computational cost overhead 5,43,45. In conclusion, some classifiers show better results when using all features, but the difference is negligible compared to using the selected features 1,18,45.

Study Comparison with Earlier Works
This study conducted a comparative analysis of various feature selection methods employed in predicting heart disease.

Discussion
After analysing the tables presented above, it is clear that selecting fewer, more important features can improve classification performance. For example, the accuracy rates for the Heart-disease and Heart-disease prediction datasets increased from 0.67 to 0.83 and 0.76, respectively.
Table 3 shows that the MLP classifier achieved the highest accuracy rate, 0.85, with the reduced features on the Heart-disease dataset. Moreover, among the classifiers tested on the Medical dataset, the Extra Tree classifier trained with the reduced feature set exhibited the highest accuracy (97%), surpassing the alternative approaches.
In conclusion, the experimental findings suggest that using only relevant features with specific classifier algorithms can enhance machine learning model performance [46][47][48] . Across all datasets, this approach yields results comparable or nearly identical to those of models that use the full feature set. The study therefore recommends using a reduced number of features. The results emphasize the impact of feature selection techniques in decreasing the size of the feature space, improving model performance, and lowering complexity overhead cost. The analysis indicates that, even with a limited number of features, machine learning models can display superior performance compared with models that use the full feature set.

Conclusion and Future Work
The growth of massive clinical data has created a significant challenge for disease prediction because of the huge volume of data and associated features. Some of these features are redundant or irrelevant: they provide no significant information for the predictive task and can degrade the accuracy of disease prediction. The principal aim of this paper is therefore to investigate the impact of feature selection on classification performance when the feature space is reduced. To this end, an Extra Tree feature selection-based approach is proposed to identify the most crucial features, those most closely related to the class target, and so enhance prediction accuracy. The investigation used sets of features obtained from commonly used heart disease datasets accessible via the Google dataset tool. Experiments were undertaken to investigate the impact of the proposed feature selection approach on predicting whether or not a patient has heart disease, and they showed evident enhancements in classification performance. However, some classifiers achieved results with the reduced features that were only similar or close to those obtained with the full features. It is therefore recommended that future work use statistical tests to identify a suitable classifier that achieves better results in predicting heart disease with the optimal selected features. In addition, other feature selection approaches for extracting the most important features in predicting heart disease could be explored in future work.
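The approach summarized above, ranking features with an Extra Trees ensemble and then comparing a classifier on the reduced and full feature sets, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes scikit-learn is available, uses synthetic data from `make_classification` in place of the heart disease datasets, and the subset size `k = 6`, the number of trees, and the helper name `holdout_accuracy` are arbitrary illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a heart disease table (assumption: not the paper's data).
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=5, n_redundant=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Rank features by impurity-based importance from an Extra Trees ensemble.
ranker = ExtraTreesClassifier(n_estimators=200, random_state=0)
ranker.fit(X_train, y_train)
importances = ranker.feature_importances_

# Keep the k highest-ranked feature indices (k is an illustrative choice).
k = 6
ranked = sorted(range(len(importances)), key=lambda i: importances[i], reverse=True)
selected = ranked[:k]


def holdout_accuracy(columns):
    """Train an Extra Trees classifier on the given columns and score it."""
    model = ExtraTreesClassifier(n_estimators=200, random_state=0)
    model.fit(X_train[:, columns], y_train)
    return accuracy_score(y_test, model.predict(X_test[:, columns]))


# Scenario 1: all features. Scenario 2: selected features only.
acc_full = holdout_accuracy(list(range(X.shape[1])))
acc_reduced = holdout_accuracy(selected)
print(f"selected feature indices: {selected}")
print(f"accuracy with all features:      {acc_full:.3f}")
print(f"accuracy with selected features: {acc_reduced:.3f}")
```

Selecting the features on the training split only, as done here, keeps the held-out accuracy from being influenced by the selection step; whether the original experiments followed this protocol is not stated in this section.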

Figure 2. Some samples of Heart Disease dataset
Figure 3.

Figure 4. Some samples of Heart Disease Dataset

Figure 5. Some samples of Medical dataset

Figure 9. Feature subset selected for fourth dataset (Medical dataset)

Overall, it is concluded from the experiment that chest pain (cp) and the number of vessels colored by fluoroscopy (CA) are the most representative features in predicting heart disease 9 .

For SVM:
https://doi.org/10.21123/bsj.2024.9711 P-ISSN: 2078-8665 - E-ISSN: 2411-7986 Baghdad Science Journal
The experiments were carried out in two scenarios, one with feature selection and the other without feature selection. Four datasets, varying in sample size and number of features, were employed for these experiments. The analysis encompassed seven classifier algorithms: SVM, KNN, EX, NB, LDA, MLP, and LR. The models were evaluated on precision, recall, and accuracy. The Extra Tree model outperformed all other models when using the selected features, achieving an accuracy rate of 97% on the Medical dataset. The experimental outcomes underscore the promise of employing an Extra Tree feature selection-based technique to extract the most distinctive features that help in diagnosing patients
used χ2 statistical optimum feature selection technique to get the most significant

Table 3. Performance of the classification models on the Heart-disease dataset with the selected features and with all features.
Additionally, the MLP classifier achieved more favourable results with the reduced feature set, reaching 83% compared with 77% with the full set of features. These results unequivocally demonstrate