Abstract

Web phishing attacks have evolved continually over the past few years, causing customers to lose trust in online services and e-commerce. A variety of methods and systems identify phishing data by relying on a blacklist of known phishing websites. However, the rapid development of technology has given rise to increasingly complex techniques for creating websites that lure users. As a result, current blacklist-based techniques are unable to identify recently launched phishing data, such as zero-day phishing websites. Several recent studies have used machine learning techniques to detect phishing data and to act as an early warning system for such attacks. However, the majority of these methods select the most significant website features based on frequency analysis or human experience. To improve the classification of phishing data, this research proposes the hybrid MI-AN method for intelligent phishing data classification and evaluates it on three phishing datasets. We used six machine learning models with the proposed MI-AN feature selection method. The experimental results indicate that the proposed MI-AN method achieves significant improvements in classification accuracy. Compared with the other models, the multi-layer perceptron achieves higher accuracy (97.07%) on dataset 1, while the random forest model achieves higher accuracy on datasets 2 and 3 (96.60% and 97.78%, respectively). We also compared the proposed method with a previously published research work, and our method achieves better performance. Overall, the proposed MI-AN method consistently enhanced model performance across all phishing datasets, demonstrating robustness, adaptability, and effectiveness in eliminating irrelevant features to achieve better generalization.
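
As a rough illustration of what a hybrid of mutual information (MI) and ANOVA feature scoring could look like, the following sketch scores each feature with both criteria and keeps the features with the best combined rank. The abstract does not state how MI-AN combines the two scores, so the rank-sum rule used here is an assumption, as are the function names (`mutual_information`, `anova_f`, `select_features`) and the toy data. It should not be read as the authors' implementation.

```python
import math
from collections import Counter


def mutual_information(feature, labels):
    """Mutual information (in nats) between a discrete feature and the labels."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    px, py = Counter(feature), Counter(labels)
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def anova_f(feature, labels):
    """One-way ANOVA F-statistic of a numeric feature grouped by class label.

    Assumes at least two classes and more samples than classes.
    """
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(y, []).append(x)
    n, k = len(feature), len(groups)
    grand_mean = sum(feature) / n
    means = {y: sum(g) / len(g) for y, g in groups.items()}
    ss_between = sum(len(g) * (means[y] - grand_mean) ** 2 for y, g in groups.items())
    ss_within = sum((x - means[y]) ** 2 for y, g in groups.items() for x in g)
    if ss_within == 0:
        # Feature is constant within every class: perfectly separating or constant.
        return math.inf if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def ranks(scores):
    """Return the rank of each score; rank 0 is the highest score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0] * len(scores)
    for position, index in enumerate(order):
        result[index] = position
    return result


def select_features(columns, labels, k):
    """Keep the k features with the lowest MI rank + ANOVA rank (assumed rule)."""
    mi_ranks = ranks([mutual_information(c, labels) for c in columns])
    f_ranks = ranks([anova_f(c, labels) for c in columns])
    combined = [a + b for a, b in zip(mi_ranks, f_ranks)]
    best = sorted(range(len(columns)), key=lambda i: combined[i])[:k]
    return sorted(best)


# Toy data (hypothetical): three ternary-style features, binary phishing label.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = [
    [1, 1, 1, 1, -1, -1, -1, -1],   # perfectly informative
    [1, -1, 1, -1, 1, -1, 1, -1],   # independent of the label
    [1, 1, 1, -1, -1, -1, -1, 1],   # partially informative
]
selected = select_features(columns, labels, k=2)
print(selected)  # [0, 2]
```

The selected column indices would then be used to reduce the dataset before training a classifier such as a multi-layer perceptron or a random forest. The MI estimate here treats every feature as discrete, which is an additional simplifying assumption.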

Keywords

ANOVA feature selection, Hybrid feature selection, Machine learning, Mutual information, Phishing datasets

Subject Area

Mathematics

Article Type

Article

First Page

1920

Last Page

1940

Creative Commons License

Creative Commons Attribution 4.0 International License
This work is licensed under a Creative Commons Attribution 4.0 International License.
