Fake News Detection Model Basing on Machine Learning Algorithms

DOI:

https://doi.org/10.21123/bsj.2024.8710

Keywords:

Classification, Decision Tree, Gradient Boosting, Logistic Regression, Random Forest

Abstract

The rapid growth of the internet and the ease of communication have made it quick and simple to create and spread news. Social media users now generate and share more information than ever before, but some of it is false and unrelated to reality. Detecting false information in text is challenging, even for experts, who must weigh multiple factors to determine authenticity. Malicious misinformation on social media harms societies, especially during crises such as terrorist attacks, riots, and natural disasters. To minimize this harm, it is crucial to identify rumors quickly. This study aims to build a learning model for detecting fake news. The approach analyzes the characteristics of the text, converts the words into features using TF-IDF, and then identifies the highest-ranking features in order to distinguish real news from fake news using machine learning techniques. Finally, the Logistic Regression, Decision Tree, Gradient Boosting, and Random Forest algorithms are applied. The accuracy is 0.985 for Logistic Regression, 0.989 for Random Forest, 0.994 for Decision Tree, and 0.9949 for Gradient Boosting.
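The feature-extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which TF-IDF variant or ranking criterion the authors used, so the code below assumes the plain form (term frequency times log(N / document frequency)) and ranks terms by their summed weight across documents. The function names, the whitespace tokenizer, and the toy headlines are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real preprocessing)."""
    return text.lower().split()


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    A term that occurs in every document therefore gets weight 0.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


def top_features(vectors, k):
    """Rank terms by summed TF-IDF weight over all documents (assumed criterion)."""
    totals = Counter()
    for vec in vectors:
        for term, weight in vec.items():
            totals[term] += weight
    return [term for term, _ in totals.most_common(k)]


if __name__ == "__main__":
    # Hypothetical toy headlines, for illustration only.
    headlines = [
        "shocking miracle cure doctors hate",
        "government confirms annual budget report",
        "shocking secret government cover up",
    ]
    weights = tfidf(headlines)
    print(top_features(weights, 5))
```

In a full pipeline, the selected TF-IDF features would be passed to the four classifiers named in the abstract. Library implementations such as scikit-learn's `TfidfVectorizer`, `LogisticRegression`, `DecisionTreeClassifier`, `GradientBoostingClassifier`, and `RandomForestClassifier` are one possible choice, though the abstract does not say which tooling was used.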

Section

article

How to Cite

1.
Fake News Detection Model Basing on Machine Learning Algorithms. Baghdad Sci.J [Internet]. [cited 2024 Apr. 30];21(8). Available from: https://bsj.uobaghdad.edu.iq/index.php/BSJ/article/view/8710