Fake News Detection Model Basing on Machine Learning Algorithms
Abstract
The rapid growth of the internet and the ease of online communication have made it quick and simple to create and spread news. Social media users now generate and share more information than ever before, but some of it is false and unrelated to reality. Detecting false information in text is challenging even for experts, who must weigh multiple factors to judge authenticity. Malicious misinformation on social media harms societies, especially during crises such as terrorist attacks, riots, and natural disasters. To minimize this harm, it is crucial to identify rumors quickly. This study aims to build a learning model for detecting fake news. The approach first extracts and analyzes the characteristics of the text, then converts the words into features using TF-IDF, and then selects the highest-ranking features so that machine learning techniques can distinguish real news from fake. Finally, four algorithms are applied: Logistic Regression, Decision Tree, Gradient Boosting, and Random Forest. The resulting accuracies are 0.985 for Logistic Regression, 0.989 for Random Forest, 0.994 for Decision Tree, and 0.9949 for Gradient Boosting.
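The TF-IDF and feature-ranking steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenizer, the weighting formula (term frequency times ln(N/df)), the ranking by summed weight, and the function names `tokenize`, `tf_idf`, and `top_features` are all assumptions chosen for illustration, and the toy sentences are invented. The classification step is left out.

```python
import math
from collections import Counter

PUNCTUATION = ".,!?;:\"'()"


def tokenize(text):
    """Lowercase, split on whitespace, and strip edge punctuation (a simplification)."""
    stripped = (word.strip(PUNCTUATION).lower() for word in text.split())
    return [word for word in stripped if word]


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed formula (one common variant, not necessarily the paper's):
      tf  = term count / number of tokens in the document
      idf = ln(N / df), where df = number of documents containing the term
    """
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: count each term once per document.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append(
            {term: (count / total) * math.log(n_docs / df[term])
             for term, count in counts.items()}
        )
    return weights


def top_features(weights, k):
    """Rank terms by summed TF-IDF weight over the corpus and keep the top k.

    Ties are broken alphabetically so the result is deterministic.
    """
    totals = Counter()
    for doc_weights in weights:
        totals.update(doc_weights)
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


# Toy corpus, invented for illustration only.
docs = [
    "breaking news shocking cure",
    "news report budget",
    "news shocking secret",
]
weights = tf_idf(docs)
selected = top_features(weights, 3)
```

A term that occurs in every document, such as "news" here, receives a weight of zero under this formula, which is why TF-IDF tends to favor words that separate documents from one another. In a full pipeline, the selected terms would become the columns of the feature matrix passed to each classifier.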
Received 07/03/2023
Revised 14/07/2023
Accepted 16/07/2023
Published Online First 20/01/2024
This work is licensed under a Creative Commons Attribution 4.0 International License.