A Systematic Review on Sentiment Analysis for Sindhi Text
DOI:
https://doi.org/10.21123/bsj.2024.10954
Keywords:
Natural language processing, sentiment analysis, Sindhi text corpus, Sindhi text, systematic review, text preprocessing.
Abstract
The field of sentiment analysis has grown significantly in recent years due to its applications in domains such as news headlines, online product purchases, marketing, and reputation management. With the rise of social media and online shopping platforms, a wealth of user-generated data is now available. This has led manufacturing, sales, and marketing companies to seek global feedback on their practices and products from these sources. In the context of the Sindhi language, millions of phrases are shared daily on news media sites, Twitter, Facebook, and other platforms. However, sentiment analysis research has focused primarily on the resource-rich English language, and the lack of sentiment analysis for Sindhi limits the use of this vast amount of data. This systematic review aims to collect and evaluate published research on Sindhi language sentiment analysis, focusing specifically on pre-processing, feature extraction, and classification methods. The study offers a comprehensive analysis of research conducted on Sindhi text for product evaluation, covering key areas such as relevant corpora acquisition, data preprocessing, feature extraction, classification techniques, methodologies, limitations, and future directions. Each reviewed article is assessed and classified based on specified criteria. The findings of this review provide valuable insights and propose several approaches for future investigations in this area.
Received 15/02/2024
Revised 11/08/2024
Accepted 13/08/2024
Published Online First 20/11/2024
License
Copyright (c) 2024 Safdar Ali Soomro, Siti Sophiayati Yuhaniz, Mazhar Ali Dootio, Ghulam Murtaza, Muhammad Hussain Mughal
This work is licensed under a Creative Commons Attribution 4.0 International License.