A Systematic Review on Sentiment Analysis for Sindhi Text

Authors

  • Safdar Ali Soomro, Razak Faculty of Technology and Informatics, Universiti Teknologi Malaysia, Kuala Lumpur, Malaysia. https://orcid.org/0009-0002-7616-2308
  • Siti Sophiayati Yuhaniz, Razak Faculty of Technology and Informatics, Universiti Teknologi Malaysia, Kuala Lumpur, Malaysia.
  • Mazhar Ali Dootio, Department of Computer Science, Benazir Bhutto Shaheed University, Karachi, Pakistan.
  • Ghulam Murtaza, Department of Computer Science, Sukkur IBA University, Sukkur, Pakistan.
  • Muhammad Hussain Mughal, Department of Computer Science, Sukkur IBA University, Sukkur, Pakistan. https://orcid.org/0000-0002-2035-7205

DOI:

https://doi.org/10.21123/bsj.2024.10954

Keywords:

Natural language processing, Sentiment analysis, Sindhi text corpus, Sindhi text, Systematic review, Text preprocessing.

Abstract

Sentiment analysis has grown significantly in recent years owing to its applications in domains such as news headlines, online product purchases, marketing, and reputation management. With the rise of social media and online shopping platforms, a wealth of user-generated data is now available, and manufacturing, sales, and marketing companies increasingly seek global feedback on their practices and products from these sources. For the Sindhi language, millions of phrases are shared daily on news media sites, Twitter, Facebook, and other platforms. However, sentiment analysis research has focused primarily on resource-rich English, and the lack of sentiment analysis for Sindhi limits the use of this vast amount of data. This systematic review collects and evaluates published research on Sindhi sentiment analysis, focusing specifically on pre-processing, feature extraction, and classification methods. The study offers a comprehensive analysis of research conducted on Sindhi text for product evaluation, covering corpus acquisition, data preprocessing, feature extraction, classification techniques, methodologies, limitations, and future directions. Each reviewed article is assessed and classified against specified criteria. The findings provide valuable insights and propose several approaches for future investigation in this area.
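The three stages named in the abstract (pre-processing, feature extraction, classification) can be illustrated with a minimal sketch. It is not taken from any of the reviewed studies: the function names, the bag-of-words features, the Naive Bayes classifier, and the toy English training sentences are all assumptions chosen for brevity, and a real study would substitute labeled Sindhi text and the methods it actually evaluates.

```python
import math
import re
import unicodedata
from collections import Counter, defaultdict


def preprocess(text):
    """Pre-processing: drop combining marks (e.g. Arabic-script diacritics), then tokenize."""
    stripped = "".join(ch for ch in text if unicodedata.category(ch) != "Mn")
    return re.findall(r"\w+", stripped.lower())


def extract_features(tokens):
    """Feature extraction: a simple bag-of-words mapping of token -> count."""
    return Counter(tokens)


class NaiveBayes:
    """Classification: multinomial Naive Bayes with add-one smoothing (illustrative choice)."""

    def fit(self, docs, labels):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.label_counts = Counter(labels)      # label -> number of documents
        self.vocab = set()
        for doc, label in zip(docs, labels):
            feats = extract_features(preprocess(doc))
            self.word_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, doc):
        feats = extract_features(preprocess(doc))
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # log prior
            for word, count in feats.items():
                # Smoothed log likelihood of each token under this label
                p = (self.word_counts[label][word] + 1) / (total_words + len(self.vocab))
                score += count * math.log(p)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data: English stand-ins for labeled Sindhi sentences.
train_docs = [
    "good product great quality",
    "bad product poor quality",
    "great service good price",
    "poor service bad price",
]
train_labels = ["positive", "negative", "positive", "negative"]
model = NaiveBayes().fit(train_docs, train_labels)
```

Calling `model.predict(...)` on a new sentence returns one of the training labels. The diacritic-stripping step is included because Sindhi is commonly written in an Arabic-derived script, where optional vowel marks would otherwise split one word into several distinct tokens.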

Section

article

How to Cite

1. A Systematic Review on Sentiment Analysis for Sindhi Text. Baghdad Sci.J [Internet]. [cited 2025 Jan. 26];22(5). Available from: https://bsj.uobaghdad.edu.iq/index.php/BSJ/article/view/10954