Sentiment Analysis on Roman Urdu Students’ Feedback Using Enhanced Word Embedding Technique
Abstract
Students’ feedback is crucial for educational institutions to assess the performance of their teachers. Most opinions are expressed in the students’ native language, especially in South Asian regions. In Pakistan, people use Roman Urdu to express their reviews, and this extends to the education domain, where students use Roman Urdu to give feedback. Handling qualitative opinions manually is a very time-consuming and labor-intensive process. Additionally, it can be difficult to determine sentence semantics in text written in a colloquial style such as Roman Urdu. This study proposes an enhanced word embedding technique and investigates neural word embeddings (Word2Vec and GloVe) to determine which performs better for Roman Urdu sentiment analysis. The proposed model employs a BiLSTM network to maintain context in both directions, and the results for ternary classification are obtained from a final softmax output layer. The model was evaluated on a manually labeled dataset collected from higher education institutions (HEIs) in Pakistan. It was empirically evaluated on two Roman Urdu datasets: the newly developed students’ feedback dataset and RUSA-19, a publicly available Roman Urdu dataset. The model performs effectively using the word embedding and BiLSTM layers. The proposed model is compared with the baseline models CNN, RNN, GRU, and classic LSTM. The experimental findings demonstrate the proposed model's efficacy, with an F1 score of 90%.
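As an illustration of the last two steps described in the abstract (a softmax output over three classes, then scoring with F1), the following is a minimal sketch and not the authors' implementation. The label names, the logits, and the choice of macro-averaged F1 are assumptions made for this example; the abstract does not say which F1 averaging was used, and the BiLSTM that would produce the logits is omitted.

```python
import math

# Assumed label names for the ternary task (not given in the abstract).
LABELS = ["negative", "neutral", "positive"]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return the label with the highest softmax probability."""
    probs = softmax(logits)
    return LABELS[probs.index(max(probs))]


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 scores (macro averaging, assumed)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up logits standing in for the output of the final dense layer.
logits_batch = [
    [2.0, 0.1, -1.0],
    [0.2, 1.5, 0.3],
    [-0.5, 0.0, 2.2],
    [0.1, 0.2, 1.0],
]
y_true = ["negative", "neutral", "positive", "neutral"]
y_pred = [predict(logits) for logits in logits_batch]
score = macro_f1(y_true, y_pred)
print(y_pred, round(score, 4))
```

In this toy batch one neutral item is predicted as positive, which lowers the F1 of both the neutral and the positive class; macro averaging then weights all three classes equally regardless of how many examples each has.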
Received 02/10/2023
Revised 10/02/2024
Accepted 12/02/2024
Published 25/02/2024
This work is licensed under a Creative Commons Attribution 4.0 International License.