Using Graph Mining Method in Analyzing Turkish Loanwords Derived from Arabic Language
Keywords: Arabic language, Data mining, Graph mining, Loanwords, Turkish language
Loanwords are words transferred from one language to another that become an essential part of the borrowing language. Words pass from the source language to the recipient language for many reasons. Detecting loanwords is a complicated task because there are no standard specifications for transferring words between languages, and detection accuracy is therefore low. This work aims to improve the accuracy of loanword detection, taking Turkish and Arabic as a case study. The proposed system finds all possible loanwords from any set of characters, whether arranged alphabetically or randomly. It then handles distortions in pronunciation and addresses the problem of letters that are missing in Turkish relative to Arabic. A graph mining technique is introduced to identify Turkish loanwords from Arabic, and it is used here for the first time for this purpose. The differences between the letters of the two languages are resolved by using a reference language (English) to unify the writing style. The proposed system was tested on 1256 manually annotated words. The results show an F-measure of 0.99, which is a high value for such a system. Together, these contributions reduce the time and effort needed to identify loanwords efficiently and accurately. Moreover, researchers do not need knowledge of the recipient or source language. In addition, the method can be generalized to any two languages by following the same steps used to obtain Turkish loanwords from Arabic.
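For readers unfamiliar with the reported metric, the F-measure is commonly computed as the harmonic mean of precision and recall. The sketch below assumes the balanced (F1) form; the abstract does not state which variant was used, and the function name and the counts are hypothetical, chosen only for illustration. They are not the paper's results.

```python
def f_measure(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the balanced F-measure (F1): harmonic mean of precision and recall."""
    if true_positives == 0:
        # No correct detections: precision and recall are both zero.
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not taken from the paper).
score = f_measure(true_positives=90, false_positives=10, false_negatives=10)
print(round(score, 2))
```

A high F-measure requires both few false detections (high precision) and few missed loanwords (high recall), so it is a stricter summary than accuracy alone when loanwords are a minority of the tested words.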
Published Online First 20/5/2022
Copyright (c) 2022 Baghdad Science Journal
This work is licensed under a Creative Commons Attribution 4.0 International License.