Enhancing Fuzzy C-Means Clustering with a Novel Standard Deviation Weighted Distance Measure
DOI:
https://doi.org/10.21123/bsj.2024.9516
Keywords:
Cluster, Distance measures, FCM, Fuzzy logic, Hybrid algorithm
Abstract
This paper presents a new approach to the Fuzzy C-Means (FCM) algorithm, one of the most important and best-known algorithms for handling uncertainty when forming clusters according to overlap ratios. One of the main problems with this algorithm is that it relies primarily on the Euclidean distance measure, which forces the resulting clusters into a spherical shape that cannot capture complex or overlapping cases. This paper therefore proposes a new distance measure: a formula for the variance of the fuzzy cluster is derived and used as a weight in the Euclidean distance formula, giving the Weighted Euclidean Distance (WED). In addition, the partition matrix is computed with the help of the K-Means algorithm, creating a hybrid environment between the fuzzy algorithm and the crisp (hard) algorithm. To verify the proposal, an experimental simulation was carried out, and the method was then applied to real environmental data from the physical and chemical examination of water at testing stations in Basra Governorate. The experimental results showed that the proposed Weighted Euclidean Distance improved the performance of the HFCM algorithm on the criteria (Obj_Fun, Iteration, Min_optimization, good fit clustering and overlap) when c = 2, 3. Based on the simulation results, c = 2 was chosen to form groups for the real data, which gave the best objective function values (23.93, 22.44, 18.83) at degrees of fuzziness (1.2, 2, 2.8). At degree of fuzziness m = 3.6, the objective function for Euclidean Distance (ED) was the lowest, but the criteria were (Iter. = 2, Min_optimization = 0 and ), which confirms that WED is the best.
Received 18/09/2023
Revised 24/11/2023
Accepted 26/11/2023
Published Online First 20/02/2024
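The idea summarised in the abstract, a K-Means start followed by fuzzy updates under a variance-weighted Euclidean distance, can be illustrated with a minimal sketch. The abstract does not give the paper's derived variance formula, so the membership-weighted per-feature variance used below is an assumption, as are all function names (`init_membership`, `fuzzy_variances`, `weighted_sq_dist`, `fcm_weighted`) and default parameters. It should be read as one plausible instance of the technique, not as the authors' method.

```python
import numpy as np

def init_membership(X, c, rng, n_kmeans_iter=10):
    """Crisp start: a few Lloyd (K-Means) steps, then soften to a fuzzy matrix."""
    centers = X[rng.choice(len(X), size=c, replace=False)]
    for _ in range(n_kmeans_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for k in range(c):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    # Assumed softening: 0.9 for the K-Means cluster, the rest shared equally.
    U = np.full((len(X), c), 0.1 / (c - 1))
    U[np.arange(len(X)), labels] = 0.9
    return U

def fuzzy_variances(X, U, centers, m):
    """Membership-weighted per-feature variance of each cluster (assumed form)."""
    W = U ** m                                           # shape (n, c)
    diff2 = (X[:, None, :] - centers[None, :, :]) ** 2   # shape (n, c, d)
    var = (W[:, :, None] * diff2).sum(axis=0) / W.sum(axis=0)[:, None]
    return np.maximum(var, 1e-12)                        # avoid division by zero

def weighted_sq_dist(X, centers, var):
    """Squared Euclidean distance with each feature scaled by 1 / variance."""
    diff2 = (X[:, None, :] - centers[None, :, :]) ** 2
    return (diff2 / var[None, :, :]).sum(axis=2)         # shape (n, c)

def fcm_weighted(X, c=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Fuzzy C-Means loop using the variance-weighted distance above (c >= 2, m > 1)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    U = init_membership(X, c, rng)
    for it in range(1, n_iter + 1):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]     # standard FCM centre update
        var = fuzzy_variances(X, U, centers, m)
        d2 = np.maximum(weighted_sq_dist(X, centers, var), 1e-12)
        inv = d2 ** (-1.0 / (m - 1.0))                   # standard FCM membership update
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    # Objective sum(u^m * d^2) from the final memberships and last distances.
    objective = float(((U ** m) * d2).sum())
    return U, centers, objective, it
```

Calling `fcm_weighted(X, c=2, m=2.0)` on an array of shape `(n_samples, n_features)` returns the membership matrix, the cluster centres, the objective value and the number of iterations used, which correspond loosely to the Obj_Fun and Iteration criteria named in the abstract. Setting every variance to 1 recovers the ordinary Euclidean distance for comparison.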
License
Copyright (c) 2024 Baghdad Science Journal
This work is licensed under a Creative Commons Attribution 4.0 International License.