Data Mining Techniques for Iraqi Biochemical Dataset Analysis

Sarah Sameer; Suhad Faisal Behadili

doi:10.21123/bsj.2022.19.2.0385

Published: Apr 1, 2022

Keywords:

Biomedical, Classification And Regression Tree (CART), Data mining, Hierarchical clustering, K-means.

Sarah Sameer

Computer Science Department, College of Science, University of Baghdad, Baghdad, Iraq

Suhad Faisal Behadili

Computer Science Department, College of Science, University of Baghdad, Baghdad, Iraq

Abstract

This research analyzes and simulates real biochemical test data to uncover the relationships among the tests and how each test affects the others. The data were acquired from a private Iraqi biochemical laboratory. These data have many dimensions, a high rate of null values, and a large number of patients. Several experiments were applied to the data, beginning with unsupervised techniques such as hierarchical clustering and k-means, but the results were not clear. A preprocessing step was then performed to make the dataset analyzable by supervised techniques: Linear Discriminant Analysis (LDA), Classification And Regression Tree (CART), Logistic Regression (LR), K-Nearest Neighbor (K-NN), Naïve Bayes (NB), and Support Vector Machine (SVM). Among the six supervised algorithms, CART gave clear results with high accuracy. It is worth noting that preprocessing takes considerable effort for this type of data, since the raw dataset has a null-value ratio of 94.8%, which falls to 0% once the preprocessing steps are complete. To apply the CART algorithm, several selected tests were treated as classes; the tests were chosen as classes according to the accuracy they achieved. Consequently, physicians are able to trace and connect test results with one another, which extends the impact on patients’ health.
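
The null-value ratio quoted in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the preprocessing was done, so the strategy here (drop sparsely filled test columns, then fill remaining gaps with the column median) is an assumption, as are the function names, the `min_filled` threshold, and the sample values, which are invented for illustration and are not from the paper's dataset.

```python
from statistics import median


def null_ratio(rows):
    """Return the fraction of cells that are None in a list of rows."""
    total = sum(len(row) for row in rows)
    missing = sum(1 for row in rows for cell in row if cell is None)
    return missing / total if total else 0.0


def preprocess(rows, min_filled=0.5):
    """Hypothetical preprocessing: keep columns filled in at least
    `min_filled` of the rows, then replace remaining None cells with
    the median of that column."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    kept = []  # (column index, fill value) for each surviving column
    for c in range(n_cols):
        values = [row[c] for row in rows if row[c] is not None]
        if n_rows and len(values) / n_rows >= min_filled:
            kept.append((c, median(values)))
    return [
        [row[c] if row[c] is not None else fill for c, fill in kept]
        for row in rows
    ]


# Invented sample: one row per patient, one column per test.
rows = [
    [90.0, None, None, 1.0],
    [None, None, None, 0.9],
    [110.0, 30.0, None, None],
    [100.0, None, None, 1.1],
]

cleaned = preprocess(rows)
print(f"null ratio before: {null_ratio(rows):.1%}")
print(f"null ratio after:  {null_ratio(cleaned):.1%}")
```

Under these assumptions, the two sparse columns are dropped and the two remaining columns are completed, so the cleaned table has no null cells and could be passed to a supervised classifier.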

Received 13/7/2020

Accepted 19/1/2021

Published Online First 20/9/2021

How to Cite

Data Mining Techniques for Iraqi Biochemical Dataset Analysis. Baghdad Sci.J [Internet]. 2022 Apr. 1 [cited 2025 Mar. 7];19(2):0385. Available from: https://bsj.uobaghdad.edu.iq/index.php/BSJ/article/view/5407

Issue

Vol. 19 No. 2 (2022): Issue 2

Section

article

This work is licensed under a Creative Commons Attribution 4.0 International License.

P-ISSN: 2078-8665 | E-ISSN: 2411-7986

Journal Info
Journal: Baghdad Science Journal
Publisher: College of Science for Women/ University of Baghdad
Baghdad Sci. J. is peer-reviewed and open access
Publishing Frequency: Quarterly (from 2004 - 2021) Bi-monthly (from 2022) Monthly (from 2024)
Launched Date: 2004
Abbreviation: Baghdad Sci.J.
Each published paper in Baghdad Sci. J. has a digital object identifier (DOI) number

© 2022 The Author(s). Published by College of Science for Women, University of Baghdad. This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
