JASBO: Jaya Average Subtraction Based Optimization with Deep Learning Model for Multi-Classification of Infectious Disease from Unstructured Data

Authors

  • Vian Sabeeh, Informatics Department, Technical College of Management-Baghdad, Middle Technical University, Baghdad, Iraq. https://orcid.org/0000-0002-0860-2335
  • Ahmed Bahaaulddin A. Alwahhab, Informatics Department, Technical College of Management-Baghdad, Middle Technical University, Baghdad, Iraq. https://orcid.org/0000-0003-0965-4812
  • Ali Abdulmunim Ibrahim Al-kharaz, Informatics Department, Technical College of Management-Baghdad, Middle Technical University, Baghdad, Iraq. https://orcid.org/0000-0002-7321-2296

DOI:

https://doi.org/10.21123/bsj.2024.9184

Keywords:

Average and Subtraction-Based Optimizer (ASBO), Bidirectional-LSTM (Bi-LSTM), Convolutional Neural Network (CNN), Infectious Disease Network (ID-Net), Jaya algorithm.

Abstract

Infectious diseases have become an unavoidable and serious problem, and their similar symptomatology makes early detection and clear separation of infections difficult. Hence, a new technique is required that makes the best use of the various symptomatologies present in these illnesses for multi-classification. Medical documents are considered an essential source for modern, novel, and robust analysis methods for accurate infection diagnosis. Accordingly, enriching medical text processing is beneficial in health informatics. This research proposes Jaya Average Subtraction Based Optimization (JASBO), enabled by Deep Learning (DL), to classify infectious diseases into multiple categories from unstructured data. The DL model used is the Infectious Disease Network (ID-Net), which combines a Convolutional Neural Network (CNN) and Bidirectional Long Short-Term Memory (Bi-LSTM). Bi-LSTM is used to identify unusual or discriminative words, and the JASBO algorithm determines the filter size in the final classification network so that it detects the meaningful part of the text. The input text is given to the tokenization layer, where the tokens are formed and forwarded to the CNN. Additionally, character-based network features are extracted using the Bi-LSTM model. The vector representation is then concatenated with the two separate character-level extractions from Bi-LSTM and CNN. The character-level features are passed to the attention layer, which uses the Kumar-Hassebrook similarity measure to calculate the score function. The label of each word token is then predicted by the ID layer, whose size is determined by JASBO. Here, JASBO combines the Jaya algorithm with the Average and Subtraction-Based Optimizer (ASBO). JASBO_ID-Net is evaluated with three performance metrics and achieves its best results of 91% accuracy, 88.7% recall, and a 90% F-measure.
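
Two of the building blocks named in the abstract can be illustrated in a few lines. The sketch below is not the authors' implementation: it shows the Kumar-Hassebrook similarity in its commonly documented form, sum(p·q) / (sum(p²) + sum(q²) − sum(p·q)), and a single update of the standard Jaya algorithm with greedy acceptance. The function names, the sphere objective, the population size, and the iteration count are all assumptions chosen for illustration; the ASBO phase and the way JASBO is wired into ID-Net are not described in the abstract and are not modelled here.

```python
import random


def kumar_hassebrook(p, q):
    """Kumar-Hassebrook similarity: sum(p*q) / (sum(p^2) + sum(q^2) - sum(p*q))."""
    dot = sum(a * b for a, b in zip(p, q))
    denom = sum(a * a for a in p) + sum(b * b for b in q) - dot
    # The denominator is zero only when both vectors are all zeros.
    return dot / denom if denom else 0.0


def jaya_step(population, fitness, rng):
    """One iteration of the standard Jaya update for a minimization problem."""
    scores = [fitness(x) for x in population]
    best = population[scores.index(min(scores))]
    worst = population[scores.index(max(scores))]
    new_population = []
    for x, score in zip(population, scores):
        candidate = []
        for j, xj in enumerate(x):
            r1, r2 = rng.random(), rng.random()
            # Move toward the best solution and away from the worst one.
            candidate.append(xj + r1 * (best[j] - abs(xj)) - r2 * (worst[j] - abs(xj)))
        # Greedy acceptance: keep the candidate only if it improves the fitness.
        new_population.append(candidate if fitness(candidate) < score else x)
    return new_population


def sphere(x):
    """Toy objective (assumption): stands in for a validation loss over a layer size."""
    return sum(v * v for v in x)


rng = random.Random(0)
population = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(10)]
initial_best = min(sphere(x) for x in population)
for _ in range(50):
    population = jaya_step(population, sphere, rng)
final_best = min(sphere(x) for x in population)

print("KH similarity:", kumar_hassebrook([1.0, 2.0], [2.0, 1.0]))
print("best fitness before/after:", initial_best, final_best)
```

Because of the greedy acceptance step, the best fitness in the population can never get worse from one iteration to the next. In a setting like the one the abstract describes, the toy objective would be replaced by a measure of classification quality evaluated for each candidate layer size.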


How to Cite

1.
JASBO: Jaya Average Subtraction Based Optimization with Deep Learning Model for Multi-Classification of Infectious Disease from Unstructured Data. Baghdad Sci.J [Internet]. [cited 2024 Apr. 30];21(10). Available from: https://bsj.uobaghdad.edu.iq/index.php/BSJ/article/view/9184