A Modified Support Vector Machine Classifiers Using Stochastic Gradient Descent with Application to Leukemia Cancer Type Dataset


Ghadeer Jasim Mahdi


Support vector machines (SVMs) are supervised learning models that analyze data for classification or regression. In classification, an SVM selects an optimal hyperplane that separates two classes. SVMs are highly accurate and extremely robust compared with other classification methods such as logistic regression, random forest, k-nearest neighbor, and the naïve model. However, working with large datasets can cause problems such as long run times and inefficient results. In this paper, the SVM is modified by using a stochastic gradient descent process. The modified method, stochastic gradient descent SVM (SGD-SVM), is checked using two simulated datasets. Since the classification of different cancer types is important for cancer diagnosis and drug discovery, SGD-SVM is applied to classify the most common leukemia cancer type dataset. The results obtained using SGD-SVM are much more accurate than the results of many other studies that used the same leukemia dataset.
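
The abstract does not give the SGD-SVM update rule, so the following is only a minimal sketch of one common way to train an SVM by stochastic gradient descent, not the paper's method. It assumes a linear kernel, the L2-regularised hinge loss, a Pegasos-style step size of 1/(lam * t), and no bias term (so the data are assumed to be roughly centred). The names `sgd_svm_train`, `predict`, and `lam` are hypothetical and chosen for illustration.

```python
import random


def sgd_svm_train(X, y, lam=0.01, epochs=50, seed=0):
    """Fit a linear SVM without bias by SGD on the regularised hinge loss.

    X is a list of feature lists; y holds labels in {-1, +1}.
    Illustrative sketch only: assumptions are listed in the lead-in.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)  # visit one random sample at a time
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size (assumed schedule)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Gradient step on the regulariser shrinks the weights.
            shrink = 1.0 - eta * lam
            w = [shrink * wj for wj in w]
            # Hinge-loss subgradient is non-zero only inside the margin.
            if margin < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def predict(w, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    score = sum(wj * xj for wj, xj in zip(w, x))
    return 1 if score >= 0 else -1


# Toy usage on two well-separated point clouds (made-up data).
toy_X = [[2.0, 2.0], [3.0, 1.0], [-2.0, -2.0], [-1.0, -3.0]]
toy_y = [1, 1, -1, -1]
toy_w = sgd_svm_train(toy_X, toy_y)
toy_predictions = [predict(toy_w, x) for x in toy_X]
```

Each update touches a single sample, so the cost per step does not grow with the size of the dataset. This is the usual motivation for pairing SVMs with stochastic gradient descent on large data.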




How to Cite
Mahdi GJ. A Modified Support Vector Machine Classifiers Using Stochastic Gradient Descent with Application to Leukemia Cancer Type Dataset. Baghdad Sci. J. [Internet]. 2020 Dec 1 [cited 2021 Jan 25];17(4):1255. Available from: https://bsj.uobaghdad.edu.iq/index.php/BSJ/article/view/4283

