Variable Selection Using a Modified Gibbs Sampler Algorithm with Application on Rock Strength Dataset

Ghadeer J. M. Mahdi
https://orcid.org/0000-0003-4870-4034
Othman M. Salih
https://orcid.org/0000-0002-9908-8748

Abstract

Variable selection is an essential task in statistical modeling. Several studies have tried to develop and standardize the variable selection process, but doing so has proved difficult. The first question a researcher needs to ask is which variables are the most significant for describing the response in a given dataset. In this paper, a new method for variable selection using Gibbs sampler techniques is developed. First, the model is defined, and the posterior distributions for all the parameters are derived. The new variable selection method is tested using four simulated datasets. The new approach is compared with some existing techniques: Ordinary Least Squares (OLS), Least Absolute Shrinkage and Selection Operator (Lasso), and Tikhonov Regularization (Ridge). The simulation studies show that our method performs better than the others in terms of error and time complexity. These methods are then applied to a real dataset, the Rock Strength Dataset. The new approach implemented using the Gibbs sampler is more powerful and effective than the other approaches. All the statistical computations for this paper were carried out using R version 4.0.3 on a single-processor computer.
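
To make the general idea concrete, here is a minimal Python sketch of how a Gibbs sampler can drive variable selection. It does not reproduce the modified sampler developed in this paper, whose model and posterior distributions are not stated in the abstract. Instead it assumes a generic spike-and-slab prior (stochastic search variable selection in the style of George and McCulloch), with no intercept and predictors on a comparable scale. The function name ssvs_gibbs, the hyperparameters tau, c, prior_incl, a0 and b0, and the simulated data are all illustrative assumptions, and Python with NumPy is used here rather than R.

```python
import numpy as np


def ssvs_gibbs(X, y, n_iter=1500, burn_in=500, tau=0.05, c=100.0,
               prior_incl=0.5, a0=1.0, b0=1.0, seed=0):
    """Generic spike-and-slab Gibbs sampler (illustrative, not the paper's).

    Prior: beta_j ~ N(0, tau^2) if gamma_j = 0 (spike),
           beta_j ~ N(0, (c*tau)^2) if gamma_j = 1 (slab),
           sigma2 ~ Inverse-Gamma(a0, b0), gamma_j ~ Bernoulli(prior_incl).
    Returns posterior inclusion frequencies and posterior mean coefficients.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    gamma = np.zeros(p, dtype=int)   # start with every variable excluded
    sigma2 = 1.0
    incl_sum, beta_sum = np.zeros(p), np.zeros(p)

    for it in range(n_iter):
        # Step 1: beta | gamma, sigma2, y is multivariate normal.
        prior_var = np.where(gamma == 1, (c * tau) ** 2, tau ** 2)
        cov = np.linalg.inv(XtX / sigma2 + np.diag(1.0 / prior_var))
        cov = (cov + cov.T) / 2.0    # remove floating-point asymmetry
        beta = rng.multivariate_normal(cov @ Xty / sigma2, cov)

        # Step 2: sigma2 | beta, y is inverse-gamma (sampled as 1 / gamma).
        resid = y - X @ beta
        sigma2 = 1.0 / rng.gamma(a0 + n / 2.0,
                                 1.0 / (b0 + resid @ resid / 2.0))

        # Step 3: gamma_j | beta_j is Bernoulli; log-odds of slab vs spike.
        log_odds = (np.log(prior_incl / (1.0 - prior_incl)) - np.log(c)
                    + 0.5 * beta ** 2
                    * (1.0 / tau ** 2 - 1.0 / (c * tau) ** 2))
        gamma = (rng.random(p) < 1.0 / (1.0 + np.exp(-log_odds))).astype(int)

        if it >= burn_in:            # discard the burn-in draws
            incl_sum += gamma
            beta_sum += beta

    kept = n_iter - burn_in
    return incl_sum / kept, beta_sum / kept


# Simulated example (made-up data): 3 of 8 predictors affect the response.
rng = np.random.default_rng(1)
X = rng.standard_normal((300, 8))
true_beta = np.array([3.0, 0.0, 0.0, -2.0, 0.0, 0.0, 1.5, 0.0])
y = X @ true_beta + rng.standard_normal(300)

incl_prob, beta_mean = ssvs_gibbs(X, y)
print("inclusion frequency:", np.round(incl_prob, 2))
print("posterior mean beta:", np.round(beta_mean, 2))
```

In a sketch like this, variables whose inclusion frequency is close to 1 would be kept and those close to 0 dropped; a cutoff such as 0.5 is a common but arbitrary choice. A comparison against OLS, Lasso and Ridge, as described in the abstract, would require fitting those models to the same data and is not shown here.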

Article Details

How to Cite
1. Variable Selection Using a Modified Gibbs Sampler Algorithm with Application on Rock Strength Dataset. Baghdad Sci.J [Internet]. 2022 Jun. 1 [cited 2024 Dec. 27];19(3):0551. Available from: https://bsj.uobaghdad.edu.iq/index.php/BSJ/article/view/5159
