Using VGG Models with Intermediate Layer Feature Maps for Static Hand Gesture Recognition
Keywords: Convolutional Neural Networks, Deep Learning, Hand Gesture Recognition, VGG-16, VGG-19.
A hand gesture recognition system provides a robust and innovative solution for nonverbal communication through human–computer interaction. Deep learning models have excellent potential for use in recognition applications. To address the difficulties of this task, most previous studies have proposed new model architectures or fine-tuned pre-trained models. Furthermore, these studies relied on a single standard dataset for both training and testing, so the accuracy they report is reasonable. Unlike these works, the current study investigates two deep learning models (VGG-16 and VGG-19) with intermediate layers for recognizing static hand gesture images. Both models were adjusted to suit each dataset, trained using three different methods, and tested on different datasets. First, the models were initialized with random weights and trained from scratch. Next, the pre-trained models were examined as feature extractors. Finally, the pre-trained models were fine-tuned with intermediate layers. Fine-tuning was conducted at three levels: the fifth, fourth, and third blocks, respectively. The models were evaluated through recognition experiments using Arabic sign language hand gesture images acquired under different conditions. This study also provides a new hand gesture image dataset, which was used in these experiments along with two other datasets. The experimental results indicated that the proposed models can be used with intermediate layers to recognize hand gesture images. Furthermore, analysis of the results showed that fine-tuning the fifth and fourth blocks of the two models achieved the best accuracy. In particular, for the first model, the testing accuracies on the three datasets were 96.51%, 72.65%, and 55.62% when fine-tuning the fourth block and 96.50%, 67.03%, and 61.09% when fine-tuning the fifth block. The testing accuracies of the second model were approximately similar.
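The three training methods described in the abstract can be illustrated as a choice of which convolutional layers are trainable. The following is a minimal sketch, not the authors' implementation. It uses the layer names of the Keras VGG-16 convolutional base, and it assumes that "fine-tuning the N-th block" means unfreezing block N and every later block, which the abstract does not state explicitly. The function names `block_of` and `trainable_flags` and the strategy labels are hypothetical.

```python
# Layer names of the VGG-16 convolutional base as used by keras.applications.VGG16.
VGG16_BASE_LAYERS = [
    "block1_conv1", "block1_conv2", "block1_pool",
    "block2_conv1", "block2_conv2", "block2_pool",
    "block3_conv1", "block3_conv2", "block3_conv3", "block3_pool",
    "block4_conv1", "block4_conv2", "block4_conv3", "block4_pool",
    "block5_conv1", "block5_conv2", "block5_conv3", "block5_pool",
]


def block_of(layer_name):
    """Return the block number encoded in a name such as 'block4_conv2'."""
    return int(layer_name[len("block"):layer_name.index("_")])


def trainable_flags(layer_names, strategy, first_block=None):
    """Map each layer name to True (trainable) or False (frozen).

    strategy (hypothetical labels):
      "scratch"           - random initial weights, every layer is trained
      "feature_extractor" - pre-trained weights, the whole base is frozen
      "fine_tune"         - pre-trained weights; blocks >= first_block are
                            unfrozen (assumed reading of the abstract)
    """
    if strategy == "scratch":
        return {name: True for name in layer_names}
    if strategy == "feature_extractor":
        return {name: False for name in layer_names}
    if strategy == "fine_tune":
        if first_block not in (3, 4, 5):
            raise ValueError("first_block must be 3, 4, or 5")
        return {name: block_of(name) >= first_block for name in layer_names}
    raise ValueError("unknown strategy: " + repr(strategy))


if __name__ == "__main__":
    # Show which layers are unfrozen at each of the three fine-tuning levels.
    for level in (5, 4, 3):
        flags = trainable_flags(VGG16_BASE_LAYERS, "fine_tune", first_block=level)
        unfrozen = [name for name, on in flags.items() if on]
        print("fine-tune from block", level, "->", unfrozen)
```

With a Keras model, the resulting mapping could be applied by setting `layer.trainable` for each layer whose name appears in it. Pooling layers hold no weights, so their flag has no practical effect. The same idea would extend to VGG-19, which has a fourth convolution in each of blocks 3 to 5.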
Received 28/4/2022, Revised 8/10/2022, Accepted 9/10/2022, Published Online First 20/2/2023
Copyright (c) 2023 Baghdad Science Journal
This work is licensed under a Creative Commons Attribution 4.0 International License.