Human Pose Estimation Algorithm Using Optimized Symmetric Spatial Transformation Network

Shengqing Lin
https://orcid.org/0009-0002-9088-1384
Nor Azizah Ali
https://orcid.org/0000-0003-2565-3836
Azlan bin Mohd Zain
https://orcid.org/0000-0003-2004-3289
Muhalim Mohamed Amin

Abstract

Human pose estimation is a central topic in computer vision and an active research area in work on human behavior analysis. It can be understood as the problem of recognizing human key points and connecting them. This paper presents an optimized symmetric spatial transformation network that connects to a single-person pose estimation network and extracts high-quality human target frames from inaccurate human bounding boxes. It also introduces parametric pose non-maximum suppression, which applies an elimination rule to remove redundant and similar poses so that each person yields a unique pose estimate. Experimental results show that the proposed method accurately recognizes human key points, effectively improves the accuracy of human pose estimation, and adapts to complex scenes with dense crowds and occlusion. Finally, the remaining difficulties and possible future trends are described, and the development of the field is presented.
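
To make the idea of pose non-maximum suppression concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify the similarity measure or the elimination rule, so the Gaussian keypoint similarity, the names `Pose`, `pose_similarity`, and `pose_nms`, and the parameters `sigma` and `threshold` are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Keypoint = Tuple[float, float, float]  # (x, y, confidence); hypothetical layout


@dataclass
class Pose:
    keypoints: List[Keypoint]  # same joint order and non-zero length for every pose
    score: float               # overall confidence of this pose candidate


def pose_similarity(a: Pose, b: Pose, sigma: float) -> float:
    """Assumed metric: mean Gaussian closeness of corresponding keypoints, in (0, 1]."""
    total = 0.0
    for (xa, ya, _), (xb, yb, _) in zip(a.keypoints, b.keypoints):
        d2 = (xa - xb) ** 2 + (ya - yb) ** 2
        total += math.exp(-d2 / (2.0 * sigma ** 2))
    return total / len(a.keypoints)


def pose_nms(poses: List[Pose], sigma: float = 10.0,
             threshold: float = 0.5) -> List[Pose]:
    """Greedy suppression: keep the best-scoring pose, drop poses too similar to it."""
    remaining = sorted(poses, key=lambda p: p.score, reverse=True)
    kept: List[Pose] = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        # Assumed elimination rule: similarity at or above the threshold means "same person".
        remaining = [p for p in remaining
                     if pose_similarity(best, p, sigma) < threshold]
    return kept
```

Given two near-identical candidates for one person and a third candidate far away, `pose_nms` keeps the higher-scoring duplicate and the distant pose. In a parametric variant, values such as `sigma` and `threshold` would be tuned on data rather than fixed by hand.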

How to Cite
1. Human Pose Estimation Algorithm Using Optimized Symmetric Spatial Transformation Network. Baghdad Sci.J [Internet]. 2024 Feb. 25 [cited 2025 Jan. 20];21(2(SI)):0755. Available from: https://bsj.uobaghdad.edu.iq/index.php/BSJ/article/view/9775
