Human Pose Estimation Algorithm Using an Optimized Symmetric Spatial Transformation Network
Abstract
Human pose estimation is an important topic in computer vision and has become a research hotspot in many works on human behavior. It can be understood as the problem of detecting human keypoints and connecting them. This paper presents an optimized symmetric spatial transformation network designed to connect to a single-person pose estimation network, so that high-quality human target frames can be proposed from inaccurate human bounding boxes. It also introduces parametric pose non-maximum suppression to eliminate redundant pose estimates, and applies an elimination rule that removes similar poses to obtain unique human pose estimation results. Experimental results show that the proposed method recognizes human keypoints accurately, improves the accuracy of human pose estimation, and adapts to complex scenes with dense crowds and occlusion. Finally, the difficulties and possible future directions are described, and the development of the field is presented.
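The abstract names pose non-maximum suppression and an elimination rule for similar poses but does not give their definitions. The following is a minimal sketch of the general idea only, not the paper's parametric formulation: poses are visited in descending score order, and a pose is dropped when most of its keypoints lie close to those of a pose already kept. The function names `pose_similarity` and `pose_nms`, and the thresholds `dist_thresh` and `match_frac`, are hypothetical and chosen for illustration.

```python
from math import hypot


def pose_similarity(pose_a, pose_b, dist_thresh):
    """Fraction of corresponding keypoints closer than dist_thresh (pixels)."""
    close = sum(
        1
        for (xa, ya), (xb, yb) in zip(pose_a, pose_b)
        if hypot(xa - xb, ya - yb) < dist_thresh
    )
    return close / len(pose_a)


def pose_nms(poses, scores, dist_thresh=10.0, match_frac=0.5):
    """Greedy pose NMS (illustrative): keep the best pose, drop near-duplicates.

    poses  : list of poses, each a list of (x, y) keypoints in the same order
    scores : one confidence score per pose
    Returns the indices of the kept poses, highest score first.
    dist_thresh and match_frac are assumed values, not taken from the paper.
    """
    order = sorted(range(len(poses)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        # Keep pose i only if it is dissimilar to every pose kept so far.
        if all(
            pose_similarity(poses[i], poses[j], dist_thresh) < match_frac
            for j in kept
        ):
            kept.append(i)
    return kept


poses = [
    [(100, 100), (120, 140), (110, 180)],  # person A
    [(102, 101), (121, 138), (111, 183)],  # near-duplicate of person A
    [(300, 90), (320, 130), (310, 170)],   # person B
]
scores = [0.9, 0.6, 0.8]
kept = pose_nms(poses, scores)
print(kept)  # [0, 2]
```

In this toy input the second pose duplicates the first with a lower score, so only the first and third poses survive. A parametric variant would typically learn or tune the similarity function and thresholds from data rather than fix them by hand.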
Received 30/09/2023
Revised 10/02/2024
Accepted 12/02/2024
Published 25/02/2024
Article Details
This work is licensed under a Creative Commons Attribution 4.0 International License.