Generative Adversarial Network for Imitation Learning from Single Demonstration

Tho Nguyen Duc; Chanh Minh Tran; Phan Xuan Tan; Eiji Kamioka

doi:10.21123/bsj.2021.18.4(Suppl.).1350

Published: Dec 20, 2021

Keywords:

Deep learning, Few-shot learning, Generative adversarial network, Imitation learning, One-shot learning

Tho Nguyen Duc

College of Engineering and Science, Shibaura Institute of Technology, Japan.

https://orcid.org/0000-0002-8451-2470

Chanh Minh Tran

College of Engineering and Science, Shibaura Institute of Technology, Japan.

Phan Xuan Tan

College of Engineering and Science, Shibaura Institute of Technology, Japan.

Eiji Kamioka

College of Engineering and Science, Shibaura Institute of Technology, Japan.

https://orcid.org/0000-0003-2155-4507

Abstract

Imitation learning is an effective method for training an autonomous agent to accomplish a task by imitating the behavior of experts in their demonstrations. However, traditional imitation learning methods require a large number of expert demonstrations in order to learn a complex behavior. This drawback has limited the potential of imitation learning in complex tasks where expert demonstrations are not sufficient. To address this problem, a Generative Adversarial Network-based model is proposed that is designed to learn optimal policies using only a single demonstration. The proposed model is evaluated on two simulated tasks in comparison with other methods. The results show that the proposed model is able to complete the considered tasks despite the limit on the number of expert demonstrations, which clearly indicates the potential of the model.

Received 15/10/2021

Accepted 14/11/2021

How to Cite

Generative Adversarial Network for Imitation Learning from Single Demonstration. Baghdad Sci.J [Internet]. 2021 Dec 20 [cited 2024 May 17];18(4(Suppl.)):1350. Available from: https://bsj.uobaghdad.edu.iq/index.php/BSJ/article/view/6652

Issue

Vol. 18 No. 4(Suppl.) (2021): Supplement Issue 4

Section

article

This work is licensed under a Creative Commons Attribution 4.0 International License.

Journal Info
Journal: Baghdad Science Journal
Publisher: College of Science for Women/ University of Baghdad
Baghdad Sci. J. is peer-reviewed and open access
Print ISSN: 2078-8665
Electronic ISSN: 2411-7986
Publishing Frequency: Quarterly (from 2004 - 2021) Bi-monthly (from 2022) Monthly (from 2024)
Launched Date: 2004
Abbreviation: Baghdad Sci.J.
Each published paper in Baghdad Sci. J. has a digital object identifier (DOI) number

© 2022 The Author(s). Published by College of Science for Women, University of Baghdad. This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
