Generative Adversarial Network for Imitation Learning from Single Demonstration
Abstract
Imitation learning is an effective method for training an autonomous agent to accomplish a task by imitating the behavior shown in expert demonstrations. However, traditional imitation learning methods require a large number of expert demonstrations to learn a complex behavior. This requirement limits the use of imitation learning in complex tasks for which sufficient expert demonstrations are not available. To address this problem, we propose a Generative Adversarial Network-based model designed to learn optimal policies from only a single demonstration. The proposed model is evaluated on two simulated tasks and compared with other methods. The results show that the proposed model can complete the considered tasks despite the limited number of expert demonstrations, which indicates the potential of the model.
Received 15/10/2021
Accepted 14/11/2021
This work is licensed under a Creative Commons Attribution 4.0 International License.