Abstract
© 2016 IEEE. In this paper we present a new concept of self-reflection learning to support a deep reinforcement learning model. The self-reflective process occurs offline between episodes to help the agent learn to navigate towards a goal location and to boost its online performance. In particular, the best experience so far is recalled and compared with other similar but suboptimal episodes to reemphasize worthy decisions and deemphasize unworthy ones using eligibility and learning traces. At the same time, relatively bad experience is forgotten to remove its confusing effect. We set up a layer-wise deep actor-critic architecture and apply the self-reflection process to train it. We show that the self-reflective model works well, and initial experimental results on a real robot show that the agent achieved a good success rate in reaching the goal location.
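The abstract does not give the paper's layer-wise deep architecture or its exact trace formulation, so the following is only a minimal tabular sketch of the general idea it describes: an actor-critic trained online with eligibility traces, plus an assumed offline "self-reflection" pass between episodes that reemphasizes state-action choices from the best episode so far and deemphasizes conflicting choices from recent suboptimal episodes. The corridor environment, hyperparameters, and the bonus/penalty rule are all illustrative assumptions, not the authors' method.

```python
import numpy as np

# Illustrative sketch only: a toy corridor task with a tabular
# actor-critic; the self_reflect() rule below is an assumed,
# simplified stand-in for the paper's self-reflection process.

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS = 10, 2            # corridor states; actions: left/right
GOAL = N_STATES - 1
GAMMA, LAM, ALPHA, BETA = 0.95, 0.9, 0.1, 0.05

V = np.zeros(N_STATES)                 # critic: state values
H = np.zeros((N_STATES, N_ACTIONS))    # actor: action preferences

def softmax(h):
    e = np.exp(h - h.max())
    return e / e.sum()

def step(s, a):
    s2 = min(max(s + (1 if a == 1 else -1), 0), N_STATES - 1)
    r = 1.0 if s2 == GOAL else -0.01   # small step cost, goal reward
    return s2, r, s2 == GOAL

def run_episode(max_steps=100):
    """One online actor-critic episode with eligibility traces."""
    global V, H
    s, traj, ret = 0, [], 0.0
    zv = np.zeros_like(V)              # critic eligibility trace
    zh = np.zeros_like(H)              # actor eligibility trace
    for _ in range(max_steps):
        pi = softmax(H[s])
        a = rng.choice(N_ACTIONS, p=pi)
        s2, r, done = step(s, a)
        delta = r + (0.0 if done else GAMMA * V[s2]) - V[s]
        zv *= GAMMA * LAM
        zv[s] += 1.0
        zh *= GAMMA * LAM
        zh[s] -= pi                    # grad log pi for a softmax actor:
        zh[s, a] += 1.0                # 1[b == a] - pi(b|s)
        V += ALPHA * delta * zv
        H += BETA * delta * zh
        traj.append((s, a))
        ret += r
        s = s2
        if done:
            break
    return traj, ret

def self_reflect(best, others, bonus=0.05):
    """Assumed offline pass between episodes: reemphasize decisions
    taken in the best episode so far, deemphasize conflicting ones
    from similar but suboptimal episodes."""
    global H
    for s, a in best:
        H[s, a] += bonus               # reemphasize worthy decisions
    for traj in others:
        for s, a in traj:
            if (s, a) not in best:
                H[s, a] -= bonus       # deemphasize / forget unworthy ones

episodes, best_traj, best_ret = [], None, -np.inf
for ep in range(50):
    traj, ret = run_episode()
    episodes.append((traj, ret))
    if ret > best_ret:
        best_traj, best_ret = traj, ret
    # reflect against the most recent suboptimal episodes
    others = [t for t, r in episodes[-5:] if r < best_ret]
    self_reflect(set(best_traj), others)

print(f"best return after training: {best_ret:.2f}")
```

In this toy version, the reflection step acts directly on the actor's preferences; the paper instead works with eligibility and learning traces inside a deep network, which the abstract alone does not specify.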
More Information
| Field | Value |
|---|---|
| Identification Number | https://doi.org/10.1109/IJCNN.2016.7727798 |
| Status | Published |
| Refereed | Yes |
| Publisher | IEEE |
| Additional Information | Electronic ISBN: 978-1-5090-0620-5; USB ISBN: 978-1-5090-0619-9; Print on Demand (PoD) ISBN: 978-1-5090-0621-2 |
| Depositing User | Deposited by Altahhan, Abdulrahman |
| Date Deposited | 12 Dec 2017 17:13 |
| Last Modified | 11 Jul 2024 12:04 |
| Item Type | Article |