Paper Abstract and Keywords |
Presentation |
2015-06-23 16:35
Inverse reinforcement learning based on behaviors of a learning agent Shunsuke Sakurai, Shigeyuki Oba, Shin Ishii (Kyoto Univ.) IBISML2015-15 |
Abstract |
(in English) |
Designing an appropriate reward function is important for reinforcement learning to efficiently obtain an optimal policy that achieves an intended goal,
because different reward functions for the same goal can lead to different convergence speeds of learning.
However, there is no systematic way to determine a good reward function for an arbitrary environment.
How can we imitate the training strategy of a reference agent that efficiently adapts to occasional changes in the environment?
In this study, we extend the apprenticeship learning framework to accept state-action history data from a developing agent whose policy is not yet optimal but is changing toward the optimum.
With this extension, the reward function is estimated by inverse reinforcement learning from the estimated change in the policy of a developing reference agent,
and the objective agent can imitate the policy learning process of the reference agent using the estimated reward function.
We applied the proposed method to estimate the reward function of a developing agent trained on a simple 2-state Markov decision process (MDP) and showed that the process of determining the optimal policy is imitated under the reward estimated by the proposed method. |
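The abstract's opening claim, that different reward functions for the same goal can change how quickly learning converges, can be illustrated with a small sketch. This is not the authors' method or their experimental setup: the 2-state MDP, its deterministic stay/move transitions, the two reward tables, the discount factor, and the use of synchronous Q-value sweeps as a stand-in for a learning agent are all assumptions made only for illustration.

```python
# Hypothetical 2-state MDP: action 0 = stay, action 1 = move to the other state.
GAMMA = 0.9          # assumed discount factor
STATES = (0, 1)
ACTIONS = (0, 1)


def next_state(s, a):
    """Deterministic transition: stay keeps the state, move flips it."""
    return s if a == 0 else 1 - s


def greedy(q):
    """Greedy policy (one action per state) with respect to a Q-table."""
    return tuple(max(ACTIONS, key=lambda a: q[s][a]) for s in STATES)


def sweeps_until_optimal(reward, target, max_sweeps=100):
    """Count synchronous Q-value sweeps until the greedy policy equals target."""
    q = [[0.0, 0.0], [0.0, 0.0]]
    for sweep in range(1, max_sweeps + 1):
        q = [[reward[s][a] + GAMMA * max(q[next_state(s, a)]) for a in ACTIONS]
             for s in STATES]
        if greedy(q) == target:
            return sweep
    return None


# Both tables (indexed reward[state][action]) share the same optimal policy:
# move out of state 0, then stay in state 1.
REWARD_DIRECT = [[0.0, 1.0], [1.0, 0.0]]      # rewards the move toward state 1
REWARD_MISLEADING = [[0.5, 0.0], [1.0, 0.0]]  # small bonus for idling in state 0
TARGET = (1, 0)

sweeps_direct = sweeps_until_optimal(REWARD_DIRECT, TARGET)
sweeps_misleading = sweeps_until_optimal(REWARD_MISLEADING, TARGET)
print(sweeps_direct, sweeps_misleading)
```

Under these assumptions the misleading reward needs more sweeps before the greedy policy matches the shared optimal policy, because its immediate bonus for staying in state 0 initially outweighs the discounted value of reaching state 1.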
Keyword |
(in English) |
reinforcement learning / inverse reinforcement learning / apprenticeship learning / learning process |
Reference Info. |
IEICE Tech. Rep., vol. 115, no. 112, IBISML2015-15, pp. 95-99, June 2015. |
Paper # |
IBISML2015-15 |
Date of Issue |
2015-06-16 (IBISML) |
ISSN |
Print edition: ISSN 0913-5685 Online edition: ISSN 2432-6380 |
Copyright and reproduction |
All rights are reserved and no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Notwithstanding, instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. (License No.: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034) |
Conference Information |
Committee |
NC IPSJ-BIO IBISML IPSJ-MPS |
Conference Date |
2015-06-23 - 2015-06-25 |
Place (in English) |
Okinawa Institute of Science and Technology |
Topics (in English) |
Machine Learning Approach to Biodata Mining, and General |
Paper Information |
Registration To |
IBISML |
Conference Code |
2015-06-NC-BIO-IBISML-MPS |
Language |
Japanese |
Title (in English) |
Inverse reinforcement learning based on behaviors of a learning agent |
Keyword(1) |
reinforcement learning |
Keyword(2) |
inverse reinforcement learning |
Keyword(3) |
apprenticeship learning |
Keyword(4) |
learning process |
1st Author's Name |
Shunsuke Sakurai |
1st Author's Affiliation |
Kyoto University (Kyoto Univ.) |
2nd Author's Name |
Shigeyuki Oba |
2nd Author's Affiliation |
Kyoto University (Kyoto Univ.) |
3rd Author's Name |
Shin Ishii |
3rd Author's Affiliation |
Kyoto University (Kyoto Univ.) |
Speaker |
Author-1 |
Date Time |
2015-06-23 16:35:00 |
Presentation Time |
25 minutes |
Registration for |
IBISML |
Paper # |
IBISML2015-15 |
Volume (vol) |
vol.115 |
Number (no) |
no.112 |
Page |
pp.95-99 |
#Pages |
5 |
Date of Issue |
2015-06-16 (IBISML) |
|