TY - GEN
T1 - Information-Directed Policy Search in Sparse-Reward Settings via the Occupancy Information Ratio
AU - Suttle, Wesley A.
AU - Koppel, Alec
AU - Liu, Ji
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - This paper examines a new measure of the exploration/exploitation trade-off in reinforcement learning (RL) called the occupancy information ratio (OIR). To this end, the paper derives the Information-Directed Actor-Critic (IDAC) algorithm for solving the OIR problem, provides an overview of the rich theory underlying IDAC and related OIR policy gradient methods, and experimentally investigates the advantages of such methods. The central contribution of this paper is to provide empirical evidence that, due to the form of the OIR objective, IDAC enjoys superior performance over vanilla RL methods in sparse-reward environments.
KW - exploration vs. exploitation
KW - reinforcement learning
KW - sparse rewards
UR - https://www.scopus.com/pages/publications/85154040212
U2 - 10.1109/CISS56502.2023.10089655
DO - 10.1109/CISS56502.2023.10089655
M3 - Conference contribution
T3 - 2023 57th Annual Conference on Information Sciences and Systems, CISS 2023
BT - 2023 57th Annual Conference on Information Sciences and Systems, CISS 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 57th Annual Conference on Information Sciences and Systems, CISS 2023
Y2 - 22 March 2023 through 24 March 2023
ER -