Offline Reinforcement Learning for Price-Based Demand Response Program Design

  • Stony Brook University
  • Amazon.com, Inc.

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

4 Scopus citations

Abstract

This paper studies price-based demand response (DR) program design by offline Reinforcement Learning (RL) using data collected from smart meters. Unlike online RL approaches, offline RL does not need to interact with consumers in the real world and therefore offers substantial cost-effectiveness and safety advantages. The problem is formulated as a sequential decision-making process within a Markov Decision Process (MDP) framework. A novel data augmentation method based on bootstrapping is developed. Deep Q-network (DQN)-based offline RL and policy evaluation algorithms are developed to design high-performance DR pricing policies. The developed offline learning methods are evaluated on both a real-world data set and simulation environments. It is demonstrated that the developed offline RL methods achieve performance very close to the ideal performance bound provided by state-of-the-art online RL algorithms.
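The offline setting described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not specify the bootstrapping scheme or the DQN architecture, so the sketch below substitutes an ordinary bootstrap (resampling logged transitions with replacement) and tabular Q-learning in place of a DQN. The toy states (demand level), actions (price tier), rewards, and the function names `bootstrap_transitions` and `offline_q_learning` are all hypothetical.

```python
import random


def bootstrap_transitions(transitions, n_samples, seed=0):
    """Resample logged transitions with replacement (ordinary bootstrap).

    Assumption: a stand-in for the paper's augmentation method, whose
    details are not given in the abstract.
    """
    rng = random.Random(seed)
    return [rng.choice(transitions) for _ in range(n_samples)]


def offline_q_learning(transitions, n_states, n_actions,
                       gamma=0.9, lr=0.1, epochs=200):
    """Tabular Q-learning over a fixed dataset, with no environment interaction.

    Assumption: a tabular substitute for the DQN named in the abstract.
    """
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(epochs):
        for s, a, r, s_next in transitions:
            # Standard Q-learning target computed from the logged transition.
            target = r + gamma * max(q[s_next])
            q[s][a] += lr * (target - q[s][a])
    return q


# Hypothetical toy log of (state, action, reward, next_state) tuples:
# state = demand level (0 low, 1 high), action = price tier (0 low, 1 high).
logged = [
    (0, 0, 1.0, 0),
    (0, 1, 0.2, 0),
    (1, 0, 0.1, 1),
    (1, 1, 1.0, 0),
]

augmented = bootstrap_transitions(logged, n_samples=200)
q_table = offline_q_learning(augmented, n_states=2, n_actions=2)

# Greedy pricing policy: the best price tier for each demand level.
policy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(2)]
print(policy)
```

The point of the sketch is that the pricing policy is learned entirely from a fixed log of past transitions, so no consumer is exposed to exploratory prices during training.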

Original language: English
Title of host publication: 2023 57th Annual Conference on Information Sciences and Systems, CISS 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665451819
DOIs
State: Published - 2023
Event: 57th Annual Conference on Information Sciences and Systems, CISS 2023 - Baltimore, United States
Duration: Mar 22 2023 to Mar 24 2023

Publication series

Name: 2023 57th Annual Conference on Information Sciences and Systems, CISS 2023

Conference

Conference: 57th Annual Conference on Information Sciences and Systems, CISS 2023
Country/Territory: United States
City: Baltimore
Period: 03/22/23 to 03/24/23
