University of Exeter

Prioritized experience replay based deep distributional reinforcement learning for battery operation in microgrids

journal contribution
posted on 2025-08-02, 11:07 authored by DK Panda, O Turner, S Das, M Abusara
Reinforcement Learning (RL) provides a pathway for efficiently utilizing battery storage in a microgrid. However, the traditional value-based RL algorithms used in battery management formulate their policies from the expectation of the reward rather than its probability distribution, so the resulting scheduling strategy does not account for the spread of possible outcomes. This paper focuses on a scheduling strategy based on the probability distribution of the rewards, which better reflects the uncertainties in the incoming dataset. Furthermore, prioritized experience replay is used to sample the training experience, enhancing the quality of learning by reducing bias. Results are obtained with different variants of distributional RL algorithms: C51, Quantile Regression Deep Q-Network (QR-DQN), Fully Parameterized Quantile Function (FQF), Implicit Quantile Networks (IQN) and Rainbow. These results are compared with the traditional deep Q-learning algorithm with prioritized experience replay. The convergence results on the training dataset are further analyzed by varying the action spaces, by using randomized experience replay, and by excluding the tariff-based action while enforcing penalties for violating battery state-of-charge (SoC) limits. The best trained Q-network is tested with different load and photovoltaic (PV) profiles to obtain the battery operation and costs. The performance of the distributional RL algorithms is analyzed under different Time of Use (ToU) tariff schemes. QR-DQN with prioritized experience replay is found to be the best-performing algorithm in terms of convergence on the training dataset, with the least fluctuation on the validation dataset and in battery operation across the different tariff regimes during the day.
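For readers unfamiliar with prioritized experience replay, the following is a minimal illustrative sketch of the proportional variant in plain Python. It is not the authors' implementation, which this record does not include. The class name, the list-based storage, and the default values of `alpha`, `beta` and `eps` are all assumptions chosen for illustration. Transitions with larger temporal-difference (TD) error are sampled more often, and importance-sampling weights compensate for the bias that non-uniform sampling would otherwise introduce.

```python
import random


class PrioritizedReplayBuffer:
    """Illustrative proportional prioritized replay (list-based, O(n) sampling)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6, seed=None):
        self.capacity = capacity
        self.alpha = alpha      # assumed default; 0 would give uniform sampling
        self.eps = eps          # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0            # next slot to overwrite once the buffer is full
        self.rng = random.Random(seed)

    def add(self, transition):
        # New transitions receive the current maximum priority so that
        # each one is likely to be replayed at least once.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        n = len(self.data)
        idx = self.rng.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights, normalized by the largest weight
        # in the batch so that they only ever scale updates downwards.
        weights = [(n * probs[i]) ** (-beta) for i in idx]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idx, [self.data[i] for i in idx], weights

    def update_priorities(self, indices, td_errors):
        # After a learning step, priority tracks the magnitude of the TD error.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a training loop, each sampled transition's loss would be multiplied by its weight before the gradient step, and `update_priorities` would then be called with the new TD errors. A production implementation would typically replace the lists with a sum-tree so that sampling costs O(log n) rather than O(n).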

Funding

05R16P00282

05R18P02820

European Regional Development Fund

Rights

© 2023. Open access under the Creative Commons Attribution 4.0 International licence: https://creativecommons.org/licenses/by/4.0/

Notes

This is the author accepted manuscript. The final version is available on open access from Elsevier via the DOI in this record. Data availability: Data will be made available on request.

Journal

Journal of Cleaner Production

Pagination

139947-139947

Publisher

Elsevier

Version

  • Accepted Manuscript

Language

en

FCD date

2023-12-04T11:18:15Z

FOA date

2023-12-04T11:21:37Z

Citation

Article 139947

Department

  • Earth and Environmental Sciences
