Reactive control of a two-body point absorber using reinforcement learning
© 2017 The Authors. Published by Elsevier Ltd. Open Access funded by Engineering and Physical Sciences Research Council. Under a Creative Commons license: https://creativecommons.org/licenses/by/4.0/
In this article, reinforcement learning is used to obtain optimal reactive control of a two-body point absorber. In particular, the Q-learning algorithm is adopted for the maximization of the energy extraction in each sea state. The controller damping and stiffness coefficients are varied in steps, observing the associated reward, which corresponds to an increase in the absorbed power, or penalty, owing to large displacements. The generated power is averaged over a time horizon spanning several wave cycles due to the periodicity of ocean waves, discarding the transient effects at the start of each new episode. The model of a two-body point absorber is developed in order to validate the control strategy in both regular and irregular waves. In all analysed sea states, the controller learns the optimal damping and stiffness coefficients. Furthermore, the scheme is independent of internal models of the device response, which means that it can adapt to variations in the unit dynamics with time and does not present modelling errors.
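The stepwise search over damping and stiffness coefficients described in the abstract can be illustrated with a minimal tabular Q-learning sketch. It is not the authors' implementation. The coefficient grids, the quadratic power surrogate, the displacement limit, the penalty value and all function names are hypothetical, chosen only to make the example self-contained. In particular, the reward here is simply the averaged power in the new state (or a fixed penalty), a simplification of the reward described above, and no hydrodynamic model is involved.

```python
import random

# Hypothetical normalised coefficient grids (not taken from the article).
DAMPING = [0.5, 1.0, 1.5, 2.0, 2.5]
STIFFNESS = [-1.0, -0.5, 0.0, 0.5, 1.0]
# Actions are index steps (damping, stiffness): stay, or move one grid step.
ACTIONS = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
DISPLACEMENT_LIMIT = 1.5  # assumed limit, arbitrary units
PENALTY = -2.0            # assumed penalty for exceeding the limit


def mean_power(state):
    """Toy surrogate for power averaged over several wave cycles."""
    b, k = DAMPING[state[0]], STIFFNESS[state[1]]
    return 1.0 - (b - 1.5) ** 2 - (k + 0.5) ** 2


def reward(state):
    """Averaged power, or a penalty when the toy displacement is too large."""
    displacement = 1.0 / DAMPING[state[0]]  # toy rule: low damping, large motion
    return PENALTY if displacement > DISPLACEMENT_LIMIT else mean_power(state)


def step(state, action):
    """Apply a coefficient step, clipped to the grid; return (next_state, reward)."""
    i = min(max(state[0] + action[0], 0), len(DAMPING) - 1)
    j = min(max(state[1] + action[1], 0), len(STIFFNESS) - 1)
    return (i, j), reward((i, j))


def train(episodes=2000, steps=50, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning with an epsilon-greedy exploration policy."""
    rng = random.Random(seed)
    states = [(i, j) for i in range(len(DAMPING)) for j in range(len(STIFFNESS))]
    q = {s: [0.0] * len(ACTIONS) for s in states}
    for _ in range(episodes):
        state = rng.choice(states)  # each episode starts from a random setting
        for _ in range(steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda n: q[state][n])  # exploit
            nxt, r = step(state, ACTIONS[a])
            # Standard Q-learning update towards the bootstrapped target.
            q[state][a] += alpha * (r + gamma * max(q[nxt]) - q[state][a])
            state = nxt
    return q


def greedy_rollout(q, state, steps=20):
    """Follow the learned greedy policy and return the final grid indices."""
    for _ in range(steps):
        a = max(range(len(ACTIONS)), key=lambda n: q[state][n])
        state, _ = step(state, ACTIONS[a])
    return state


if __name__ == "__main__":
    q_table = train()
    i, j = greedy_rollout(q_table, (0, 0))
    print("damping:", DAMPING[i], "stiffness:", STIFFNESS[j])
```

Because the agent only observes rewards, the surrogate could in principle be swapped for measured or simulated averaged power without changing the learning loop, which mirrors the model-free property claimed in the abstract.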
The authors would like to thank the Energy Technologies Institute (ETI) and the Research Councils Energy Programme (RCEP) for funding this research as part of the IDCORE programme (EP/J500847/), as well as the Engineering and Physical Sciences Research Council (EPSRC) (grant EP/J500847/1). In addition, Mr. Anderlini would like to thank Wave Energy Scotland (WES) for sponsoring his Eng.D. research project. WES is taking an innovative approach to supporting the development of wave energy technology by managing the most extensive technology programme of its kind in the sector, concentrating on key areas which have been identified as having the most potential impact on the long-term levellised cost of energy and improved commercial viability.
This is the author accepted manuscript. The final version is available from Elsevier via the DOI in this record.
Published online 24 August 2017