In this article, reinforcement learning is used to obtain optimal reactive control
of a two-body point absorber. In particular, the Q-learning algorithm
is adopted for the maximization of the energy extraction in each sea state.
The controller damping and stiffness coefficients are varied in steps, observing
the associated reward, which corresponds to an increase in the absorbed
power, or penalty, owing to large displacements. The generated power is
averaged over a time horizon spanning several wave cycles due to the periodicity
of ocean waves, discarding the transient effects at the start of each
new episode. The model of a two-body point absorber is developed in order
to validate the control strategy in both regular and irregular waves. In all
analysed sea states, the controller learns the optimal damping and stiffness
coefficients. Furthermore, the scheme is independent of internal models of
the device response, which means that it can adapt to variations in the unit
dynamics over time and is not subject to modelling errors.
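
As an illustration only, the learning loop described above might be sketched as tabular Q-learning over a grid of damping and stiffness indices. Everything in this sketch is an assumption made for the example rather than the article's implementation: the grid size, the five-action step set, the learning parameters, the fixed penalty value, and in particular `toy_episode`, a hypothetical stand-in for the two-body point absorber simulation.

```python
import random

# Assumed action set: keep both coefficients, or step one of them up or down.
ACTIONS = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]


def mean_power_after_transient(power_samples, n_discard):
    """Average power over the horizon, skipping the first n_discard samples."""
    kept = power_samples[n_discard:]
    return sum(kept) / len(kept)


def reward(mean_power, max_displacement, displacement_limit, penalty=-1.0):
    """Mean absorbed power, or a fixed penalty when the displacement is too large."""
    return penalty if max_displacement > displacement_limit else mean_power


def q_update(q, state, action, r, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning update, applied in place to the table q."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (r + gamma * best_next - q[state][action])


def step(state, action, n_b, n_k):
    """Move on the (damping, stiffness) index grid, clamped to its edges."""
    di, dj = ACTIONS[action]
    i = min(max(state[0] + di, 0), n_b - 1)
    j = min(max(state[1] + dj, 0), n_k - 1)
    return (i, j)


def toy_episode(state):
    """Hypothetical stand-in for a device simulation (not the article's model).

    Returns a power time series (zero during an assumed 5-sample transient)
    and a maximum displacement for the given coefficient indices.
    """
    i, j = state
    steady_power = 10.0 - (i - 3) ** 2 - (j - 2) ** 2
    power = [0.0] * 5 + [steady_power] * 20
    max_disp = 3.0 if i == 0 else 1.0  # assume very low damping gives large motions
    return power, max_disp


def train(n_b=6, n_k=5, n_steps=5000, epsilon=0.2, seed=0):
    """Epsilon-greedy Q-learning over the coefficient grid; returns the Q-table."""
    rng = random.Random(seed)
    q = {(i, j): [0.0] * len(ACTIONS) for i in range(n_b) for j in range(n_k)}
    state = (0, 0)
    for _ in range(n_steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(ACTIONS))  # explore
        else:
            action = max(range(len(ACTIONS)), key=lambda a: q[state][a])  # exploit
        next_state = step(state, action, n_b, n_k)
        power, max_disp = toy_episode(next_state)
        r = reward(mean_power_after_transient(power, 5), max_disp, 2.0)
        q_update(q, state, action, r, next_state)
        state = next_state
    return q
```

Calling `train()` returns the Q-table; the greedy action at each grid point then indicates in which direction to step the coefficients. Note that the update uses only observed rewards, so no model of the device response enters the learning rule itself.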