Dynamic Cache Replacement in Edge Computing via Offline Deep Reinforcement Learning
Author: Wang, Z.
Date: 11 November 2024
Type: Thesis or dissertation
Publisher: University of Exeter
Degree Title: PhD in Computer Science
Abstract
The ever-increasing data traffic generated by the widespread adoption of mobile devices and the proliferation of emerging smart applications places a significant burden on backhaul networks, hindering the deployment of delay-sensitive services for end users. As one of the core technologies of mobile edge computing (MEC), edge caching is a promising way to tackle this issue: edge servers close to users are equipped with storage resources so that duplicate contents need not be retrieved from the remote cloud and services can be provided nearby. A fundamental problem of edge caching in MEC networks is how to replace the contents cached in edge servers with limited storage capacity so as to meet the dynamic requirements of users whose preferences are not known in advance. Recently, online deep reinforcement learning (DRL)-based methods have been developed to address this problem by learning an edge cache replacement policy from samples collected through continuous trial-and-error interaction with the environment. In practice, however, the online data collection phase is often expensive and time-consuming, which hinders the practical deployment of online DRL-based methods. Offline DRL overcomes this shortcoming by learning the policy solely from a previously collected static dataset, without additional online data collection during training.
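As a concrete illustration of the offline DRL setting described above, the sketch below trains a cache replacement value function purely from a logged dataset, using a conservative (CQL-style) penalty to keep the learned policy close to actions supported by the data. This is a minimal sketch, not the method developed in the thesis; the state encoding, network sizes, dataset interface, and hyperparameters are all illustrative assumptions.

```python
# Minimal sketch: offline Q-learning with a conservative (CQL-style) penalty
# for single-server cache replacement. All names, shapes, and hyperparameters
# here are assumptions for illustration, not the thesis's exact method.
import torch
import torch.nn as nn
import torch.nn.functional as F

CACHE_SIZE = 8               # assumed number of cache slots
STATE_DIM = CACHE_SIZE + 1   # assumed features: cached item ids + requested item id
N_ACTIONS = CACHE_SIZE + 1   # evict one of the slots, or skip caching (last action)

class QNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 128), nn.ReLU(),
            nn.Linear(128, N_ACTIONS))
    def forward(self, s):
        return self.net(s)

q, q_target = QNet(), QNet()
q_target.load_state_dict(q.state_dict())
opt = torch.optim.Adam(q.parameters(), lr=1e-3)

def offline_update(batch, gamma=0.99, alpha=1.0):
    """One gradient step using only a batch from the static dataset."""
    s, a, r, s2 = batch  # transitions logged by the behavior policy
    with torch.no_grad():
        target = r + gamma * q_target(s2).max(dim=1).values
    q_sa = q(s).gather(1, a.unsqueeze(1)).squeeze(1)
    td_loss = F.mse_loss(q_sa, target)
    # Conservative term: push down Q-values on out-of-dataset actions so the
    # learned policy does not exploit actions the behavior policy never tried.
    cql_loss = (torch.logsumexp(q(s), dim=1) - q_sa).mean()
    loss = td_loss + alpha * cql_loss
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Example call with a random synthetic batch (stand-in for the logged dataset):
s = torch.randn(32, STATE_DIM); a = torch.randint(0, N_ACTIONS, (32,))
r = torch.rand(32); s2 = torch.randn(32, STATE_DIM)
offline_update((s, a, r, s2))
```

In a two-stage scheme of the kind the abstract describes, a network pretrained this way would subsequently be fine-tuned with a small number of online interactions.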
This thesis aims to develop offline DRL-based methods for the cache replacement problem in MEC networks and thereby advance the application of DRL in the real world. First, it proposes a two-stage offline-online DRL-based method, consisting of offline training followed by online tuning, to solve the reactive cache replacement problem in an MEC network with a single server for the maximal cache hit ratio. Experimental results show that the policy trained offline from a static dataset outperforms the behavior policy used to sample the dataset, and that online tuning further improves it to reach the performance of a policy trained by an online DRL-based method while using far fewer online samples. The thesis then extends the single-server edge caching environment to a cooperative scenario that is more complex and realistic. A centralized offline DRL-based method is developed to coordinate multiple servers to optimize service delay effectively; it is combined with a multi-head convolutional neural network (CNN) that processes the servers' information in parallel for high training and inference efficiency. Because a centralized offline DRL method struggles with state and action spaces that grow exponentially with the number of servers, the thesis finally designs a proactive offline multi-agent DRL (MADRL)-based cache replacement method that jointly optimizes the service delay of users and the energy consumption of servers. Experimental results demonstrate that the offline MADRL-based method successfully learns an improved policy from a static dataset in large-scale MEC networks.
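The multi-head design mentioned above can be sketched as follows: a shared trunk encodes the joint state of all servers once, and one output head per server produces that server's action values in a single forward pass. This is a minimal illustration under assumed dimensions, not the thesis's exact architecture.

```python
# Illustrative multi-head network for a centralized controller: one shared
# trunk, one head per edge server. Layer sizes, the 1-D convolutional
# encoding, and all dimensions below are assumptions for illustration.
import torch
import torch.nn as nn

N_SERVERS = 4         # assumed number of cooperating edge servers
FEAT_PER_SERVER = 16  # assumed per-server feature length (cache + request stats)
N_ACTIONS = 9         # assumed actions per server (evict slot 1..8, or no-op)

class MultiHeadCacheNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Treat servers as "channels" so one conv pass covers all of them.
        self.trunk = nn.Sequential(
            nn.Conv1d(N_SERVERS, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Flatten())
        self.heads = nn.ModuleList(
            nn.Linear(32 * FEAT_PER_SERVER, N_ACTIONS) for _ in range(N_SERVERS))
    def forward(self, joint_state):
        # joint_state: (batch, N_SERVERS, FEAT_PER_SERVER)
        z = self.trunk(joint_state)
        return torch.stack([head(z) for head in self.heads], dim=1)

net = MultiHeadCacheNet()
q_values = net(torch.randn(2, N_SERVERS, FEAT_PER_SERVER))
print(q_values.shape)  # (2, 4, 9): per-server action values in one pass
```

Encoding the joint state once and fanning out to per-server heads amortizes feature extraction across servers, which is the source of the training and inference efficiency noted above; the MADRL variant instead gives each server its own agent so the joint action space no longer grows exponentially.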