Towards Transferable Deep Reinforcement Learning Policies with High Post-Deployment Performance in Smart Grids

Cristian Sebastian Cubides-Herrera, Lana Amaya, and Markus Duchon

2025 IEEE PES Innovative Smart Grid Technologies Conference Europe (ISGT Europe),

October 2025 · doi: 10.1109/ISGTEurope64741.2025.11305410

Abstract

Reinforcement learning controllers trained in idealized smart-grid simulators often fail when deployed on real networks because of mismatches in the underlying operational dynamics: battery efficiencies are commonly neglected in simulation, and charge sensors introduce random noise and biases. Our proposed two-layer transfer learning framework bridges this gap without requiring prior knowledge of the physical system (a model-agnostic approach). The first layer is a domain classifier trained on the run that reshapes the reward function during training to discourage actions leading to situations that are unlikely in the target domain, thereby steering learning towards behaviors that remain optimal on the real system. The second layer is a recurrent denoising autoencoder, trained on offline data collected during scheduled operation, that removes systematic and stochastic errors from the observation space before control decisions are made. The method was tested on a microgrid composed of a photovoltaic module, an electrical load, and a battery, under three battery-efficiency conditions and 10 random seeds. The combined method achieved energy cost savings of 2.6% at a real efficiency of 80%, 17.5% at 65%, and 10.2% at 50%, while keeping the system within safety voltage limits. Statistical significance tests confirm reliable simulation-to-reality transfer for energy networks. The overall transfer learning approach is designed to extend to grids with more nodes and to different optimization objectives.

url: https://ieeexplore.ieee.org/document/11305410
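The paper describes the reward-shaping layer only at a high level. Below is a minimal, hypothetical sketch of the idea, assuming an online logistic classifier over transition features and a simple penalty term; the `DomainClassifier` class, the feature layout, and `penalty_weight` are illustrative choices, not the authors' implementation.

```python
import numpy as np

# Hypothetical sketch: an online domain classifier labels transitions as coming
# from the simulator (0) or the target grid (1); its predicted target-domain
# likelihood scales a penalty subtracted from the simulator reward during training.

class DomainClassifier:
    """Online logistic regression over transition features (illustrative)."""

    def __init__(self, n_features, lr=0.01):
        self.w = np.zeros(n_features)
        self.b = 0.0
        self.lr = lr

    def _sigmoid(self, z):
        return 1.0 / (1.0 + np.exp(-z))

    def predict_proba(self, x):
        """Probability that a transition comes from the target domain."""
        return self._sigmoid(x @ self.w + self.b)

    def update(self, x, label):
        """One SGD step on the binary cross-entropy loss for a single transition."""
        p = self.predict_proba(x)
        grad = p - label            # derivative of BCE w.r.t. the logit
        self.w -= self.lr * grad * x
        self.b -= self.lr * grad


def shaped_reward(base_reward, transition_features, clf, penalty_weight=1.0):
    """Penalize transitions the classifier considers unlikely on the real grid."""
    p_target = clf.predict_proba(transition_features)
    return base_reward - penalty_weight * (1.0 - p_target)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clf = DomainClassifier(n_features=4)

    # Toy data: simulator transitions ignore battery losses (label 0),
    # target transitions reflect lossy, slightly biased measurements (label 1).
    for _ in range(2000):
        sim = rng.normal([1.0, 0.0, 1.0, 0.0], 0.1)
        real = rng.normal([0.8, 0.1, 0.9, 0.05], 0.1)
        clf.update(sim, 0.0)
        clf.update(real, 1.0)

    # Reward shaping applied to a candidate transition during simulator training.
    candidate = rng.normal([1.0, 0.0, 1.0, 0.0], 0.1)
    print("shaped reward:", shaped_reward(-0.5, candidate, clf))
```

In the paper the classifier is trained on the run alongside the RL agent, and the second layer (the recurrent denoising autoencoder) cleans the observations before they reach the policy; both of those components are omitted from this sketch for brevity.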