Imitation Learning using Generalized Sliced Wasserstein Distances
Jun 1, 2024
Ivan Ovinnikov
Abstract
Imitation learning methods train reinforcement learning policies by mimicking the state occupancies of a given expert agent. Most approaches are divergence-based, which can yield optimization objectives that are empirically brittle and difficult to solve. As an alternative, we explore an approach based on the sliced Wasserstein distance, leveraging its optimal-transport formulation and favorable computational properties. To do so, we formulate a per-state reward function based on the approximate differential of the sliced Wasserstein distance, which allows standard forward reinforcement learning methods to solve the imitation learning policy optimization problem. We demonstrate that the proposed method outperforms established imitation learning frameworks on several benchmark tasks from the MuJoCo robotic locomotion suite.
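The central quantity in the abstract is the sliced Wasserstein distance between agent and expert state occupancies. Below is a minimal NumPy sketch of the standard (linear-projection) sliced Wasserstein distance for equal-sized state sample sets; the generalized variant in the title replaces linear projections with nonlinear ones. The function name `sliced_wasserstein`, the projection count, and the sample shapes are illustrative assumptions, not the paper's implementation (the paper builds a per-state reward from the approximate differential of this distance).

```python
# Minimal sketch, assuming equal sample counts and NumPy only.
# Names and defaults here are illustrative, not the paper's code.
import numpy as np

def sliced_wasserstein(x, y, n_projections=50, p=2, rng=None):
    """Approximate SW_p between empirical measures given by the rows of x and y."""
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[1]
    # Sample random directions uniformly on the unit sphere in R^d.
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both sample sets onto each direction.
    x_proj = x @ theta.T  # shape (n, n_projections)
    y_proj = y @ theta.T
    # In 1D, optimal transport matches sorted samples (quantile coupling).
    x_sorted = np.sort(x_proj, axis=0)
    y_sorted = np.sort(y_proj, axis=0)
    # Average the 1D p-Wasserstein costs over samples and projections.
    return (np.abs(x_sorted - y_sorted) ** p).mean() ** (1.0 / p)

# Example: distance between agent and expert state samples
# (17 is a typical MuJoCo observation dimension, chosen for illustration).
rng = np.random.default_rng(0)
agent_states = rng.normal(size=(256, 17))
expert_states = rng.normal(loc=0.5, size=(256, 17))
print(sliced_wasserstein(agent_states, expert_states, rng=rng))
```

Because one-dimensional optimal transport reduces to matching sorted samples, each projection costs only a sort; this is the favorable computational property the abstract refers to.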