The documentation for this library is very limited, and it is hard to work out how it should be applied to different applications. For instance, many RL algorithms such as PPO and MPO employ an entropy regularization term or a KL constraint between policies to stabilise the standard RL objective. Is it possible to give a practical example of how one can use this library to solve a dual problem with Lagrange multipliers for such applications?
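To make the request concrete, here is a library-free sketch of the kind of dual problem I have in mind. It uses none of this library's API, and the names `tilted_policy`, `kl` and `solve_dual` are my own placeholders. The toy problem is: maximise the expected advantage over a discrete policy subject to KL(pi || pi_old) <= eps. For a fixed multiplier eta the maximiser is pi(a) proportional to pi_old(a) * exp(A(a) / eta), and the dual g(eta) = eta * eps + eta * log sum_a pi_old(a) exp(A(a) / eta) has gradient eps - KL(pi_eta || pi_old), so eta can be found by projected gradient descent. As far as I understand, this is similar in spirit to the temperature dual in MPO's E-step.

```python
import math

def tilted_policy(pi_old, adv, eta):
    """Return pi(a) proportional to pi_old(a) * exp(adv(a) / eta)."""
    m = max(adv)  # shift by the max advantage for numerical stability
    w = [p * math.exp((a - m) / eta) for p, a in zip(pi_old, adv)]
    z = sum(w)
    return [x / z for x in w]

def kl(p, q):
    """KL(p || q) for two discrete distributions on the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def solve_dual(pi_old, adv, eps, eta=1.0, lr=1.0, steps=2000):
    """Minimise the dual g(eta) by projected gradient descent on eta."""
    for _ in range(steps):
        pi = tilted_policy(pi_old, adv, eta)
        grad = eps - kl(pi, pi_old)        # dg/d(eta)
        eta = max(eta - lr * grad, 1e-3)   # keep the multiplier positive
    return eta, tilted_policy(pi_old, adv, eta)

# Toy data (made up for illustration): 4 actions, uniform old policy.
pi_old = [0.25, 0.25, 0.25, 0.25]
adv = [1.0, 0.5, 0.0, -0.5]
eta, pi_new = solve_dual(pi_old, adv, eps=0.05)
print(eta, pi_new, kl(pi_new, pi_old))
```

At the solution the KL constraint is active, so the printed KL should sit at eps. What I would like to see is the equivalent formulation (objective, constraint, multiplier update) written with this library.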
Many thanks.