Memory Gym: Partially Observable Challenges to
Memory-Based Agents in Endless Episodes

Memory Gym features the environments Mortar Mayhem, Mystery Path, and Searing Spotlights. These environments benchmark an agent's ability to memorize events across long sequences, generalize, and be robust to noise.

Notably, these environments feature endless task variants: as the agent's policy improves, the task goes on. The travel game "I packed my bag ...", in which players recite an ever-growing list of items, inspired this dynamic concept, which allows for examining levels of effectiveness instead of just sample efficiency.
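
The endless dynamic can be illustrated with a small toy sketch. It does not use the Memory Gym API: the function `run_endless_episode`, its parameters, and the bounded-memory "agent" are hypothetical and serve only to show how a task that grows by one item per round turns episode length into a measure of effectiveness.

```python
import random
from collections import deque


def run_endless_episode(capacity, num_items=5, seed=0):
    """Toy "I packed my bag" loop (illustrative only, not Memory Gym code).

    Each round appends one new item to the sequence that must be recalled.
    The hypothetical agent remembers only its `capacity` most recent items.
    Returns the number of rounds completed before the first recall failure.
    """
    rng = random.Random(seed)
    memory = deque(maxlen=capacity)  # bounded memory of the toy agent
    sequence = []                    # ground-truth list that keeps growing

    while True:
        new_item = rng.randrange(num_items)
        sequence.append(new_item)    # the task extends itself every round
        memory.append(new_item)      # oldest item is dropped once memory is full

        # The round succeeds only if the agent can still recall the full list.
        if list(memory) != sequence:
            return len(sequence) - 1
```

In this sketch an agent with more memory simply lasts more rounds, so comparing `run_endless_episode(3)` with `run_endless_episode(10)` compares how far each agent gets rather than how quickly it learns.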

GitHub

Memory Gym
neroRL (Deep Reinforcement Learning Framework)
Recurrent Proximal Policy Optimization using Truncated BPTT
Transformer-XL as Episodic Memory in Proximal Policy Optimization

Citation

@inproceedings{pleines2023memory,
     title={Memory Gym: Partially Observable Challenges to Memory-Based Agents},
     author={Marco Pleines and Matthias Pallasch and Frank Zimmer and Mike Preuss},
     booktitle={International Conference on Learning Representations},
     year={2023},
     url={https://openreview.net/forum?id=jHc8dCx6DDr}
}

Interactive Result Videos of Trained Agent Behaviors

For each environment, we cherry-picked GRU and TrXL agent behaviors for visualization.
For every chosen agent, three episodes were rendered using level seeds not seen during training.
Clicking on individual data points in the line plots or attention scores takes you to the corresponding episode step.
Each of the upcoming result pages features various visualization tools.