Memory Gym Experiments

Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents

Memory Gym features the environments Mortar Mayhem, Mystery Path, and Searing Spotlights. These environments benchmark an agent's ability to

memorize events across long sequences,
generalize,
and be robust to noise.

Notably, these environments also feature endless task variants: as the agent's policy improves, the task keeps going. The travel game "I packed my bag ...", in which each player repeats a growing list of items and then adds one more, inspired this dynamic concept. It allows for examining levels of effectiveness instead of just sample efficiency.
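To make the endless idea concrete, here is a minimal toy sketch of an "I packed my bag ..." style loop. It is not Memory Gym's implementation: the function name `run_endless_episode`, the symbol alphabet, and the hypothetical agent that recalls only its last `recall_capacity` items are all assumptions made for illustration. The point is only that the episode length itself becomes the measure of how effective the agent's memory is.

```python
import random


def run_endless_episode(recall_capacity, num_symbols=9, seed=0, max_rounds=10_000):
    """Toy endless task: every round appends one new symbol, and the agent
    must reproduce the whole sequence so far. Returns the rounds completed."""
    rng = random.Random(seed)
    sequence = []
    rounds_completed = 0
    while rounds_completed < max_rounds:
        # The task grows by one item per round, like the travel game.
        sequence.append(rng.randrange(num_symbols))
        # Hypothetical agent (assumption): it perfectly recalls only the
        # most recent `recall_capacity` items and forgets everything older.
        recalled = sequence[-recall_capacity:] if recall_capacity > 0 else []
        if recalled != sequence:
            break  # first recall error terminates the episode
        rounds_completed += 1
    return rounds_completed


if __name__ == "__main__":
    for capacity in (0, 3, 8):
        print(capacity, run_endless_episode(capacity))
```

In this sketch, a better memory does not finish the task sooner; it simply survives more rounds, so agents can be compared by how far they get rather than by how quickly they reach a fixed goal.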

@article{JMLR:v26:24-0043,
  title   = {Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents},
  author  = {Marco Pleines and Matthias Pallasch and Frank Zimmer and Mike Preuss},
  journal = {Journal of Machine Learning Research},
  year    = {2025},
  volume  = {26},
  number  = {6},
  pages   = {1--40},
  url     = {https://www.jmlr.org/papers/v26/24-0043.html}
}

Interactive Result Videos of Trained Agent Behaviors

For each environment, we cherry-picked GRU and TrXL agent behaviors for visualization.
For every chosen agent, three episodes were rendered using level seeds not seen during training.
Each of the following result pages features various visualization tools.
Clicking on individual data points in the line plots or attention scores takes you to the corresponding episode step.