Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents

Memory Gym features the environments Mortar Mayhem, Mystery Path, and Searing Spotlights. These environments benchmark an agent's memory capabilities.

Notably, these environments feature endless task variants: as long as the agent's policy keeps improving, the task goes on. The traveling game "I packed my bag ..." inspired this dynamic concept, which allows for examining levels of effectiveness rather than just sample efficiency.
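To make the "I packed my bag ..." dynamic concrete, here is a minimal toy sketch. It is not the Memory Gym API: the function `play_packed_my_bag`, the `recall_capacity` parameter, and the item names are hypothetical and exist only for illustration. Each round appends one item to the list, and the episode ends at the first recall error, so the score reflects how far the agent's memory reaches.

```python
import random


def play_packed_my_bag(recall_capacity, items=("map", "torch", "rope", "flask"),
                       max_rounds=1000, seed=0):
    """Toy round loop: one item is added per round, then the full list must be recalled.

    The hypothetical agent retains only the most recent `recall_capacity` items.
    Returns the number of rounds completed before the first recall error.
    """
    rng = random.Random(seed)
    bag = []
    for completed in range(max_rounds):
        bag.append(rng.choice(items))
        # Hypothetical agent: recalls at most the last `recall_capacity` items.
        recalled = bag[-recall_capacity:] if recall_capacity > 0 else []
        if recalled != bag:
            return completed  # the episode ends at the first mistake
    return max_rounds  # cap reached; a truly endless task has no such cap


if __name__ == "__main__":
    for capacity in (0, 3, 10):
        print(capacity, play_packed_my_bag(capacity))
```

In this toy setting, a stronger memory yields a longer episode rather than the same goal reached sooner, which is the sense in which an endless task measures effectiveness instead of sample efficiency.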


Citation

@inproceedings{pleines2023memory,
     title={Memory Gym: Partially Observable Challenges to Memory-Based Agents},
     author={Marco Pleines and Matthias Pallasch and Frank Zimmer and Mike Preuss},
     booktitle={International Conference on Learning Representations},
     year={2023},
     url={https://openreview.net/forum?id=jHc8dCx6DDr}
}

Interactive Result Videos of Trained Agent Behaviors

For each environment, we cherry-picked Gated Recurrent Unit (GRU) and Transformer-XL (TrXL) agent behaviors for visualization.
For every chosen agent, three episodes were rendered using level seeds not seen during training.
Each of the result pages below features various visualization tools.
Clicking on individual data points in the line plots or attention scores takes you to the corresponding episode step.


Transformer-XL + Reconstruction


Gated Recurrent Unit


Gated Recurrent Unit + Reconstruction


Transformer-XL + Relative Positional Encoding

Transformer-XL + Ground Truth

Transformer-XL + QPos

Transformer-XL + Ground Truth + QPos

Gated Recurrent Unit


Transformer-XL + Reconstruction

Gated Recurrent Unit + Reconstruction


Transformer-XL + Reconstruction

Gated Recurrent Unit + Reconstruction