
Hello! I am a researcher at Imbue where I work on building machine learning agents and software that help humans code.
Previously, I completed my PhD at the Australian National University, advised by Hanna Kurniawati. My thesis focused on building practical agents for partially observable, multi-agent environments by combining planning and reinforcement learning.
When I'm not writing code or running experiments, I spend my time rock climbing and being outside in nature.
Selected Publications (see all)
Towards Scalable Planning in Partially Observable, Multi-Agent Environments
Jonathon Schwartz
PhD Thesis (2025)
POSGGym: A Library for Decision-Theoretic Planning and Learning in Partially Observable, Multi-Agent Environments
Jonathon Schwartz, Rhys Newbury, Dana Kulić, Hanna Kurniawati
Journal of Autonomous Agents and Multi-Agent Systems (JAAMAS) (2025)
Combining a Meta-Policy and Monte-Carlo Planning for Scalable Type-Based Reasoning in Partially Observable Environments
Jonathon Schwartz, Hanna Kurniawati, Marcus Hutter
International Conference on Autonomous Agents and Multiagent Systems (AAMAS) (2023)
Online Planning for Interactive-POMDPs using Nested Monte Carlo Tree Search
Jonathon Schwartz, Ruijia Zhou, Hanna Kurniawati
International Conference on Intelligent Robots and Systems (IROS) (2022)
POMDP+ Information-Decay: Incorporating Defender's Behaviour in Autonomous Penetration Testing
Jonathon Schwartz, Hanna Kurniawati, Edwin El-Mahassni
International Conference on Automated Planning and Scheduling (ICAPS) (2020)
CybORG: An Autonomous Cyber Operations Research Gym
Callum Baillie, Maxwell Standen, Jonathon Schwartz, Michael Docking, David Bowman, Junae Kim
arXiv preprint (2020)
Open Source Projects
POSGGym: A collection of environments and reference agents for planning and reinforcement learning research in partially observable, multi-agent environments. The related POSGGym-Baselines project contains baseline implementations of planning and reinforcement learning algorithms for POSGGym environments.
Network Attack Simulator: A reinforcement learning environment for training autonomous network penetration testing agents. It simulates attack scenarios involving different network topologies, vulnerabilities, scans, and exploits.
miniDRL: Minimal implementations of distributed, recurrent, deep reinforcement learning algorithms (PPO, R2D2). Distributed RL, especially recurrent RL, gets pretty complex fast, so this project contains easy-to-follow, stand-alone implementations of some distributed RL algorithms.
Blog
July 13, 2025