Jonathon Schwartz

Hello! I am a researcher at Imbue where I work on building machine learning agents and software that help humans code.

Previously, I completed my PhD at the Australian National University, advised by Hanna Kurniawati. My thesis focused on building practical agents for partially observable, multi-agent environments by combining planning and reinforcement learning.

When I'm not writing code or running experiments, I spend my time rock climbing and being outside in nature.

Selected Publications (see all)

Towards Scalable Planning in Partially Observable, Multi-Agent Environments

Jonathon Schwartz

PhD Thesis (2025)

thesis

POSGGym: A Library for Decision-Theoretic Planning and Learning in Partially Observable, Multi-Agent Environments

Jonathon Schwartz, Rhys Newbury, Dana Kulić, Hanna Kurniawati

Journal of Autonomous Agents and Multi-Agent Systems (JAAMAS) (2025)

paper | workshop paper (ICAPS PRL workshop '24) | code

Combining a Meta-Policy and Monte-Carlo Planning for Scalable Type-Based Reasoning in Partially Observable Environments

Jonathon Schwartz, Hanna Kurniawati, Marcus Hutter

International Conference on Autonomous Agents and Multiagent Systems (AAMAS) (2023)

paper | extended abstract version | code

Online Planning for Interactive-POMDPs using Nested Monte Carlo Tree Search

Jonathon Schwartz, Ruijia Zhou, Hanna Kurniawati

International Conference on Intelligent Robots and Systems (IROS) (2022)

paper | code

POMDP+ Information-Decay: Incorporating Defender's Behaviour in Autonomous Penetration Testing

Jonathon Schwartz, Hanna Kurniawati, Edwin El-Mahassni

International Conference on Automated Planning and Scheduling (ICAPS) (2020)

paper

CybORG: An Autonomous Cyber Operations Research Gym

Callum Baillie, Maxwell Standen, Jonathon Schwartz, Michael Docking, David Bowman, Junae Kim

arXiv preprint (2020)

paper | extension (IJCAI '21) | code

Autonomous Penetration Testing using Reinforcement Learning

Jonathon Schwartz

Undergraduate Thesis (2019)

thesis | code

Open Source Projects

POSGGym A collection of environments and reference agents for planning and reinforcement learning research in partially observable, multi-agent environments. The related POSGGym-Baselines project contains baseline implementations of planning and reinforcement learning algorithms for POSGGym environments.

Network Attack Simulator Reinforcement learning environment for training autonomous network penetration testing agents. Simulates attack scenarios involving different network topologies, vulnerabilities, scans, and exploits.

miniDRL Minimal implementations of distributed, recurrent, deep reinforcement learning algorithms (PPO, R2D2). Distributed RL, especially recurrent RL, gets pretty complex fast, so this project contains easy-to-follow, stand-alone implementations of some distributed RL algorithms.

Blog

Training GPT2 from 650 to 3 million tokens per second

July 13, 2025