Hack Monty 2: Win $10,000 breaking Pydantic's Python Sandbox

Round 1 of our Pydantic Monty bounty lasted less than 48 hours before someone broke out. We've patched the bug, re-audited the unsafe code around it, and we'd like you to have another go. This time for twice the money, and not on our own.

What happened in round 1

If you missed it: the first Hack Monty ran in April 2026 with a $5,000 bounty. The numbers were quite good:

~1.5 million POST requests to the Hack Monty API
~650 unique IP addresses
65 submissions to the bounty form

Two people earned bounties:

Owen Kwan from Veria Labs took the full $5,000 with a remote code execution that exfiltrated the SECRET environment variable.
Stanislav Fort from AISLE found a weakness in Hack Monty's virtual filesystem - not enough to read the secret, but enough that we paid out a supplementary $300.

The RCE was a use-after-free triggered via list.sort and a missing GC root. If you want the gory details, David Hewitt wrote a full postmortem - it's worth reading before you start round 2.

What's changed for round 2

Since round 1 we've:

Patched the underlying vulnerability in Monty v0.0.16 by extending the GC's root set to cover every object the relevant unsafe code depended on.
Re-audited the (small) set of unsafe blocks in Monty.
Hardened Hack Monty's virtual filesystem against the patterns Stanislav reported.
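To make the "missing GC root" idea concrete, here is a toy mark-and-sweep sketch. It is not Monty's collector, and the object names (`sort_buffer`, `item`, and so on) are hypothetical, chosen only for illustration. It shows the general failure mode: an object that is still in use, but not reachable from the root set, gets freed.

```python
def mark(roots: set[str], edges: dict[str, list[str]]) -> set[str]:
    """Return every object id reachable from the root set."""
    live: set[str] = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in live:
            continue
        live.add(obj)
        stack.extend(edges.get(obj, []))
    return live


def sweep(heap: set[str], live: set[str]) -> set[str]:
    """Return the objects a collector would free."""
    return heap - live


# Hypothetical heap: "sort_buffer" is a temporary that only native code holds on to.
heap = {"globals", "my_list", "sort_buffer", "item"}
edges = {"globals": ["my_list"], "sort_buffer": ["item"]}

# Without the temporary in the root set, it and everything it references are freed
# while still in use: the setup for a use-after-free.
freed_without_root = sweep(heap, mark({"globals"}, edges))

# Adding the temporary to the root set keeps both objects alive.
freed_with_root = sweep(heap, mark({"globals", "sort_buffer"}, edges))
```

In this toy model, extending the root set is the whole fix; in a real interpreter the hard part is finding every temporary that unsafe code relies on.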

We're not claiming Monty is now bug-free. We expect minor logical bugs to keep turning up, and one and a half million requests doesn't exhaustively cover the attack surface. But the only way to find out what we missed is to put Monty back in front of you.

Collab

We're not the only team using Monty for safe AI code execution. For round 2 both Prefect and Hugging Face are sponsoring the bounty prize with us.

Prefect

Prefect makes FastMCP. We're co-sponsoring Hack Monty because FastMCP 3.1's CodeMode — which lets agents discover and chain tools in Python instead of burning context on full catalogs — runs on Pydantic's Monty by default. We chose Monty because it's lightweight and runs embedded: no extra infrastructure, microsecond startup. Embedded also means a sandbox escape is a host compromise. Funding half of round 2's bounty is the cheapest way to get serious researchers pressure-testing it.

Hugging Face

At Hugging Face we’re building the infrastructure for your agents. We use the fast, embeddable code execution sandbox from Pydantic Monty to simply - and safely - run model-generated code in our MCP Server. We’re funding the round 2 bounty to encourage research into, and fixes for, security issues in the foundational components of the future.

What's Monty, again?

Short version: Monty is a minimal Python interpreter, written in Rust, that applications can use to run AI-generated code right on the host. It starts from nothing and only exposes the functions you explicitly hand it - the opposite of the usual "container with a blacklist" approach. Startup is measured in microseconds, not seconds, and it can snapshot mid-execution so the host can resolve an external call and resume later.

If you want the longer version:

The repo and README: github.com/pydantic/monty
The original blog post: Pydantic Monty
A talk I gave in March about Monty at PyAI: Monty, from tool calling to computer use
The round 1 announcement: Hack Monty
The round 1 postmortem: Hack Monty - Postmortem
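The "pause, resolve an external call, resume" flow described above can be sketched with a plain Python generator. This is a conceptual illustration only: it does not use Monty's API, and `guest_program`, `run_with_host` and the `(name, args)` request shape are all hypothetical. A generator also cannot be serialised the way a real snapshot could; it only stands in for the control flow.

```python
from typing import Any, Callable, Generator

# Hypothetical request shape: (function_name, positional_args)
ExternalCall = tuple[str, tuple[Any, ...]]


def guest_program() -> Generator[ExternalCall, Any, int]:
    """Stands in for sandboxed code; each `yield` is a pause point."""
    a = yield ("add", (1, 2))
    b = yield ("add", (a, 10))
    return b


def run_with_host(
    program: Generator[ExternalCall, Any, Any],
    allowed: dict[str, Callable[..., Any]],
) -> Any:
    """Drive the guest, resolving each external call from an explicit allowlist."""
    try:
        request = next(program)  # run until the first pause
        while True:
            name, args = request
            if name not in allowed:
                # The guest only gets functions the host explicitly handed over.
                raise PermissionError(f"external function {name!r} was not provided")
            request = program.send(allowed[name](*args))  # resume with the result
    except StopIteration as finished:
        return finished.value


result = run_with_host(guest_program(), {"add": lambda x, y: x + y})
```

The design point is that the guest never calls host functions directly: it can only ask, and the host decides whether a function with that name exists at all.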

How to play

The honeypot still lives at hackmonty.com.

You can POST Python code to /run/ and it'll be run on the server with no sandbox beyond Monty itself.

curl -X POST https://hackmonty.com/run/ \
  -H 'content-type: application/json' \
  -d '{"code": "print(1 + 1)"}'
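If you would rather script requests from Python, the curl call above can be mirrored with the standard library. This is a minimal sketch: the helper names `build_run_request` and `post_code` are our own, and the response is returned as raw text because its schema is documented in the Swagger and Redoc docs, not here.

```python
import json
import urllib.request

RUN_URL = "https://hackmonty.com/run/"  # endpoint taken from the curl example


def build_run_request(code: str, url: str = RUN_URL) -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl example."""
    body = json.dumps({"code": code}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_code(code: str, timeout: float = 30.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(build_run_request(code), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `post_code("print(1 + 1)")` sends the request; building and sending are kept separate so the payload can be inspected without touching the network.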

You should try to read either /app/secret.txt or the SECRET environment variable.

The full request/response schemas live in the Swagger docs and Redoc docs.

Do it the easy way: the CLI

Doing the pause/resume dance by hand is tedious. The same small CLI from round 1 takes a Python file (plus an optional "globals" file that supplies the external functions the sandbox will ask for) and drives the whole loop for you:

hackmonty CLI gist

uv run hackmonty.py --script my_attack.py --globals my_helpers.py

It distinguishes syntax vs. type vs. runtime errors, handles async calls, and prints each external call and name lookup it resolves - much nicer than hand-rolling curl loops.

Request secret

To let us confirm it was really you who found the secret, you may want to include a User header on your requests. The header value should be the SHA-256 hex digest of some unique secret only you know (a random string, passphrase, UUID, anything you like). Keep the plaintext to yourself until you report the find; we'll check that its SHA-256 matches a User header we recorded from the winning run and that nobody else beat you to it.

USER_HASH=$(printf 'my-secret-passphrase' | shasum -a 256 | awk '{print $1}')
curl -X POST https://hackmonty.com/run/ \
  -H 'content-type: application/json' \
  -H "User: $USER_HASH" \
  -d '{"code": "print(1 + 1)"}'

The CLI accepts a --user-secret flag that takes your plaintext secret and sends its SHA-256 as the header on every request:

uv run hackmonty.py --user-secret 'my-secret-passphrase' --code 'print(1 + 1)'
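The header value can also be computed in Python with `hashlib`, which is handy for checking that your shell produced the digest you expect. The function names here are our own, and `matches_recorded_header` only illustrates the kind of comparison described above, not the organisers' actual verification code.

```python
import hashlib
import hmac


def user_header_value(secret: str) -> str:
    """Return the SHA-256 hex digest of the plaintext secret (UTF-8, no trailing newline)."""
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()


def matches_recorded_header(secret: str, recorded_header: str) -> bool:
    """Check a revealed plaintext against a previously recorded User header value."""
    return hmac.compare_digest(user_header_value(secret), recorded_header.lower())


digest = user_header_value("my-secret-passphrase")
headers = {"content-type": "application/json", "User": digest}
```

One thing to watch: `printf` in the shell example writes no trailing newline, whereas `echo` usually does, and a trailing newline changes the digest. Hash exactly the bytes you intend to reveal later.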

Watch what's happening server-side

We've kept the Logfire project behind hackmonty.com open so you can see traces of your own requests (and everyone else's) as they flow through the server: join the Logfire project.

NOTE: this means that if you make requests to hackmonty.com/run/*, all the data you submit, along with your IP address and request headers, is visible to anyone looking at the traces.

Note: the server code for hackmonty.com itself is closed source - the bounty is about Monty, not our FastAPI wrapper. Logfire gives you enough visibility to understand what the sandbox is doing without needing the source.

Rules

Read the rules BEFORE you start.

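Do not try to win by changing the code of Monty or any library in its dependency tree, for example by submitting a pull request that introduces a vulnerability. If you do this, or run agents that try to do this, we will block you and report you as a malicious actor. If we find that a PR has been merged anywhere in the dependency tree that introduces a vulnerability related to this bounty, we'll stop the bounty program.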

Who we can pay

We'd love contributions from any developer, anywhere - but we can only pay the bounty if you have a bank account in a country approved by GitHub Sponsors, and our bank (Mercury) is able to make a transfer to it. If you find something, and we can't legally pay you, we'll still credit you publicly and sort out some swag. But please check the list before you spend a week on this expecting a cheque.

(To be clear, we're not using GitHub Sponsors as a means of payment; we're just piggybacking off Microsoft's checks on which countries a US company can legally make payments to.)

Full bounty ($10,000 USD)

We will pay you for finding the file or environment variable secret by exploiting a security flaw or vulnerability in Pydantic Monty. You will need to show us the code and technique you used, as well as the secret you extracted.
The full bounty is jointly funded by Pydantic, Prefect, and Hugging Face. You get one payment from us; we sort out the rest with Prefect and Hugging Face.

Partial bounty (amount at our discretion)

A security flaw in this app - a mistake in our server config or code that lets you read the secret.
A security flaw or vulnerability elsewhere in the dependency tree (Pydantic Validation, Starlette, Uvicorn, PyO3, and so on) that lets you read the secret.
A security flaw in Pydantic Logfire where it instruments this app - doesn't have to leak the secret; if you find a vulnerability or access to info that shouldn't be visible, tell us.
A vulnerability in Pydantic Monty that gives you access to or control of the host, but doesn't let you read the secrets (Rust tracebacks, OS details, binary path, network access, reading or writing files you shouldn't have access to, and so on).

No bounty, but we'd still love to hear about it

Crashing Monty with malicious code - panic, stack overflow, segfault, unbounded memory or CPU allocation. Report the issue with the code you used; we'll buy you a drink or a t-shirt if we run into you at a conference. Not part of the bounty program for now.
Bugs or CPython compatibility issues in Monty - please open an issue; not part of the bounty program.
Bugs or vulnerabilities elsewhere in the dependency tree - check whether the issue is new and file it with the upstream project; also not part of this program.

Please don't do any of these

Finding the secret or a vulnerability by changing the code in any library - see the first rule.
"Spear phishing" the Pydantic, Prefect or Hugging Face teams or any similar social engineering attack.
Finding a security flaw in Render, where this app is deployed - report that to Render directly.
DoS'ing the app, or otherwise making it unresponsive.
DoS'ing or otherwise causing a service interruption in any other Pydantic, Prefect, or Hugging Face service.

Reporting a find

If you've found a vulnerability that you think is worth paying the bounty for, please submit it in this form.

If you want to talk to us, join our community in Slack and send us a message in the #monty channel.

Have fun. Break it harder.
