Hack Monty 2: a $10,000 bounty to break our Python sandbox

Round 1 of our Pydantic Monty bounty lasted less than 48 hours before someone broke out. We've patched the bug, re-audited the unsafe code around it, and we'd like you to have another go. This time for twice the money, and not on our own.

If you missed it: the first Hack Monty ran in April 2026 with a $5,000 bounty. The numbers were quite good:

  • ~1.5 million POST requests to the Hack Monty API
  • ~650 unique IP addresses
  • 65 submissions to the bounty form

Two people earned bounties:

  • Owen Kwan from Veria Labs took the full $5,000 with a remote code execution that exfiltrated the SECRET environment variable.
  • Stanislav Fort from AISLE found a weakness in Hack Monty's virtual filesystem - not enough to read the secret, but enough that we paid out a supplementary $300.

The RCE was a use-after-free triggered via list.sort and a missing GC root. If you want the gory details, David Hewitt wrote a full postmortem - it's worth reading before you start round 2.

Since round 1 we've:

  • Patched the underlying vulnerability in Monty v0.0.16 by extending the GC's root set to cover every object the relevant unsafe code depended on.
  • Re-audited the (small) set of unsafe blocks in Monty.
  • Hardened Hack Monty's virtual filesystem against the patterns Stanislav reported.

We're not claiming Monty is now bug-free. We expect minor logical bugs to keep turning up, and one and a half million requests doesn't exhaustively cover the attack surface. But the only way to find out what we missed is to put Monty back in front of you.

We're not the only team using Monty for safe AI code execution. For round 2 both Prefect and Hugging Face are sponsoring the bounty prize with us.

Prefect makes FastMCP. We're co-sponsoring Hack Monty because FastMCP 3.1's CodeMode — which lets agents discover and chain tools in Python instead of burning context on full catalogs — runs on Pydantic's Monty by default. We chose Monty because it's lightweight and runs embedded: no extra infrastructure, microsecond startup. Embedded also means a sandbox escape is a host compromise. Funding half of round 2's bounty is the cheapest way to get serious researchers pressure-testing it.

At Hugging Face we’re building the infrastructure for your agents. We use the fast, embeddable code execution sandbox from Pydantic Monty to simply - and safely - run model-generated code in our MCP Server. We’re funding the Round 2 bounty to promote researching and solving security issues in the foundational components of the future.

Short version: Monty is a minimal Python interpreter, written in Rust, that applications can use to run AI-generated code right on the host. It starts from nothing and only exposes the functions you explicitly hand it - the opposite of the usual "container with a blacklist" approach. Startup is measured in microseconds, not seconds, and it can snapshot mid-execution so the host can resolve an external call and resume later.
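To make the "only exposes what you hand it" and "pause, resolve, resume" ideas concrete, here is a toy model in plain Python. It is not Monty's API: the names ExternalCall, guest_program and run are hypothetical, and a generator stands in for the sandboxed program. It also keeps its paused state in memory only, so it does not model snapshotting.

```python
from dataclasses import dataclass
from typing import Any, Callable, Generator


@dataclass
class ExternalCall:
    """A request from guest code for the host to run a named function."""
    name: str
    args: tuple


def guest_program() -> Generator[ExternalCall, Any, int]:
    # Stand-in for sandboxed code: each `yield` is a point where execution
    # pauses and control returns to the host.
    price = yield ExternalCall("get_price", ("widget",))
    qty = yield ExternalCall("get_quantity", ("widget",))
    return price * qty


def run(guest: Generator[ExternalCall, Any, Any],
        allowed: dict[str, Callable[..., Any]]) -> Any:
    """Drive the guest, resolving only calls the host explicitly allows."""
    try:
        call = next(guest)  # run until the first external call
        while True:
            if call.name not in allowed:
                raise PermissionError(f"{call.name!r} was not exposed to the guest")
            result = allowed[call.name](*call.args)
            call = guest.send(result)  # resume the guest with the result
    except StopIteration as done:
        return done.value  # the guest's return value


# The host decides up front which functions exist from the guest's view.
host_functions = {
    "get_price": lambda item: 3,
    "get_quantity": lambda item: 14,
}
total = run(guest_program(), host_functions)
print(total)  # prints 42
```

Anything missing from the allowlist raises instead of falling through to the host, which is the allowlist-by-default posture described above.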

If you want the longer version:

The honeypot still lives at hackmonty.com.

You can POST Python code to /run/ and it'll be run on the server with no sandbox beyond Monty itself.

curl -X POST https://hackmonty.com/run/ \
  -H 'content-type: application/json' \
  -d '{"code": "print(1 + 1)"}'
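If you would rather script this from Python than from the shell, the same request can be built with the standard library alone. This is a minimal sketch: build_run_request and send are hypothetical helper names, and send returns the raw response text because the response schema is documented in the API docs rather than here.

```python
import json
import urllib.request

RUN_URL = "https://hackmonty.com/run/"  # endpoint from the curl example


def build_run_request(code: str, url: str = RUN_URL) -> urllib.request.Request:
    """Build the same POST as the curl command, without sending it."""
    body = json.dumps({"code": code}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the request does not touch the network; call send(request) to do so.
request = build_run_request("print(1 + 1)")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Using json.dumps for the body avoids the quoting problems that appear once your payload contains quotes or newlines, which is awkward to get right inside a shell string.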

You should try to read either /app/secret.txt or the SECRET environment variable.

The full request/response schemas live in the Swagger docs and Redoc docs.

Doing the pause/resume dance by hand is tedious. The same small CLI from round 1 takes a Python file (plus an optional "globals" file that supplies the external functions the sandbox will ask for) and drives the whole loop for you:

hackmonty CLI gist

uv run hackmonty.py --script my_attack.py --globals my_helpers.py

It distinguishes syntax vs. type vs. runtime errors, handles async calls, and prints each external call and name lookup it resolves - much nicer than hand-rolling curl loops.

To let us confirm it was really you who found the secret, you may want to include a User header on your requests. The header value should be the SHA-256 hex digest of some unique secret only you know (a random string, passphrase, UUID, anything you like). Keep the plaintext to yourself until you report the find; we'll check that its SHA-256 matches a User header we recorded from the winning run and that nobody else beat you to it.

USER_HASH=$(printf 'my-secret-passphrase' | shasum -a 256 | awk '{print $1}')
curl -X POST https://hackmonty.com/run/ \
  -H 'content-type: application/json' \
  -H "User: $USER_HASH" \
  -d '{"code": "print(1 + 1)"}'
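The same digest can be computed in Python with hashlib. The sketch below assumes the secret is hashed as UTF-8 bytes with no trailing newline, which matches the printf pipeline above; user_header_value is a hypothetical helper name.

```python
import hashlib


def user_header_value(secret: str) -> str:
    """Return the SHA-256 hex digest of the secret's UTF-8 bytes."""
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()


digest = user_header_value("my-secret-passphrase")
headers = {"content-type": "application/json", "User": digest}

# A trailing newline (which `echo` would add, unlike `printf`) changes the
# digest, so hash exactly the bytes you intend to reveal later.
assert digest != user_header_value("my-secret-passphrase\n")
print(len(digest))  # prints 64
```

Whichever tool you use, keep to one way of producing the digest so that the plaintext you report later hashes to the value that was recorded.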

The CLI accepts a --user-secret flag that takes your plaintext secret and sends its SHA-256 as the header on every request:

uv run hackmonty.py --user-secret 'my-secret-passphrase' --code 'print(1 + 1)'

We've kept the Logfire project behind hackmonty.com open so you can see traces of your own requests (and everyone else's) as they flow through the server: join the Logfire project.

NOTE: this means that if you make requests to hackmonty.com/run/*, everything you submit, along with your IP address and request headers, is visible to anyone looking at the traces.

Note: the server code for hackmonty.com itself is closed source - the bounty is about Monty, not our FastAPI wrapper. Logfire gives you enough visibility to understand what the sandbox is doing without needing the source.

Read the rules BEFORE you start.

If you try to plant a vulnerability by changing the code of Monty or any library in its dependency tree, or run agents that try to do this, we will block you and report you as a malicious actor. If we find that a PR has been merged anywhere in the dependency tree that introduces a vulnerability related to this bounty, we'll stop the bounty program.

We'd love contributions from any developer, anywhere - but we can only pay the bounty if you have a bank account in a country approved by GitHub Sponsors, and our bank (Mercury) is able to make a transfer to it. If you find something, and we can't legally pay you, we'll still credit you publicly and sort out some swag. But please check the list before you spend a week on this expecting a cheque.

(To be clear, we're not using GitHub Sponsors as a means of payment; we're just piggybacking off Microsoft's checks on which countries a US company can legally make payments to.)

  • We will pay you for finding the file or environment variable secret by exploiting a security flaw or vulnerability in Pydantic Monty. You will need to show us the code and technique you used, as well as the secret you extracted.
  • The full bounty is jointly funded by Pydantic, Prefect, and Hugging Face. You get one payment from us; we sort out the rest with Prefect and Hugging Face.
  • A security flaw in this app - a mistake in our server config or code that lets you read the secret.
  • A security flaw or vulnerability elsewhere in the dependency tree (Pydantic Validation, Starlette, Uvicorn, PyO3, and so on) that lets you read the secret.
  • A security flaw in Pydantic Logfire where it instruments this app - doesn't have to leak the secret; if you find a vulnerability or access to info that shouldn't be visible, tell us.
  • A vulnerability in Pydantic Monty that gives you access to or control of the host, but doesn't let you read the secrets (Rust tracebacks, OS details, binary path, network access, reading or writing files you shouldn't have access to, and so on).
  • Crashing Monty with malicious code - panic, stack overflow, segfault, unbounded memory or CPU usage. Report the issue with the code you used; we'll buy you a drink or a t-shirt if we run into you at a conference. Not part of the bounty program for now.
  • Bugs or CPython compatibility issues in Monty - please open an issue; not part of the bounty program.
  • Bugs or vulnerabilities elsewhere in the dependency tree - check whether the issue is new and file it with the upstream project; also not part of this program.
  • Finding the secret or a vulnerability by changing the code in any library - see the first rule.
  • "Spear phishing" the Pydantic, Prefect or Hugging Face teams or any similar social engineering attack.
  • Finding a security flaw in Render, where this app is deployed - report that to Render directly.
  • DoS'ing the app, or otherwise making it unresponsive.
  • DoS'ing or otherwise causing a service interruption in any other Pydantic, Prefect, or Hugging Face service.

If you've found a vulnerability that you think is worth paying the bounty for, please submit it in this form.

If you want to talk to us, join our community in Slack and send us a message in the #monty channel.


Have fun. Break it harder.