Hack Monty: Win $5000 breaking Pydantic's Python Sandbox

We think pydantic-monty is pretty hard to break out of, but if we're to be proven wrong, we'd like it to happen before we drop the "experimental" badge - so we're putting money on it.

Update 1:

A number of people found a "vulnerability". It did not allow the secrets to be read, but did allow you to write files to a fake filesystem and read them back on another request. Someone wrote to the fake /app/secret.txt file path, then a bunch of people read that file and thought they found the secret.

The root cause was that the OSAccess type (which I wrote) had such a bad interface that I misused it.

We've fixed the immediate problem in the hackmonty codebase, and we'll improve OSAccess soon.

We'll work out who first identified this and will pay a partial bounty!

Update 2:

The environment variable secret has been found.

We're working with the security researchers who identified the vulnerability to implement a comprehensive fix (as well as pay the bounty).

We'll get a fix out in Monty as soon as possible. A full explanation of the bug is published here: pydantic.dev/articles/hack-monty-postmortem

What's Monty, again?

Short version: Monty is a minimal Python interpreter, written in Rust, that applications can use to run AI-generated code right on the host. It starts from nothing and only exposes the functions you explicitly hand it - the opposite of the usual "container with a blacklist" approach. Startup is measured in microseconds, not seconds, and it can snapshot mid-execution so the host can resolve an external call and resume later.

If you want the longer version:

The repo and README: github.com/pydantic/monty
The original blog post: Pydantic Monty
A talk I gave in March about Monty at PyAI: Monty, from tool calling to computer use
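
To make the "only exposes what you hand it" and "pause, resolve, resume" ideas concrete, here is a toy model in plain Python. It is not Monty's API: the generator, the `run` function and the `allowed` dict are all made up for illustration, and a generator cannot be serialised the way a real snapshot can.

```python
from typing import Any, Callable, Dict, Generator, Tuple

# Toy model only, not Monty's API. A generator stands in for sandboxed code,
# and each `yield` stands in for the interpreter pausing on an external call.
ExternalCall = Tuple[str, Tuple[Any, ...]]


def sandboxed_program() -> Generator[ExternalCall, Any, int]:
    # The guest cannot call host functions directly; it can only request them.
    a = yield ("add", (1, 2))
    b = yield ("add", (a, 10))
    return b


def run(
    program: Generator[ExternalCall, Any, Any],
    allowed: Dict[str, Callable[..., Any]],
) -> Any:
    """Drive the program, resolving each requested call against an allowlist."""
    try:
        request = next(program)  # run until the first external call
        while True:
            name, args = request
            if name not in allowed:
                # Nothing is available unless the host explicitly exposed it.
                raise PermissionError(f"external function {name!r} was not exposed")
            # The host resolves the call, then resumes the guest with the result.
            request = program.send(allowed[name](*args))
    except StopIteration as done:
        return done.value
```

Calling `run(sandboxed_program(), {"add": lambda x, y: x + y})` returns 13, while passing an empty dict raises `PermissionError` on the first request.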

Why a bounty, why now?

Monty has been labelled "experimental" since day one. We think it's ready to lose that badge, and a bounty program where lots of developers try to find vulnerabilities should be a powerful way of testing that belief.

Either someone finds something and we fix it, or nobody does and we get to stop hedging in the README. Both outcomes are good.

How to play

The honeypot lives at hackmonty.com.

You can POST Python code to /run/ and it'll be run on the server with no sandbox beyond Monty itself.

curl -X POST https://hackmonty.com/run/ \
  -H 'content-type: application/json' \
  -d '{"code": "print(1 + 1)"}'

You should try to read either /app/secret.txt or the SECRET environment variable.

The full request/response schemas live in the Swagger docs and Redoc docs.
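
If you would rather script requests from Python than curl, the same POST can be sketched with the standard library. The function names here are made up for illustration, and because the response schema is only described in the API docs, this sketch returns the raw response body rather than guessing at its fields.

```python
import json
import urllib.request

RUN_URL = "https://hackmonty.com/run/"


def build_run_request(code: str) -> urllib.request.Request:
    """Build the same POST as the curl example: a JSON body with a "code" key."""
    body = json.dumps({"code": code}).encode("utf-8")
    return urllib.request.Request(
        RUN_URL,
        data=body,
        headers={"content-type": "application/json"},
        method="POST",
    )


def run_code(code: str) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(build_run_request(code), timeout=30) as response:
        return response.read().decode("utf-8")
```

For example, `print(run_code("print(1 + 1)"))` sends the same payload as the curl command above.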

Do it the easy way: the CLI

Doing the pause/resume dance by hand is tedious. I wrote a small CLI that takes a Python file (plus an optional "globals" file that supplies the external functions the sandbox will ask for) and drives the whole loop for you:

hackmonty CLI gist

uv run hackmonty.py --script my_attack.py --globals my_helpers.py

It distinguishes syntax vs. type vs. runtime errors, handles async calls, and prints each external call and name lookup it resolves - much nicer than hand-rolling curl loops.

Request secret

To let us confirm it was really you who found the secret, you may want to include a User header on your requests. The header value should be the SHA-256 hex digest of some unique secret only you know (a random string, passphrase, UUID — anything). Keep the plaintext to yourself until you report the find; we'll check that its SHA-256 matches a User header we recorded from the winning run and that nobody else beat you to it.

USER_HASH=$(printf 'my-secret-passphrase' | shasum -a 256 | awk '{print $1}')
curl -X POST https://hackmonty.com/run/ \
  -H 'content-type: application/json' \
  -H "User: $USER_HASH" \
  -d '{"code": "print(1 + 1)"}'

The CLI accepts a --user-secret flag that takes your plaintext secret and sends its SHA-256 as the header on every request:

uv run hackmonty.py --user-secret 'my-secret-passphrase' --code 'print(1 + 1)'
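
If you are building requests yourself, the header value can also be computed in Python. This is a minimal sketch; the function name is made up for illustration. It hashes the UTF-8 bytes of the secret with no trailing newline, which matches the `printf` pipeline above (using `echo` instead would append a newline and give a different digest).

```python
import hashlib


def user_header_value(secret: str) -> str:
    """Return the SHA-256 hex digest of the secret, for use as the User header."""
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()


# Keep the plaintext private; only the 64-character digest goes on the wire.
user_hash = user_header_value("my-secret-passphrase")
```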

Watch what's happening server-side

We've opened up the Logfire project behind hackmonty.com so you can see traces of your own requests (and everyone else's) as they flow through the server: join the Logfire project.

Note: this means that if you make requests to hackmonty.com/run/*, all the data you submit, as well as your IP address and any other headers, is visible to anyone looking at the traces.

Note: the server code for hackmonty.com itself is closed source - the bounty is about Monty, not our FastAPI wrapper. Logfire gives you enough visibility to understand what the sandbox is doing without needing the source.

Rules

Read the rules BEFORE you start.

Do not try to find the secret or a vulnerability by changing the code in any library. If you do this, or run agents that try to do this, we will block you and report you as a malicious actor. If we find that a PR has been merged anywhere in the dependency tree that introduces a vulnerability related to this bounty, we'll stop the bounty program.

Who we can pay

We'd love contributions from any developer, anywhere - but we can only pay the bounty if you have a bank account in a country approved by GitHub Sponsors, and our bank (Mercury) is able to make a transfer to it. If you find something and we can't legally pay you, we'll still credit you publicly and sort out some swag - but please check the list before you spend a week on this expecting a cheque.

(To be clear, we're not using GitHub Sponsors as a means of payment, we're just piggybacking off Microsoft's checks on which countries a US company can legally make payments to.)

Full bounty ($5,000 USD)

We will pay you for finding the file or environment variable secret by exploiting a security flaw or vulnerability in Pydantic Monty. You will need to show us the code and technique you used, as well as the secret you extracted.

Partial bounty (amount at our discretion)

A security flaw in this app - a mistake in our server config or code that lets you read the secret.
A security flaw or vulnerability elsewhere in the dependency tree (Pydantic validation, Starlette, Uvicorn, PyO3, …) that lets you read the secret.
A security flaw in Pydantic Logfire where it instruments this app - doesn't have to leak the secret; if you find a vulnerability or access to info that shouldn't be visible, tell us.
A vulnerability in Pydantic Monty that gives you access to or control of the host, but doesn't let you read the secrets (Rust tracebacks, OS details, binary path, network access, reading or writing files you shouldn't have access to, and so on).

No bounty, but we'd still love to hear about it

Crashing Monty with malicious code - panic, stack overflow, segfault, unbounded memory or CPU allocation. Report the issue with the code you used; we'll buy you a drink or a t-shirt if we run into you at a conference. Not part of the bounty program for now.
Bugs or CPython compatibility issues in Monty - please open an issue; not part of the bounty program.
Bugs or vulnerabilities elsewhere in the dependency tree - check whether the issue is new and file it with the upstream project; also not part of this program.

Please don't do any of these

Finding the secret or a vulnerability by changing the code in any library - see the first rule.
"Spear phishing" the Pydantic team or any similar social engineering attack.
Finding a security flaw in Render, where this app is deployed - report that to Render directly.
DoS'ing the app, or otherwise making it unresponsive.
DoS'ing or otherwise causing a service interruption in any other Pydantic service.

Reporting a find

If you've found a vulnerability that you think is worth paying the bounty for, please submit it in this form.

If you want to talk to us, join our community in Slack and send us a message in the #monty channel.

Have fun. Break it hard.
