Python ast Module Tutorial: Transform and Run Code at Runtime

I'm about to show you some powerful stuff: how to use the ast module to parse Python code into a structure that you can understand precisely, manipulate to your will, and execute. This lets you achieve things that are otherwise difficult or impossible.

I'll explain at the end why I think this is a valuable skill for you to learn, especially as the world shifts toward AI. I'll share how I've used this in various personal projects, how it's helped me get jobs, and how it provides enhanced observability features in the Pydantic Logfire SDK. But to avoid being like that annoying cliché about online recipes starting with a long life story, I'll get straight to the point first using a simple real-world example.

The motivating example: `eval-type-backport`

The eval-type-backport library makes it possible to inspect type annotations written in newer syntax at runtime with the typing module on older Python versions. Specifically, it transforms X | Y into typing.Union[X, Y] and list[X] into typing.List[X] etc. if the original syntax is not supported in the current Python version.

For users of pydantic, merely having this package installed should make pydantic models with newer syntax work in older Python versions.

Here's an example of how to use it directly:

from __future__ import annotations
import typing
from eval_type_backport import install_patch

install_patch()


class Foo:
    a: int | str


assert typing.get_type_hints(Foo) == {'a': typing.Union[int, str]}

In Python 3.9 and earlier, without the install_patch() call, the above code will raise TypeError: unsupported operand type(s) for |: 'type' and 'type'.

I'm going to show you step by step exactly how the interesting part of the library works. The rest of the library is wiring and fiddling with internals of the typing module.

How

Problem statement

Goal: evaluate a string such as 'list[str | int | None] | module.MyClass' such that every occurrence of a | b is treated as Union[a, b] in a robust way.

(Converting e.g. list[str] to List[str] follows roughly the same pattern)

Let's start by defining a helper function safe_or(a, b) that emulates the new a | b behavior as closely as is practical across Python versions:

from typing import Any, Union


def safe_or(a: Any, b: Any):
    try:
        return a | b
    except TypeError as e:
        if str(e).startswith('unsupported operand type(s) for |: '):
            # The error message is something like:
            # "unsupported operand type(s) for |: 'type' and 'type'"
            # The operands could be other things, e.g. None,
            # so we won't check what exactly they are.
            # We'll just assume that it's meant to be a Union.
            return Union[a, b]
        raise


assert safe_or(3, 9) == 3 | 9 == 11
assert safe_or(int, str) == Union[int, str]
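To make the two branches of safe_or concrete, here is a small sketch. The Picky class is a hypothetical example that is not part of the library; it only exists to show that a TypeError with a different message is re-raised rather than silently turned into a Union.

```python
from typing import Any, Union


def safe_or(a: Any, b: Any):
    try:
        return a | b
    except TypeError as e:
        if str(e).startswith('unsupported operand type(s) for |: '):
            return Union[a, b]
        raise


class Picky:
    # Hypothetical class whose | operator fails with an unrelated message.
    def __or__(self, other):
        raise TypeError('Picky refuses to be combined')


# Objects that already implement | keep their normal behavior.
set_union = safe_or({1}, {2})

# An unrelated TypeError is not mistaken for "this should be a Union".
try:
    safe_or(Picky(), int)
except TypeError as e:
    reraised_message = str(e)
else:
    reraised_message = None
```

Matching on the error message is a heuristic, which is why the check is kept as narrow as the prefix of the standard message.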

New goal: replace all occurrences of a | b in a string with safe_or(a, b).

Don't parse Python or HTML with regex. Use the ast module instead.
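As a quick illustration of why text-based processing is fragile, consider an annotation that contains a | inside a string literal. This is a minimal sketch that uses the ast module introduced in the next section; the example string is made up for the demonstration.

```python
import ast

source = "Literal['a|b'] | None"

# Naive text processing sees two '|' characters and cuts the string literal in half.
naive_parts = source.split('|')

# The parser knows that only one of them is an operator.
tree = ast.parse(source, mode='eval')
bitor_nodes = [
    node
    for node in ast.walk(tree)
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.BitOr)
]
```

Here naive_parts has three pieces, one of which is the broken fragment "Literal['a", while bitor_nodes contains exactly one node.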

The `ast` module

The ast module allows us to parse Python code strings into an abstract syntax tree (AST), which we can then manipulate. Here's what parsing foo = int | str looks like:

import ast

tree = ast.parse('foo = int | str')
print(ast.dump(tree, indent=4))

tree is our AST object. It has a useless repr, so we use ast.dump to get a more readable representation. The indent=4 argument makes the output more readable by adding indentation, but requires Python 3.9 or later. This will output something like:

Module(
    body=[
        Assign(
            targets=[
                Name(id='foo', ctx=Store())],
            value=BinOp(
                left=Name(id='int', ctx=Load()),
                op=BitOr(),
                right=Name(id='str', ctx=Load())))],
    type_ignores=[])

Name(id='foo', ...) is the variable name foo in code. ctx=Store() means a value is being stored in the name, since foo is on the left of the assignment. ctx=Load() means the variable is being loaded, i.e. evaluated.
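If you'd rather check the contexts programmatically than read the dump, a short sketch like this collects them. It relies only on ast.walk, which yields every node in the tree.

```python
import ast

tree = ast.parse('foo = int | str')

# Map each variable name to the class name of its context object.
contexts = {
    node.id: type(node.ctx).__name__
    for node in ast.walk(tree)
    if isinstance(node, ast.Name)
}
```

The resulting dict maps foo to 'Store' and both int and str to 'Load'.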

BinOp(left=<int>, op=BitOr(), right=<str>) represents int | str. The op=BitOr() means the operator is |. A different operator would have a different op value, e.g. Add() for +. This is the kind of thing we want to replace with safe_or(int, str). Let's see what the AST for that looks like:

import ast

tree = ast.parse('safe_or(int, str)', mode='eval')
print(ast.dump(tree, indent=4))

Result:

Expression(
    body=Call(
        func=Name(id='safe_or', ctx=Load()),
        args=[
            Name(id='int', ctx=Load()),
            Name(id='str', ctx=Load())],
        keywords=[]))

This time I added mode='eval' to ast.parse because we are parsing a single expression rather than one or more statements. The default mode is 'exec'. The difference between the two modes corresponds to the difference between the eval() and exec() built-in functions. This will matter more later when we want to evaluate the transformed AST. For now, it just makes the tree a bit simpler: the interesting part is the tree.body rather than tree.body[0].value.
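Here's a small sketch comparing the two modes on the same source, to show where the expression ends up in each tree:

```python
import ast

exec_tree = ast.parse('int | str')  # default mode='exec'
eval_tree = ast.parse('int | str', mode='eval')

# In 'exec' mode the expression is wrapped in an Expr statement inside Module.body.
exec_binop = exec_tree.body[0].value

# In 'eval' mode the expression is directly the body of an Expression node.
eval_binop = eval_tree.body
```

Both exec_binop and eval_binop are the same kind of BinOp node. Only the wrapping around them differs.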

Now that we know what structure we're aiming for, here's a simple script that modifies an AST 'manually':

import ast

tree = ast.parse('int | str', mode='eval')
binop = tree.body
assert isinstance(binop, ast.BinOp)
assert isinstance(binop.op, ast.BitOr)
call = ast.Call(
    func=ast.Name(id='safe_or', ctx=ast.Load()),
    args=[binop.left, binop.right],
    keywords=[],
)
assert ast.unparse(call) == 'safe_or(int, str)'

The ast.unparse function (added in Python 3.9) converts an AST back into a string of Python code, which makes it easier to check that we constructed the AST correctly.
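One thing worth knowing is that ast.unparse generates fresh code from the tree rather than reproducing the original text, so formatting details are lost. A minimal sketch (Python 3.9+, with a made-up input string):

```python
import ast

messy = 'x  =  ( int|str )'

# Whitespace and redundant parentheses are not stored in the AST,
# so they don't survive the round trip.
round_tripped = ast.unparse(ast.parse(messy))
```

Here round_tripped is 'x = int | str', which is equivalent to the input but not identical to it.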

Transforming the AST recursively

The typical way to transform an AST is to subclass ast.NodeTransformer, like this:

import ast


class BackportTransformer(ast.NodeTransformer):
    def visit_BinOp(self, node: ast.BinOp):
        if not isinstance(node.op, ast.BitOr):
            return node

        return ast.Call(
            func=ast.Name(id='safe_or', ctx=ast.Load()),
            args=[node.left, node.right],
            keywords=[],
        )


tree = ast.parse('list[list[int | str] | None]', mode='eval')
tree = BackportTransformer().visit(tree)  # recursively transform the tree
print(ast.unparse(tree))

In general, the subclass should implement a visit_<NodeType> method for each node type it wants to transform. In our case, we only care about BinOp nodes, so we implement visit_BinOp. This method is called for every BinOp node in the AST. If the operator is not |, we return the node unchanged. Otherwise, we perform the same modification as before, replacing it with a call to safe_or.

But this script prints list[safe_or(list[int | str], None)]. It hasn't transformed the inner int | str! The problem is that while ast.NodeTransformer recursively visits all nodes by default, it doesn't automatically recurse into nodes visited by a visit_<NodeType> method. Adding self.generic_visit(node) to the start of visit_BinOp fixes this, and we get list[safe_or(list[safe_or(int, str)], None)] as expected.
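To see the difference side by side, here's a sketch with two transformers that are identical except for the generic_visit call. The class names and the make_safe_or_call helper are made up for this comparison, and ast.unparse requires Python 3.9+.

```python
import ast


def make_safe_or_call(node: ast.BinOp) -> ast.Call:
    return ast.Call(
        func=ast.Name(id='safe_or', ctx=ast.Load()),
        args=[node.left, node.right],
        keywords=[],
    )


class ShallowTransformer(ast.NodeTransformer):
    def visit_BinOp(self, node: ast.BinOp):
        # No generic_visit: children of this node are never visited.
        if not isinstance(node.op, ast.BitOr):
            return node
        return make_safe_or_call(node)


class RecursiveTransformer(ast.NodeTransformer):
    def visit_BinOp(self, node: ast.BinOp):
        # Transform the children first, then this node.
        self.generic_visit(node)
        if not isinstance(node.op, ast.BitOr):
            return node
        return make_safe_or_call(node)


source = 'list[list[int | str] | None]'
shallow = ast.unparse(ShallowTransformer().visit(ast.parse(source, mode='eval')))
recursive = ast.unparse(RecursiveTransformer().visit(ast.parse(source, mode='eval')))
```

Calling generic_visit first means node.left and node.right have already been replaced by the time they're placed into the new Call.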

Evaluating the transformed AST

You could try eval(ast.unparse(tree)) to evaluate the transformed AST, but ast.unparse doesn't exist before Python 3.9, the round trip through a string is inefficient, and it's not always correct. And for some reason, you can't pass an AST directly to eval or exec. Instead, you have to compile the AST into a code object first, and then execute that code object:

import ast

tree = ast.parse('1 + 2', mode='eval')
code = compile(tree, filename='<ast>', mode='eval')
result = eval(code)
assert result == 3

The filename argument is used when displaying tracebacks. In our case, displaying a perfect traceback isn't important, so any random string will do. A code object can be passed to either the eval() or exec() builtin functions. Since we want to evaluate a single expression, rather than execute one or more statements, we need to use eval(). Therefore, we need mode='eval' for both ast.parse and compile.
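For comparison, here's a sketch of the statement-based counterpart: parsing in the default 'exec' mode, compiling with mode='exec', and running the result with exec(). The source string and the namespace dict are just illustrative.

```python
import ast

source = 'x = 1 + 2\ny = x * 2'

tree = ast.parse(source)  # default mode='exec' produces a Module
code = compile(tree, filename='<ast>', mode='exec')

# exec() returns None; results are left behind in the namespace.
namespace = {}
exec(code, namespace)
```

After this runs, namespace['x'] is 3 and namespace['y'] is 6.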

If we try to compile the transformed AST from before, we get TypeError: required field "lineno" missing from expr. Again, for the purpose of tracebacks, every AST node needs location attributes like lineno. The lazy way to add these is to call ast.fix_missing_locations on the transformed tree.
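Here's a minimal sketch of that failure and the fix in isolation. It replaces 1 | 2 with a call to the builtin max purely so that the example needs no helper function; that choice is an assumption made for the demo, not something the library does.

```python
import ast

tree = ast.parse('1 | 2', mode='eval')
binop = tree.body

# Replace `1 | 2` with `max(1, 2)`. The new Call and Name nodes have no
# location attributes, unlike the nodes produced by ast.parse.
tree.body = ast.Call(
    func=ast.Name(id='max', ctx=ast.Load()),
    args=[binop.left, binop.right],
    keywords=[],
)

try:
    compile(tree, filename='<ast>', mode='eval')
except TypeError as e:
    error_message = str(e)
else:
    error_message = None

# Fill in lineno, col_offset etc. wherever they are missing.
ast.fix_missing_locations(tree)
fixed_result = eval(compile(tree, filename='<ast>', mode='eval'))
```

The filled-in locations are copied from parent nodes, so they are only approximate. If accurate tracebacks matter, ast.copy_location(new_node, old_node) gives the replacement the location of the node it replaces.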

With that in mind, here's the full code:

import ast

from typing import Any, Union, List


def safe_or(a: Any, b: Any):
    try:
        return a | b
    except TypeError as e:
        if str(e).startswith('unsupported operand type(s) for |: '):
            return Union[a, b]
        raise


class BackportTransformer(ast.NodeTransformer):
    def visit_BinOp(self, node: ast.BinOp):
        self.generic_visit(node)  # visit children recursively
        if not isinstance(node.op, ast.BitOr):
            return node

        call = ast.Call(
            func=ast.Name(id='safe_or', ctx=ast.Load()),
            args=[node.left, node.right],
            keywords=[],
        )
        return ast.fix_missing_locations(call)  # ensure lineno etc. are set


tree = ast.parse('List[List[int | str] | None]', mode='eval')
tree = BackportTransformer().visit(tree)
code = compile(tree, filename='<ast>', mode='eval')
result = eval(code)
assert result == List[Union[List[Union[int, str]], None]]

if hasattr(ast, 'unparse'):  # only in Python 3.9+
    assert ast.unparse(tree) == 'List[safe_or(List[safe_or(int, str)], None)]'
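The problem statement mentioned that converting list[str] to List[str] follows roughly the same pattern. Here's a rough sketch of what that could look like with a visit_Subscript method. This is a simplification for illustration, not the library's actual implementation: the BUILTIN_TO_TYPING mapping, the class name, and the helper function are all assumptions, and it rewrites unconditionally instead of only when the original syntax is unsupported.

```python
import ast
import typing

# Assumed mapping from builtin generics to their typing equivalents.
BUILTIN_TO_TYPING = {'list': 'List', 'dict': 'Dict', 'set': 'Set', 'tuple': 'Tuple'}


class SubscriptBackport(ast.NodeTransformer):
    def visit_Subscript(self, node: ast.Subscript):
        self.generic_visit(node)  # handle nested subscripts first
        value = node.value
        if isinstance(value, ast.Name) and value.id in BUILTIN_TO_TYPING:
            new_name = ast.Name(id=BUILTIN_TO_TYPING[value.id], ctx=ast.Load())
            # Reuse the location of the name being replaced.
            node.value = ast.copy_location(new_name, value)
        return node


def eval_with_typing_generics(source: str):
    tree = SubscriptBackport().visit(ast.parse(source, mode='eval'))
    code = compile(tree, filename='<ast>', mode='eval')
    # Provide List, Dict, etc. as globals; builtins like str are still available.
    namespace = {name: getattr(typing, name) for name in BUILTIN_TO_TYPING.values()}
    return eval(code, namespace)


backported = eval_with_typing_generics('list[dict[str, int]]')
```

Only names used as the subscripted value are rewritten, so a bare list without square brackets is left alone.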

Note that eval and exec can take optional globals and locals dictionaries to control the namespace in which the code is executed. They default to the current namespaces, which works for this demo script. In reality you'd use e.g. the dict of the module where the original string came from as the globals. You'd also need to include safe_or in a namespace. To avoid name conflicts, it's best to actually define a unique name, and use that in both the ast.Call and the namespace, e.g. safe_or_<uuid>.
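Putting those suggestions together might look something like the sketch below. The eval_backport function, the SAFE_OR_NAME constant, and the module_globals dict are hypothetical names for this demo, and a random UUID suffix is assumed to be unique enough to avoid clashes.

```python
import ast
import uuid
from typing import Any, Union


def safe_or(a: Any, b: Any):
    try:
        return a | b
    except TypeError as e:
        if str(e).startswith('unsupported operand type(s) for |: '):
            return Union[a, b]
        raise


# Assumption: a random suffix won't collide with any name in user code.
SAFE_OR_NAME = f'safe_or_{uuid.uuid4().hex}'


class BackportTransformer(ast.NodeTransformer):
    def visit_BinOp(self, node: ast.BinOp):
        self.generic_visit(node)
        if not isinstance(node.op, ast.BitOr):
            return node
        call = ast.Call(
            func=ast.Name(id=SAFE_OR_NAME, ctx=ast.Load()),
            args=[node.left, node.right],
            keywords=[],
        )
        ast.copy_location(call, node)
        return ast.fix_missing_locations(call)


def eval_backport(source: str, globalns: dict):
    tree = BackportTransformer().visit(ast.parse(source, mode='eval'))
    code = compile(tree, filename='<ast>', mode='eval')
    # Build a new dict so the helper doesn't leak into the caller's namespace.
    namespace = {**globalns, SAFE_OR_NAME: safe_or}
    return eval(code, namespace)


# Hypothetical module namespace that happens to define its own `safe_or`.
module_globals = {'safe_or': int, 'MyClass': type('MyClass', (), {})}
hint = eval_backport('safe_or | MyClass | None', module_globals)
```

Because the helper lives under a unique name, the user's own safe_or variable is evaluated as an ordinary name, and hint ends up equal to Union[int, MyClass, None].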

Another example: printing every subexpression

Here's another example of using ast.NodeTransformer to print every subexpression in a string during evaluation:

import ast

source = '(a + b) * c'
namespace = dict(a=1, b=2, c=4)


def debug_node(node_source: str, value):
    print(f'{node_source} == {value!r}')
    return value


class DebugTransformer(ast.NodeTransformer):
    # We want to transform all expression nodes. These are instances of ast.expr.
    # Since there isn't one specific class, we have to override generic_visit.
    # Defining visit_expr wouldn't work.

    def generic_visit(self, node: ast.AST):
        super().generic_visit(node)
        if not isinstance(node, ast.expr):
            return node

        # Get the original source code for this node.
        # Using ast.unparse would give us the modified source, which is not what we want.
        node_source = ast.get_source_segment(source, node)

        return ast.fix_missing_locations(
            # e.g. this transforms the node for `a` into `debug_node('a', a)`
            ast.Call(
                func=ast.Name(id='debug_node', ctx=ast.Load()),
                args=[
                    ast.Constant(value=node_source),
                    node,
                ],
                keywords=[],
            )
        )


tree = ast.parse(source, mode='eval')
tree = DebugTransformer().visit(tree)
code = compile(tree, filename='<ast>', mode='eval')
result = eval(code, dict(**namespace, debug_node=debug_node))
assert result == eval(source, namespace)

This prints:

a == 1
b == 2
a + b == 3
c == 4
(a + b) * c == 12

This implementation is a bit basic and has some flaws, but you should be able to fully understand how it works and be able to see how this could be useful. To use something like this in practice, I recommend trying birdseye or the pp.deep function in snoop, which are both Python debugging libraries I've written using similar techniques.

Why

The backstory

I first learned how to work with the AST to write birdseye. It uses the library asttokens, so I made many contributions to that library in the process. This impressed the creator of asttokens and led to a job at his company Grist. Their product is a spreadsheet/database hybrid with formulas written in a slightly customized Python. Understanding the nuances of the AST and other Python internals was important here and there.

The AST was also essential to several other projects of mine, including executing. As mentioned in my previous blog post, executing powers the inline-snapshot library and the f-string magic feature of the Logfire SDK. But long before that, I used executing and asttokens to fix several issues in python-devtools, a library by Samuel Colvin, the creator of Pydantic.

Years later, I was looking for a new job. I saw that Pydantic had formed a company, but I had missed the call for hiring. I decided to try and impress them anyway and started making contributions to the pydantic library. And it worked! When an opening appeared, Samuel reached out to me, because he believed that my deep knowledge of Python would help make Logfire the best platform for observability.

One way I'd demonstrated this knowledge was the contribution to python-devtools mentioned above. The other biggest factor was probably creating eval-type-backport specifically to integrate it into Pydantic and make newer type annotation syntax just magically work in older Python versions.

One of the first things I did in my new job was to reimplement Logfire's auto-tracing feature using AST modification (roughly wrapping every function body in with logfire.span(...):) and import hooks. These were tricks I'd used in birdseye, and I'll explain more in a future blog post.
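To give a rough idea of what "wrapping every function body in a with statement" means at the AST level, here's a toy sketch. It is not Logfire's implementation: the span context manager, the events list, and the AutoTraceTransformer class are stand-ins invented for this example, and import hooks are left out entirely.

```python
import ast
import contextlib

events = []


@contextlib.contextmanager
def span(name: str):
    # Stand-in for a real tracing span: just records enter and exit.
    events.append(f'enter {name}')
    try:
        yield
    finally:
        events.append(f'exit {name}')


class AutoTraceTransformer(ast.NodeTransformer):
    def visit_FunctionDef(self, node: ast.FunctionDef):
        self.generic_visit(node)  # also handle nested functions
        with_stmt = ast.With(
            items=[
                ast.withitem(
                    context_expr=ast.Call(
                        func=ast.Name(id='span', ctx=ast.Load()),
                        args=[ast.Constant(value=node.name)],
                        keywords=[],
                    ),
                    optional_vars=None,
                )
            ],
            body=node.body,
        )
        # The whole original body now lives inside `with span('<name>'):`.
        node.body = [with_stmt]
        return node


source = '''
def add(a, b):
    return a + b
'''

tree = AutoTraceTransformer().visit(ast.parse(source))
ast.fix_missing_locations(tree)
namespace = {'span': span}
exec(compile(tree, filename='<ast>', mode='exec'), namespace)
traced_result = namespace['add'](1, 2)
```

This toy version ignores async functions and moves docstrings inside the with block, which a real implementation would need to handle.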

As you can see, these are valuable techniques with real applications, especially for impressing employers!

Why you should learn this

You may be wondering when you'd ever apply this. Maybe your work involves writing 'normal' code that works with things like databases and web APIs, not writing the kind of libraries that Pydantic writes. But the problem is that there's lots of 'normal' code out there for AI to train on. It's becoming more important to learn advanced skills that are harder to automate. I think metaprogramming is a good option.

This stuff is also potentially very useful if you're working with AI agents, regardless of what those agents do. One of the newest and biggest features of Pydantic AI is Code Mode, which lets agents call tools by running simple Python code in the Monty sandbox instead of calling one tool at a time with JSON. AI is good at writing Python, so this helps an agent to express itself rigorously and efficiently. Being able to inspect and modify the code it produces is a good way to understand and control it. For example, imagine plugging in the DebugTransformer above so that an agent can analyze every detail of the execution of its code and spot problems.

This isn't just hypothetical. One of Grist's biggest features is an AI assistant that generates Python spreadsheet formulas. I implemented the first version of this feature. Back then, it used GPT-3! Getting it to reliably produce good output was hard, and various post-processing helped a lot. Some of that used the ast module.

For a more dramatic example, consider Astral. They were building completely free open source software without clear concrete plans for how to monetize this. Then OpenAI announced plans to acquire them:

"By bringing Astral’s tooling and engineering expertise to OpenAI, we will accelerate our work on Codex and expand what AI can do across the software development lifecycle."

I think this is more about the tooling and expertise than it is about Codex or software development. All AI agents, not just coding agents, can benefit from being able to perform actions by writing code. It turns a fuzzy unpredictable process that produces unstructured text into something precise, logical, and controllable. Otherwise, your options for things like guardrails are very limited. If the output is natural language, then automatically inspecting it requires either simple brittle heuristics or relying on more AI. Simple structured data like JSON is better, but still has obvious limitations.

You could express arbitrary complex logic in JSON. But have you ever used software written entirely in JSON? No one does that. Therefore, neither does AI. It's much better for it to write that logic in one of the most popular programming languages so that it has lots of training data and so that people can read its output. If you wanted the best coding agent, you'd need it to be good at many different languages. Python is not a good choice for a lot of software development. But it's excellent for short and simple scripts. So it's an ideal output format for agents in a wide range of contexts. That's why OpenAI thinks good Python tooling is so valuable.

Future Python metaprogramming blog posts

Inspired by the success of my previous blog post about inline-snapshot, I decided to write a series of blog posts about Python metaprogramming techniques. This is the second post in that series. Future posts will cover things like import hooks, tracing, frames, and tracebacks. The goal will be for you to learn:

How Python works at a deeper level,
How to use advanced techniques to do cool things yourself,
What libraries you can use that have done the hard work for you, and
How this fits into real world contexts such as at Pydantic.

Hopefully, I don't end up regretting diluting my competitive advantage.
