# Pydantic AI

> GenAI Agent Framework, the Pydantic way

Pydantic AI is a Python agent framework designed to make it less painful to build production grade applications with Generative AI.

---

# [Image, Audio, Video & Document Input](https://pydantic.dev/docs/ai/advanced-features/input/)

# Image, Audio, Video & Document Input

## Image Input

Note

Some models do not support image input. Please check the model's documentation to confirm whether it supports image input.

If you have a direct URL for the image, you can use [`ImageUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ImageUrl):

image\_input.py

```python
from pydantic_ai import Agent, ImageUrl

agent = Agent(model='openai:gpt-5.2')
result = agent.run_sync(
    [
        'What company is this logo from?',
        ImageUrl(url='https://iili.io/3Hs4FMg.png'),
    ]
)
print(result.output)
#> This is the logo for Pydantic, a data validation and settings management library in Python.
```

If you have the image locally, you can also use [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent):

local\_image\_input.py

```python
import httpx

from pydantic_ai import Agent, BinaryContent

image_response = httpx.get('https://iili.io/3Hs4FMg.png')  # Pydantic logo

agent = Agent(model='openai:gpt-5.2')
result = agent.run_sync(
  [
      'What company is this logo from?',
      BinaryContent(data=image_response.content, media_type='image/png'),  # (1)
  ]
)
print(result.output)
#> This is the logo for Pydantic, a data validation and settings management library in Python.
```

To ensure the example is runnable we download this image from the web, but you can also use `Path().read_bytes()` to read a local file's contents.

## Audio Input

Note

Some models do not support audio input. Please check the model's documentation to confirm whether it supports audio input.

You can provide audio input using either [`AudioUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.AudioUrl) or [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent). The process is analogous to the examples above.

## Video Input

Note

Some models do not support video input. Please check the model's documentation to confirm whether it supports video input.

You can provide video input using either [`VideoUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.VideoUrl) or [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent). The process is analogous to the examples above.

## Document Input

Note

Some models do not support document input. Please check the model's documentation to confirm whether it supports document input.

You can provide document input using either [`DocumentUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.DocumentUrl) or [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent). The process is similar to the examples above.

If you have a direct URL for the document, you can use [`DocumentUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.DocumentUrl):

document\_input.py

```python
from pydantic_ai import Agent, DocumentUrl

agent = Agent(model='anthropic:claude-sonnet-4-6')
result = agent.run_sync(
    [
        'What is the main content of this document?',
        DocumentUrl(url='https://storage.googleapis.com/cloud-samples-data/generative-ai/pdf/2403.05530.pdf'),
    ]
)
print(result.output)
#> This document is the technical report introducing Gemini 1.5, Google's latest large language model...
```

The supported document formats vary by model.

You can also use [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent) to pass document data directly:

binary\_content\_input.py

```python
from pathlib import Path
from pydantic_ai import Agent, BinaryContent

pdf_path = Path('document.pdf')
agent = Agent(model='anthropic:claude-sonnet-4-6')
result = agent.run_sync(
    [
        'What is the main content of this document?',
        BinaryContent(data=pdf_path.read_bytes(), media_type='application/pdf'),
    ]
)
print(result.output)
#> The document discusses...
```

Tip

If neither `DocumentUrl` nor `BinaryContent` is suitable for your use case (e.g., the model doesn't support `DocumentUrl`, or you want to provide a document in a non-binary format), you can still provide document content as text input by extracting the text yourself and passing it as a string or [`TextContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.TextContent).

## Text Input

You can use [`TextContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.TextContent) to provide text input with additional metadata:

text\_content\_input.py

```python
from pydantic_ai import Agent, TextContent

agent = Agent(model='openai:gpt-5.2')
result = agent.run_sync([
    'Summarize the key points from this text.',
    TextContent(
        content=(
            'Pydantic AI is a Python agent framework. '
            'It supports text, image, audio, video, and document input.'
        ),
        metadata={'source': 'pydantic_ai_inputs.txt'},
    ),
])
```

This is equivalent to passing the text as a `str`, but allows you to include additional `metadata` that can be accessed programmatically in your agent logic.

Note

The `content` field is treated as input to the model, but the `metadata` is **not sent to the model**. It is preserved in messages for programmatic access.

## User-side download vs. direct file URL

When using one of `ImageUrl`, `AudioUrl`, `VideoUrl` or `DocumentUrl`, Pydantic AI will default to sending the URL to the model provider, so the file is downloaded on their side.

Support for file URLs varies depending on type and provider:

Model

Send URL directly

Download and send bytes

Unsupported

[`OpenAIChatModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIChatModel)

`ImageUrl`

`AudioUrl`, `DocumentUrl`

`VideoUrl`. `DocumentUrl` [not supported with `AzureProvider`](/docs/ai/models/openai#using-azure-with-the-responses-api)

[`OpenAIResponsesModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModel)

`ImageUrl`, `AudioUrl`, `DocumentUrl`

--

`VideoUrl`

[`AnthropicModel`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModel)

`ImageUrl`, `DocumentUrl` (PDF)

`DocumentUrl` (`text/plain`)

`AudioUrl`, `VideoUrl`

[`GoogleModel`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModel) (Vertex)

All URL types

--

--

[`GoogleModel`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModel) (GLA)

[YouTube](/docs/ai/models/google#document-image-audio-and-video-input), [Files API](/docs/ai/models/google#document-image-audio-and-video-input)

All other URLs

--

[`XaiModel`](/docs/ai/api/models/xai/#pydantic_ai.models.xai.XaiModel)

`ImageUrl`

`DocumentUrl`

`AudioUrl`, `VideoUrl`

[`MistralModel`](/docs/ai/api/models/mistral/#pydantic_ai.models.mistral.MistralModel)

`ImageUrl`, `DocumentUrl` (PDF)

--

`AudioUrl`, `VideoUrl`, `DocumentUrl` (non-PDF)

[`BedrockConverseModel`](/docs/ai/api/models/bedrock/#pydantic_ai.models.bedrock.BedrockConverseModel)

S3 URLs (`s3://`)

`ImageUrl`, `DocumentUrl`, `VideoUrl`

`AudioUrl`

[`OpenRouterModel`](/docs/ai/api/models/openrouter/#pydantic_ai.models.openrouter.OpenRouterModel)

`ImageUrl`, `DocumentUrl`, `VideoUrl`

`AudioUrl`

--

A model API may be unable to download a file (e.g., because of crawling or access restrictions) even if it supports file URLs. For example, [`GoogleModel`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModel) on Vertex AI limits YouTube video URLs to one URL per request.

In such cases, you can instruct Pydantic AI to download the file content locally and send that instead of the URL by setting `force_download` on the URL object:

force\_download.py

```py
from pydantic_ai import ImageUrl, AudioUrl, VideoUrl, DocumentUrl

ImageUrl(url='https://example.com/image.png', force_download=True)
AudioUrl(url='https://example.com/audio.mp3', force_download=True)
VideoUrl(url='https://example.com/video.mp4', force_download=True)
DocumentUrl(url='https://example.com/doc.pdf', force_download=True)
```

Trust model for file URLs

When URLs are forwarded to the provider, the provider fetches them under its own credentials. For cloud-storage schemes like `s3://` (Bedrock) and `gs://` (Vertex AI), those credentials are your server's IAM role or service account, so whoever controls the URL effectively controls what the provider can read on your behalf.

Don't construct [`ImageUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ImageUrl), [`AudioUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.AudioUrl), [`VideoUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.VideoUrl), or [`DocumentUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.DocumentUrl) from untrusted user input without validating the scheme and scope. For frontend-initiated uploads to cloud storage, convert references like `s3://bucket/key` into pre-signed `https://` URLs server-side before constructing the file URL part. `force_download=True` only works for `http(s)://` URLs (it routes through the library's HTTP client and applies SSRF protection); cloud-storage schemes like `s3://` and `gs://` aren't supported by the local download path and are forwarded to the provider as-is.

The [UI adapters](/docs/ai/integrations/ui/overview) apply this sanitization automatically to client-submitted messages via [`UIAdapter.allowed_file_url_schemes`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.allowed_file_url_schemes).

## Uploaded Files

Some model providers have their own file storage APIs where you can upload files and reference them by ID or URL.

Use [`UploadedFile`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.UploadedFile) to reference files that have been uploaded to a provider's file storage API.

Tip

For providers that return a file URL (like Google Files API or S3 URLs for Bedrock), you can also use [`DocumentUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.DocumentUrl), [`ImageUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ImageUrl), or [`VideoUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.VideoUrl) directly. However, we recommend using `UploadedFile` for a unified API across providers and consistent provider name validation.

### Supported Models

Model

Support

[`AnthropicModel`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModel)

✅ via [Anthropic Files API](https://docs.anthropic.com/en/docs/build-with-claude/files)

[`OpenAIChatModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIChatModel)

✅ via [OpenAI Files API](https://platform.openai.com/docs/api-reference/files)

[`OpenAIResponsesModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModel)

✅ via [OpenAI Files API](https://platform.openai.com/docs/api-reference/files)

[`GoogleModel`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModel)

✅ via [Google Files API](https://ai.google.dev/gemini-api/docs/files)

[`BedrockConverseModel`](/docs/ai/api/models/bedrock/#pydantic_ai.models.bedrock.BedrockConverseModel)

✅ via S3 URLs (`s3://bucket/key`)

[`XaiModel`](/docs/ai/api/models/xai/#pydantic_ai.models.xai.XaiModel)

✅ via [xAI Files API](https://docs.x.ai/docs/guides/files)

Other models

❌ Not supported

### Provider Name Requirement

When using [`UploadedFile`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.UploadedFile) you must set the `provider_name`. Uploaded files are specific to the system they are uploaded to and are not transferable across providers. Trying to use a message that contains an `UploadedFile` with a different provider will result in an error.

Getting the provider name

Use [`model.system`](/docs/ai/api/models/base/#pydantic_ai.models.Model.system) to get the correct provider name dynamically. This ensures your code works correctly even if the provider name changes. All examples below demonstrate this pattern.

If you want to introduce portability into your agent logic to allow the same prompt history to work with different provider backends, you can use a [history processor](/docs/ai/core-concepts/message-history#processing-message-history) to remove or rewrite `UploadedFile` parts from messages before sending them to a provider that does not support them. Be aware that stripping out `UploadedFile` instances might confuse the model, especially if references to those files remain in the text.

### Media Type Inference

The `media_type` parameter is optional for [`UploadedFile`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.UploadedFile). If not specified, Pydantic AI will attempt to infer it from the `file_id`:

1.  If `file_id` is a URL or path with a recognizable file extension (e.g., `.pdf`, `.png`), the media type is inferred automatically
2.  For opaque file IDs (e.g., `'file-abc123'`), the media type defaults to `'application/octet-stream'`

Tip

While `media_type` is optional, we recommend explicitly setting it when known to ensure correct handling by the model provider.

### Anthropic

Follow the [Anthropic Files API docs](https://docs.anthropic.com/en/docs/build-with-claude/files) to upload files. You can access the underlying Anthropic client via `provider.client`.

Beta Feature

The Anthropic Files API is currently in beta. You need to include the beta header `anthropic-beta: files-api-2025-04-14` when making requests.

uploaded\_file\_anthropic.py

```py
import asyncio

from pydantic_ai import Agent, ModelSettings, UploadedFile
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.providers.anthropic import AnthropicProvider


async def main():
    provider = AnthropicProvider()
    model = AnthropicModel('claude-sonnet-4-5', provider=provider)

    # Upload a file using the provider's client (Anthropic client)
    with open('document.pdf', 'rb') as f:
        uploaded_file = await provider.client.beta.files.upload(file=f)

    # Reference the uploaded file, including the required beta header
    agent = Agent(model)
    result = await agent.run(
        [
            'Summarize this document',
            UploadedFile(file_id=uploaded_file.id, provider_name=model.system),
        ],
        model_settings=ModelSettings(extra_headers={'anthropic-beta': 'files-api-2025-04-14'}),
    )
    print(result.output)
    #> The document discusses the main topics and key findings...


asyncio.run(main())
```

### OpenAI

Follow the [OpenAI Files API docs](https://platform.openai.com/docs/api-reference/files/create) to upload files. You can access the underlying OpenAI client via `provider.client`.

uploaded\_file\_openai.py

```py
import asyncio

from pydantic_ai import Agent, UploadedFile
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider


async def main():
    provider = OpenAIProvider()
    model = OpenAIChatModel('gpt-5', provider=provider)

    # Upload a file using the provider's client (OpenAI client)
    with open('document.pdf', 'rb') as f:
        uploaded_file = await provider.client.files.create(file=f, purpose='user_data')

    # Reference the uploaded file
    agent = Agent(model)
    result = await agent.run(
        [
            'Summarize this document',
            UploadedFile(file_id=uploaded_file.id, provider_name=model.system),
        ]
    )
    print(result.output)
    #> The document discusses the main topics and key findings...


asyncio.run(main())
```

### Google

Follow the [Google Files API docs](https://ai.google.dev/gemini-api/docs/files) to upload files. You can access the underlying Google GenAI client via `provider.client`.

uploaded\_file\_google.py

```py
import asyncio

from pydantic_ai import Agent, UploadedFile
from pydantic_ai.models.google import GoogleModel
from pydantic_ai.providers.google import GoogleProvider


async def main():
    provider = GoogleProvider()
    model = GoogleModel('gemini-2.5-flash', provider=provider)

    # Upload a file using the provider's client (Google GenAI client)
    with open('document.pdf', 'rb') as f:
        file = await provider.client.aio.files.upload(file=f)
        assert file.uri is not None

    # Reference the uploaded file by URI (media_type is optional for Google)
    agent = Agent(model)
    result = await agent.run(
        [
            'Summarize this document',
            UploadedFile(file_id=file.uri, media_type=file.mime_type, provider_name=model.system),
        ]
    )
    print(result.output)
    #> The document discusses the main topics and key findings...


asyncio.run(main())
```

### Bedrock (S3)

For Bedrock, files must be uploaded to S3 separately (e.g., using [boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3/client/put_object.html)). The assumed role must have `s3:GetObject` permission on the bucket.

`media_type` may be required

Bedrock requires `media_type` when the file extension is ambiguous or missing. For S3 URLs with clear extensions like `.pdf`, `.png`, etc., it can be inferred automatically.

uploaded\_file\_bedrock.py

```py
import asyncio

from pydantic_ai import Agent, UploadedFile
from pydantic_ai.models.bedrock import BedrockConverseModel


async def main():
    model = BedrockConverseModel('us.anthropic.claude-sonnet-4-20250514-v1:0')

    agent = Agent(model)
    result = await agent.run([
        'Summarize this document',
        UploadedFile(
            file_id='s3://my-bucket/document.pdf',
            provider_name=model.system,  # 'bedrock'
            media_type='application/pdf',  # Optional for .pdf, but recommended
        ),
    ])
    print(result.output)
    #> The document discusses the main topics and key findings...


asyncio.run(main())
```

Note

You can optionally specify a `bucketOwner` query parameter if the bucket is not owned by the account making the request: `s3://my-bucket/document.pdf?bucketOwner=123456789012`

### xAI

Follow the [xAI Files API docs](https://docs.x.ai/docs/guides/files) to upload files. You can access the underlying xAI client via `provider.client`.

uploaded\_file\_xai.py

```py
import asyncio

from pydantic_ai import Agent, UploadedFile
from pydantic_ai.models.xai import XaiModel
from pydantic_ai.providers.xai import XaiProvider


async def main():
    provider = XaiProvider()
    model = XaiModel('grok-4-fast', provider=provider)

    # Upload a file using the provider's client (xAI client)
    with open('document.pdf', 'rb') as f:
        uploaded_file = await provider.client.files.upload(f, filename='document.pdf')

    # Reference the uploaded file
    agent = Agent(model)
    result = await agent.run(
        [
            'Summarize this document',
            UploadedFile(file_id=uploaded_file.id, provider_name=model.system),
        ]
    )
    print(result.output)
    #> The document discusses the main topics and key findings...


asyncio.run(main())
```

---

# [HTTP Request Retries](https://pydantic.dev/docs/ai/advanced-features/retries/)

# HTTP Request Retries

Pydantic AI provides retry functionality for HTTP requests made by model providers through custom HTTP transports. This is particularly useful for handling transient failures like rate limits, network timeouts, or temporary server errors.

The retry functionality is built on top of the [tenacity](https://github.com/jd/tenacity) library and integrates seamlessly with httpx clients. You can configure retry behavior for any provider that accepts a custom HTTP client.

## Installation

To use the retry transports, you need to install `tenacity`, which you can do via the `retries` dependency group:

-   [pip](#tab-panel-4)
-   [uv](#tab-panel-5)

Terminal

```bash
pip install 'pydantic-ai-slim[retries]'
```

Terminal

```bash
uv add 'pydantic-ai-slim[retries]'
```

## Usage Example

Here's an example of adding retry functionality with smart retry handling:

smart\_retry\_example.py

```python
from httpx import AsyncClient, HTTPStatusError
from tenacity import retry_if_exception_type, stop_after_attempt, wait_exponential

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider
from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after


def create_retrying_client():
    """Create a client with smart retry handling for multiple error types."""

    def should_retry_status(response):
        """Raise exceptions for retryable HTTP status codes."""
        if response.status_code in (429, 502, 503, 504):
            response.raise_for_status()  # This will raise HTTPStatusError

    transport = AsyncTenacityTransport(
        config=RetryConfig(
            # Retry on HTTP errors and connection issues
            retry=retry_if_exception_type((HTTPStatusError, ConnectionError)),
            # Smart waiting: respects Retry-After headers, falls back to exponential backoff
            wait=wait_retry_after(
                fallback_strategy=wait_exponential(multiplier=1, max=60),
                max_wait=300
            ),
            # Stop after 5 attempts
            stop=stop_after_attempt(5),
            # Re-raise the last exception if all retries fail
            reraise=True
        ),
        validate_response=should_retry_status
    )
    return AsyncClient(transport=transport)

# Use the retrying client with a model
client = create_retrying_client()
model = OpenAIChatModel('gpt-5.2', provider=OpenAIProvider(http_client=client))
agent = Agent(model)
```

## Wait Strategies

### wait\_retry\_after

The `wait_retry_after` function is a smart wait strategy that automatically respects HTTP `Retry-After` headers:

wait\_strategy\_example.py

```python
from tenacity import wait_exponential

from pydantic_ai.retries import wait_retry_after

# Basic usage - respects Retry-After headers, falls back to exponential backoff
wait_strategy_1 = wait_retry_after()

# Custom configuration
wait_strategy_2 = wait_retry_after(
    fallback_strategy=wait_exponential(multiplier=2, max=120),
    max_wait=600  # Never wait more than 10 minutes
)
```

This wait strategy:

-   Automatically parses `Retry-After` headers from HTTP 429 responses
-   Supports both seconds format (`"30"`) and HTTP date format (`"Wed, 21 Oct 2015 07:28:00 GMT"`)
-   Falls back to your chosen strategy when no header is present
-   Respects the `max_wait` limit to prevent excessive delays

## Transport Classes

### AsyncTenacityTransport

For asynchronous HTTP clients (recommended for most use cases):

async\_transport\_example.py

```python
from httpx import AsyncClient
from tenacity import stop_after_attempt

from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig


def validator(response):
    """Treat responses with HTTP status 4xx/5xx as failures that need to be retried.
    Without a response validator, only network errors and timeouts will result in a retry.
    """
    response.raise_for_status()

# Create the transport
transport = AsyncTenacityTransport(
    config=RetryConfig(stop=stop_after_attempt(3), reraise=True),
    validate_response=validator
)

# Create a client using the transport:
client = AsyncClient(transport=transport)
```

### TenacityTransport

For synchronous HTTP clients:

sync\_transport\_example.py

```python
from httpx import Client
from tenacity import stop_after_attempt

from pydantic_ai.retries import RetryConfig, TenacityTransport


def validator(response):
    """Treat responses with HTTP status 4xx/5xx as failures that need to be retried.
    Without a response validator, only network errors and timeouts will result in a retry.
    """
    response.raise_for_status()

# Create the transport
transport = TenacityTransport(
    config=RetryConfig(stop=stop_after_attempt(3), reraise=True),
    validate_response=validator
)

# Create a client using the transport
client = Client(transport=transport)
```

## Common Retry Patterns

### Rate Limit Handling with Retry-After Support

rate\_limit\_handling.py

```python
from httpx import AsyncClient, HTTPStatusError
from tenacity import retry_if_exception_type, stop_after_attempt, wait_exponential

from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after


def create_rate_limit_client():
    """Create a client that respects Retry-After headers from rate limiting responses."""
    transport = AsyncTenacityTransport(
        config=RetryConfig(
            retry=retry_if_exception_type(HTTPStatusError),
            wait=wait_retry_after(
                fallback_strategy=wait_exponential(multiplier=1, max=60),
                max_wait=300  # Don't wait more than 5 minutes
            ),
            stop=stop_after_attempt(10),
            reraise=True
        ),
        validate_response=lambda r: r.raise_for_status()  # Raises HTTPStatusError for 4xx/5xx
    )
    return AsyncClient(transport=transport)

# Example usage
client = create_rate_limit_client()
# Client is now ready to use with any HTTP requests and will respect Retry-After headers
```

The `wait_retry_after` function automatically detects `Retry-After` headers in 429 (rate limit) responses and waits for the specified time. If no header is present, it falls back to exponential backoff.

### Network Error Handling

network\_error\_handling.py

```python
import httpx
from tenacity import retry_if_exception_type, stop_after_attempt, wait_exponential

from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig


def create_network_resilient_client():
    """Create a client that handles network errors with retries."""
    transport = AsyncTenacityTransport(
        config=RetryConfig(
            retry=retry_if_exception_type((
                httpx.TimeoutException,
                httpx.ConnectError,
                httpx.ReadError
            )),
            wait=wait_exponential(multiplier=1, max=10),
            stop=stop_after_attempt(3),
            reraise=True
        )
    )
    return httpx.AsyncClient(transport=transport)

# Example usage
client = create_network_resilient_client()
# Client will now retry on timeout, connection, and read errors
```

### Custom Retry Logic

custom\_retry\_logic.py

```python
import httpx
from tenacity import retry_if_exception, stop_after_attempt, wait_exponential

from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after


def create_custom_retry_client():
    """Create a client with custom retry logic."""
    def custom_retry_condition(exception):
        """Custom logic to determine if we should retry."""
        if isinstance(exception, httpx.HTTPStatusError):
            # Retry on server errors but not client errors
            return 500 <= exception.response.status_code < 600
        return isinstance(exception, httpx.TimeoutException | httpx.ConnectError)

    transport = AsyncTenacityTransport(
        config=RetryConfig(
            retry=retry_if_exception(custom_retry_condition),
            # Use wait_retry_after for smart waiting on rate limits,
            # with custom exponential backoff as fallback
            wait=wait_retry_after(
                fallback_strategy=wait_exponential(multiplier=2, max=30),
                max_wait=120
            ),
            stop=stop_after_attempt(5),
            reraise=True
        ),
        validate_response=lambda r: r.raise_for_status()
    )
    return httpx.AsyncClient(transport=transport)

client = create_custom_retry_client()
# Client will retry server errors (5xx) and network errors, but not client errors (4xx)
```

## Using with Different Providers

The retry transports work with any provider that accepts a custom HTTP client:

### OpenAI

openai\_with\_retries.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

from smart_retry_example import create_retrying_client

client = create_retrying_client()
model = OpenAIChatModel('gpt-5.2', provider=OpenAIProvider(http_client=client))
agent = Agent(model)
```

### Anthropic

anthropic\_with\_retries.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.providers.anthropic import AnthropicProvider

from smart_retry_example import create_retrying_client

client = create_retrying_client()
model = AnthropicModel('claude-sonnet-4-5-20250929', provider=AnthropicProvider(http_client=client))
agent = Agent(model)
```

### Any OpenAI-Compatible Provider

openai\_compatible\_with\_retries.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

from smart_retry_example import create_retrying_client

client = create_retrying_client()
model = OpenAIChatModel(
    'your-model-name',  # Replace with actual model name
    provider=OpenAIProvider(
        base_url='https://api.example.com/v1',  # Replace with actual API URL
        api_key='your-api-key',  # Replace with actual API key
        http_client=client
    )
)
agent = Agent(model)
```

## Best Practices

1.  **Start Conservative**: Begin with a small number of retries (3-5) and reasonable wait times.
    
2.  **Use Exponential Backoff**: This helps avoid overwhelming servers during outages.
    
3.  **Set Maximum Wait Times**: Prevent indefinite delays with reasonable maximum wait times.
    
4.  **Handle Rate Limits Properly**: Respect `Retry-After` headers when possible.
    
5.  **Log Retry Attempts**: Add logging to monitor retry behavior in production. (This will be picked up by Logfire automatically if you instrument httpx.)
    
6.  **Consider Circuit Breakers**: For high-traffic applications, consider implementing circuit breaker patterns.
    

Monitoring Retries in Production

Excessive retries can indicate underlying issues and increase costs. [Logfire](/docs/ai/integrations/logfire) helps you track retry patterns:

-   See which requests triggered retries
-   Understand retry causes (rate limits, server errors, timeouts)
-   Monitor retry frequency over time
-   Identify opportunities to reduce retries

With [HTTPX instrumentation](/docs/ai/integrations/logfire#monitoring-http-requests) enabled, retry attempts are automatically captured in your traces.

## Error Handling

The retry transports will re-raise the last exception if all retry attempts fail. Make sure to handle these appropriately in your application:

error\_handling\_example.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

from smart_retry_example import create_retrying_client

client = create_retrying_client()
model = OpenAIChatModel('gpt-5.2', provider=OpenAIProvider(http_client=client))
agent = Agent(model)
```

## Performance Considerations

-   Retries add latency to requests, especially with exponential backoff
-   Consider the total timeout for your application when configuring retry behavior
-   Monitor retry rates to detect systemic issues
-   Use async transports for better concurrency when handling multiple requests

For more advanced retry configurations, refer to the [tenacity documentation](https://tenacity.readthedocs.io/).

## Provider-Specific Retry Behavior

### AWS Bedrock

The AWS Bedrock provider uses boto3's built-in retry mechanisms instead of httpx. To configure retries for Bedrock, use boto3's `Config`:

```python
from botocore.config import Config

config = Config(retries={'max_attempts': 5, 'mode': 'adaptive'})
```

See [Bedrock: Configuring Retries](/docs/ai/models/bedrock#configuring-retries) for complete examples.

---

# [Thinking](https://pydantic.dev/docs/ai/advanced-features/thinking/)

# Thinking

Thinking (or reasoning) is the process by which a model works through a problem step-by-step before providing its final answer.

This capability is typically disabled by default and depends on the specific model being used. See the sections below for how to enable thinking for each provider.

## Unified thinking settings

The simplest way to enable thinking across any supported provider is the `thinking` field in [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings):

unified\_thinking.py

```python
from pydantic_ai import Agent

agent = Agent('anthropic:claude-opus-4-7', model_settings={'thinking': 'high'})
```

Or using the [`Thinking`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Thinking) capability:

thinking\_capability.py

```python
from pydantic_ai import Agent
from pydantic_ai.capabilities import Thinking

agent = Agent('anthropic:claude-opus-4-7', capabilities=[Thinking(effort='high')])
```

The `thinking` setting accepts:

-   `True` -- enable thinking with the provider's default effort level
-   `False` -- disable thinking (silently ignored on always-on models)
-   `'minimal'` / `'low'` / `'medium'` / `'high'` / `'xhigh'` -- enable thinking at a specific effort level (unsupported levels map to the closest available value)

When omitted, the model uses its default behavior. Provider-specific settings (documented in the sections below) take precedence when both are set.

### Provider translation

The unified `thinking` setting maps to each provider's native format:

Provider

`thinking=True`

`thinking='high'`

Notes

Anthropic (Opus 4.6+)

`anthropic_thinking={'type': 'adaptive'}`

`{type: 'adaptive'}` + `effort='high'`

Claude Opus 4.7 also supports `effort='xhigh'`

Anthropic (older)

`anthropic_thinking={'type': 'enabled', 'budget_tokens': 10000}`

`budget_tokens=16384`

Budget-based; `'low'` → 2048 tokens

OpenAI

`reasoning_effort='medium'`

`reasoning_effort='high'`

Google (Gemini 3+)

`include_thoughts=True`

`thinking_level='HIGH'`

Google (Gemini 2.5)

`include_thoughts=True`

`thinking_budget=24576`

Groq

`reasoning_format='parsed'`

`reasoning_format='parsed'`

`thinking=False` → `'hidden'` (no true disable)

OpenRouter

`reasoning.effort='medium'`

`reasoning.effort='high'`

Via `extra_body`

Cerebras

`disable_reasoning=False`

`disable_reasoning=False`

`thinking=False` → `disable_reasoning=True`

xAI

`reasoning_effort='high'`

`reasoning_effort='high'`

Only `'low'` and `'high'`

Bedrock (Claude)

`thinking.type='enabled'`

`budget_tokens=16384`

No adaptive support

Bedrock (OpenAI)

`reasoning_effort='medium'`

`reasoning_effort='high'`

## OpenAI

When using the [`OpenAIChatModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIChatModel), text output inside `<think>` tags are converted to [`ThinkingPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPart) objects. You can customize the tags using the [`thinking_tags`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile.thinking_tags) field on the [model profile](/docs/ai/models/openai#model-profile).

Some [OpenAI-compatible model providers](/docs/ai/models/openai#openai-compatible-models) might also support native thinking parts that are not delimited by tags. Instead, they are sent and received as separate, custom fields in the API. Typically, if you are calling the model via the `<provider>:<model>` shorthand, Pydantic AI handles it for you. Nonetheless, you can still configure the fields with [`openai_chat_thinking_field`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.openai.OpenAIModelProfile.openai_chat_thinking_field).

If your provider recommends to send back these custom fields not changed, for caching or interleaved thinking benefits, you can also achieve this with [`openai_chat_send_back_thinking_parts`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.openai.OpenAIModelProfile.openai_chat_send_back_thinking_parts).

### OpenAI Responses

The [`OpenAIResponsesModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModel) can generate native thinking parts. To enable this functionality, you need to set the `OpenAIResponsesModelSettings.openai_reasoning_effort` and [`OpenAIResponsesModelSettings.openai_reasoning_summary`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings.openai_reasoning_summary) [model settings](/docs/ai/core-concepts/agent#model-run-settings).

By default, the unique IDs of reasoning, text, and function call parts from the message history are sent to the model, which can result in errors like `"Item 'rs_123' of type 'reasoning' was provided without its required following item."` if the message history you're sending does not match exactly what was received from the Responses API in a previous response, for example if you're using a [history processor](/docs/ai/core-concepts/message-history#processing-message-history). To disable this, you can disable the [`OpenAIResponsesModelSettings.openai_send_reasoning_ids`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings.openai_send_reasoning_ids) [model setting](/docs/ai/core-concepts/agent#model-run-settings).

openai\_thinking\_part.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings

model = OpenAIResponsesModel('gpt-5.2')
settings = OpenAIResponsesModelSettings(
    openai_reasoning_effort='low',
    openai_reasoning_summary='detailed',
)
agent = Agent(model, model_settings=settings)
...
```

Raw reasoning without summaries

Some OpenAI-compatible APIs (such as LM Studio, vLLM, or OpenRouter with gpt-oss models) may return raw reasoning content without reasoning summaries. In this case, [`ThinkingPart.content`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPart.content) will be empty, but the raw reasoning is available in `provider_details['raw_content']`. Following [OpenAI's guidance](https://cookbook.openai.com/examples/responses_api/reasoning_items) that raw reasoning should not be shown directly to users, we store it in `provider_details` rather than in the main `content` field.

## Anthropic

To enable thinking, use the [`AnthropicModelSettings.anthropic_thinking`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_thinking) [model setting](/docs/ai/core-concepts/agent#model-run-settings).

Note

Extended thinking (`type: 'enabled'` with `budget_tokens`) is deprecated on `claude-opus-4-6` and removed on `claude-opus-4-7`+. For those models, use [adaptive thinking](#adaptive-thinking--effort) instead.

anthropic\_thinking\_part.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel, AnthropicModelSettings

model = AnthropicModel('claude-sonnet-4-5')
settings = AnthropicModelSettings(
    anthropic_thinking={'type': 'enabled', 'budget_tokens': 1024},
)
agent = Agent(model, model_settings=settings)
...
```

### Interleaved Thinking

To enable [interleaved thinking](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking#interleaved-thinking), you need to include the beta header in your model settings:

anthropic\_interleaved\_thinking.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel, AnthropicModelSettings

model = AnthropicModel('claude-sonnet-4-5')
settings = AnthropicModelSettings(
    anthropic_thinking={'type': 'enabled', 'budget_tokens': 10000},
    extra_headers={'anthropic-beta': 'interleaved-thinking-2025-05-14'},
)
agent = Agent(model, model_settings=settings)
...
```

### Adaptive Thinking & Effort

Starting with `claude-opus-4-6`, Anthropic supports [adaptive thinking](https://docs.anthropic.com/en/docs/build-with-claude/adaptive-thinking), where the model dynamically decides when and how much to think based on the complexity of each request. This replaces extended thinking (`type: 'enabled'` with `budget_tokens`) which is deprecated on Opus 4.6 and removed on Opus 4.7. Claude Opus 4.7 also adds the `xhigh` effort level. Adaptive thinking also automatically enables interleaved thinking.

anthropic\_adaptive\_thinking.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel, AnthropicModelSettings

model = AnthropicModel('claude-opus-4-7')
settings = AnthropicModelSettings(
    anthropic_thinking={'type': 'adaptive'},
    anthropic_effort='high',
)
agent = Agent(model, model_settings=settings)
...
```

The [`anthropic_effort`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_effort) setting controls how much effort the model puts into its response (independent of thinking). See the [Anthropic effort docs](https://docs.anthropic.com/en/docs/build-with-claude/effort) for details.

Note

Older models (`claude-sonnet-4-5`, `claude-opus-4-5`, etc.) do not support adaptive thinking and require `{'type': 'enabled', 'budget_tokens': N}` as shown [above](#anthropic).

Thinking tokens count against Anthropic's loop-wide [task budgets](/docs/ai/models/anthropic#task-budgets-beta), so adaptive thinking naturally scales down as the budget depletes.

## Google

To enable thinking, use the [`GoogleModelSettings.google_thinking_config`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModelSettings.google_thinking_config) [model setting](/docs/ai/core-concepts/agent#model-run-settings).

google\_thinking\_part.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.google import GoogleModel, GoogleModelSettings

model = GoogleModel('gemini-3-pro-preview')
settings = GoogleModelSettings(google_thinking_config={'include_thoughts': True})
agent = Agent(model, model_settings=settings)
...
```

## xAI

xAI reasoning models (Grok) support native thinking. To preserve the thinking content for multi-turn conversations, enable [`XaiModelSettings.xai_include_encrypted_content`](/docs/ai/api/models/xai/#pydantic_ai.models.xai.XaiModelSettings.xai_include_encrypted_content).

xai\_thinking\_part.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.xai import XaiModel, XaiModelSettings

model = XaiModel('grok-4-fast-reasoning')
settings = XaiModelSettings(xai_include_encrypted_content=True)
agent = Agent(model, model_settings=settings)
...
```

## Bedrock

Although Bedrock Converse doesn't provide a unified API to enable thinking, you can still use [`BedrockModelSettings.bedrock_additional_model_requests_fields`](/docs/ai/api/models/bedrock/#pydantic_ai.models.bedrock.BedrockModelSettings.bedrock_additional_model_requests_fields) [model setting](/docs/ai/core-concepts/agent#model-run-settings) to pass provider-specific configuration:

-   [Claude](#tab-panel-0)
-   [OpenAI](#tab-panel-1)
-   [Qwen](#tab-panel-2)
-   [Deepseek](#tab-panel-3)

bedrock\_claude\_thinking\_part.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.bedrock import BedrockConverseModel, BedrockModelSettings

model = BedrockConverseModel('us.anthropic.claude-sonnet-4-5-20250929-v1:0')
model_settings = BedrockModelSettings(
    bedrock_additional_model_requests_fields={
        'thinking': {'type': 'enabled', 'budget_tokens': 1024}
    }
)
agent = Agent(model=model, model_settings=model_settings)

```

bedrock\_openai\_thinking\_part.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.bedrock import BedrockConverseModel, BedrockModelSettings

model = BedrockConverseModel('openai.gpt-oss-120b-1:0')
model_settings = BedrockModelSettings(
    bedrock_additional_model_requests_fields={'reasoning_effort': 'low'}
)
agent = Agent(model=model, model_settings=model_settings)

```

bedrock\_qwen\_thinking\_part.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.bedrock import BedrockConverseModel, BedrockModelSettings

model = BedrockConverseModel('qwen.qwen3-32b-v1:0')
model_settings = BedrockModelSettings(
    bedrock_additional_model_requests_fields={'reasoning_config': 'high'}
)
agent = Agent(model=model, model_settings=model_settings)

```

Reasoning is [always enabled](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-reasoning.html) for Deepseek model

bedrock\_deepseek\_thinking\_part.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.bedrock import BedrockConverseModel

model = BedrockConverseModel('us.deepseek.r1-v1:0')
agent = Agent(model=model)

```

## Groq

Groq supports different formats to receive thinking parts:

-   `"raw"`: The thinking part is included in the text content inside `<think>` tags, which are automatically converted to [`ThinkingPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPart) objects.
-   `"hidden"`: The thinking part is not included in the text content.
-   `"parsed"`: The thinking part has its own structured part in the response which is converted into a [`ThinkingPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPart) object.

To enable thinking, use the [`GroqModelSettings.groq_reasoning_format`](/docs/ai/api/models/groq/#pydantic_ai.models.groq.GroqModelSettings.groq_reasoning_format) [model setting](/docs/ai/core-concepts/agent#model-run-settings):

groq\_thinking\_part.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.groq import GroqModel, GroqModelSettings

model = GroqModel('qwen/qwen3-32b')
settings = GroqModelSettings(groq_reasoning_format='parsed')
agent = Agent(model, model_settings=settings)
...
```

Note

Groq does not support truly disabling thinking. When `thinking=False` is set via the unified setting, Pydantic AI sends `reasoning_format='hidden'`, which suppresses reasoning output but the model may still reason internally.

## OpenRouter

To enable thinking, use the [`OpenRouterModelSettings.openrouter_reasoning`](/docs/ai/api/models/openrouter/#pydantic_ai.models.openrouter.OpenRouterModelSettings.openrouter_reasoning) [model setting](/docs/ai/core-concepts/agent#model-run-settings).

openrouter\_thinking\_part.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.openrouter import OpenRouterModel, OpenRouterModelSettings

model = OpenRouterModel('openai/gpt-5.2')
settings = OpenRouterModelSettings(openrouter_reasoning={'effort': 'high'})
agent = Agent(model, model_settings=settings)
...
```

## Mistral

Thinking is supported by the `magistral` family of models. It does not need to be specifically enabled.

## Cohere

Thinking is supported by the `command-a-reasoning-08-2025` model. It does not need to be specifically enabled.

## Hugging Face

Text output inside `<think>` tags is automatically converted to [`ThinkingPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPart) objects. You can customize the tags using the [`thinking_tags`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile.thinking_tags) field on the [model profile](/docs/ai/models/openai#model-profile).

## Outlines

Some local models run through Outlines include in their text output a thinking part delimited by tags. In that case, it will be handled by Pydantic AI that will separate the thinking part from the final answer without the need to specifically enable it. The thinking tags used by default are `"<think>"` and `"</think>"`. If your model uses different tags, you can specify them in the [model profile](/docs/ai/models/openai#model-profile) using the [`thinking_tags`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile.thinking_tags) field.

Outlines currently does not support thinking along with structured output. If you provide an `output_type`, the model text output will not contain a thinking part with the associated tags, and you may experience degraded performance.

---

# [fasta2a](https://pydantic.dev/docs/ai/api/fasta2a/)

# fasta2a

### Storage

**Bases:** `ABC`, `Generic[ContextT]`

A storage to retrieve and save tasks, as well as retrieve and save context.

The storage serves two purposes:

1.  Task storage: Stores tasks in A2A protocol format with their status, artifacts, and message history
2.  Context storage: Stores conversation context in a format optimized for the specific agent implementation

#### Methods

##### load\_task

`@abstractmethod`

`@async`

```python
def load_task(task_id: str, history_length: int | None = None) -> Task | None
```

Load a task from storage.

If the task is not found, return None.

###### Returns

`Task` | [`None`](https://docs.python.org/3/library/constants.html#None)

##### submit\_task

`@abstractmethod`

`@async`

```python
def submit_task(context_id: str, message: Message) -> Task
```

Submit a task to storage.

###### Returns

`Task`

##### update\_task

`@abstractmethod`

`@async`

```python
def update_task(
    task_id: str,
    state: TaskState,
    new_artifacts: list[Artifact] | None = None,
    new_messages: list[Message] | None = None,
) -> Task
```

Update the state of a task. Appends artifacts and messages, if specified.

###### Returns

`Task`

##### load\_context

`@abstractmethod`

`@async`

```python
def load_context(context_id: str) -> ContextT | None
```

Retrieve the stored context given the `context_id`.

###### Returns

`ContextT` | [`None`](https://docs.python.org/3/library/constants.html#None)

##### update\_context

`@abstractmethod`

`@async`

```python
def update_context(context_id: str, context: ContextT) -> None
```

Updates the context for a `context_id`.

Implementing agent can decide what to store in context.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

### Broker

**Bases:** `ABC`

The broker class is in charge of scheduling the tasks.

The HTTP server uses the broker to schedule tasks.

The simple implementation is the `InMemoryBroker`, which is the broker that runs the tasks in the same process as the HTTP server. That said, this class can be extended to support remote workers.

#### Methods

##### run\_task

`@abstractmethod`

`@async`

```python
def run_task(params: TaskSendParams) -> None
```

Send a task to be executed by the worker.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

##### cancel\_task

`@abstractmethod`

`@async`

```python
def cancel_task(params: TaskIdParams) -> None
```

Cancel a task.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

##### receive\_task\_operations

`@abstractmethod`

```python
def receive_task_operations() -> AsyncIterator[TaskOperation]
```

Receive task operations from the broker.

On a multi-worker setup, the broker will need to round-robin the task operations between the workers.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`TaskOperation`\]

### Worker

**Bases:** `ABC`, `Generic[ContextT]`

A worker is responsible for executing tasks.

#### Methods

##### run

`@async`

```python
def run() -> AsyncIterator[None]
```

Run the worker.

It connects to the broker, and it makes itself available to receive commands.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]

### FastA2A

**Bases:** `Starlette`

The main class for the FastA2A library.

### Skill

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Skills are a unit of capability that an agent can perform.

#### Attributes

##### id

A unique identifier for the skill.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### name

Human readable name of the skill.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### description

A human-readable description of the skill.

It will be used by the client or a human as a hint to understand the skill.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### tags

Set of tag-words describing classes of capabilities for this specific skill.

Examples: "cooking", "customer support", "billing".

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### examples

The set of example scenarios that the skill can perform.

Will be used by the client as a hint to understand how the skill can be used. (e.g. "I need a recipe for bread")

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]\]

##### input\_modes

Supported mime types for input data.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### output\_modes

Supported mime types for output data.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

This module contains the schema for the agent card.

### AgentCard

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

The card that describes an agent.

#### Attributes

##### name

Human readable name of the agent e.g. "Recipe Agent".

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### description

A human-readable description of the agent.

Used to assist users and other agents in understanding what the agent can do. (e.g. "Agent that helps users with recipes and cooking.")

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### url

A URL to the address the agent is hosted at.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### version

The version of the agent - format is up to the provider. (e.g. "1.0.0")

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### protocol\_version

The version of the A2A protocol this agent supports.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider

The service provider of the agent.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[`AgentProvider`\]

##### documentation\_url

A URL to documentation for the agent.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### icon\_url

A URL to an icon for the agent.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### preferred\_transport

The transport of the preferred endpoint. If empty, defaults to JSONRPC.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### additional\_interfaces

Announcement of additional supported transports.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[`AgentInterface`\]\]

##### capabilities

The capabilities of the agent.

**Type:** `AgentCapabilities`

##### security

Security requirements for contacting the agent.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]\]\]\]

##### security\_schemes

Security scheme definitions.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `SecurityScheme`\]\]

##### default\_input\_modes

Supported mime types for input data.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### default\_output\_modes

Supported mime types for output data.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### skills

The set of skills, or distinct capabilities, that the agent can perform.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`Skill`\]

### AgentProvider

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

The service provider of the agent.

#### Attributes

##### organization

The name of the agent provider's organization.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### url

A URL for the agent provider's website or relevant documentation.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

### AgentCapabilities

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

The capabilities of the agent.

#### Attributes

##### streaming

Whether the agent supports streaming.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`bool`](https://docs.python.org/3/library/functions.html#bool)\]

##### push\_notifications

Whether the agent can notify updates to client.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`bool`](https://docs.python.org/3/library/functions.html#bool)\]

##### state\_transition\_history

Whether the agent exposes status change history for tasks.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`bool`](https://docs.python.org/3/library/functions.html#bool)\]

### HttpSecurityScheme

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

HTTP security scheme.

#### Attributes

##### type

The type of the security scheme. Must be 'http'.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['http'\]

##### scheme

The name of the HTTP Authorization scheme.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### bearer\_format

A hint to the client to identify how the bearer token is formatted.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### description

Description of this security scheme.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

### ApiKeySecurityScheme

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

API Key security scheme.

#### Attributes

##### type

The type of the security scheme. Must be 'apiKey'.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['apiKey'\]

##### name

The name of the header, query or cookie parameter to be used.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### in\_

The location of the API key.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['query', 'header', 'cookie'\]

##### description

Description of this security scheme.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

### OAuth2SecurityScheme

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

OAuth2 security scheme.

#### Attributes

##### type

The type of the security scheme. Must be 'oauth2'.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['oauth2'\]

##### flows

An object containing configuration information for the flow types supported.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

##### description

Description of this security scheme.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

### OpenIdConnectSecurityScheme

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

OpenID Connect security scheme.

#### Attributes

##### type

The type of the security scheme. Must be 'openIdConnect'.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['openIdConnect'\]

##### open\_id\_connect\_url

OpenId Connect URL to discover OAuth2 configuration values.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### description

Description of this security scheme.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

### AgentInterface

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

An interface that the agent supports.

#### Attributes

##### transport

The transport protocol (e.g., 'jsonrpc', 'websocket').

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### url

The URL endpoint for this transport.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### description

Description of this interface.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

### AgentExtension

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

A declaration of an extension supported by an Agent.

#### Attributes

##### uri

The URI of the extension.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### description

A description of how this agent uses this extension.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### required

Whether the client must follow specific requirements of the extension.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`bool`](https://docs.python.org/3/library/functions.html#bool)\]

##### params

Optional configuration for the extension.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]

### Skill

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Skills are a unit of capability that an agent can perform.

#### Attributes

##### id

A unique identifier for the skill.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### name

Human readable name of the skill.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### description

A human-readable description of the skill.

It will be used by the client or a human as a hint to understand the skill.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### tags

Set of tag-words describing classes of capabilities for this specific skill.

Examples: "cooking", "customer support", "billing".

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### examples

The set of example scenarios that the skill can perform.

Will be used by the client as a hint to understand how the skill can be used. (e.g. "I need a recipe for bread")

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]\]

##### input\_modes

Supported mime types for input data.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### output\_modes

Supported mime types for output data.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

### Artifact

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Agents generate Artifacts as an end result of a Task.

Artifacts are immutable, can be named, and can have multiple parts. A streaming response can append parts to existing Artifacts.

A single Task can generate many Artifacts. For example, "create a webpage" could create separate HTML and image Artifacts.

#### Attributes

##### artifact\_id

Unique identifier for the artifact.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### name

The name of the artifact.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### description

A description of the artifact.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### parts

The parts that make up the artifact.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`Part`\]

##### metadata

Metadata about the artifact.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]

##### extensions

Array of extensions.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]\]

##### append

Whether to append this artifact to an existing one.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`bool`](https://docs.python.org/3/library/functions.html#bool)\]

##### last\_chunk

Whether this is the last chunk of the artifact.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`bool`](https://docs.python.org/3/library/functions.html#bool)\]

### PushNotificationConfig

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Configuration for push notifications.

A2A supports a secure notification mechanism whereby an agent can notify a client of an update outside a connected session via a PushNotificationService. Within and across enterprises, it is critical that the agent verifies the identity of the notification service, authenticates itself with the service, and presents an identifier that ties the notification to the executing Task.

The target server of the PushNotificationService should be considered a separate service, and is not guaranteed (or even expected) to be the client directly. This PushNotificationService is responsible for authenticating and authorizing the agent and for proxying the verified notification to the appropriate endpoint (which could be anything from a pub/sub queue, to an email inbox or other service, etc.).

For contrived scenarios with isolated client-agent pairs (e.g. local service mesh in a contained VPC, etc.) or isolated environments without enterprise security concerns, the client may choose to simply open a port and act as its own PushNotificationService. Any enterprise implementation will likely have a centralized service that authenticates the remote agents with trusted notification credentials and can handle online/offline scenarios. (This should be thought of similarly to a mobile Push Notification Service).

#### Attributes

##### id

Server-assigned identifier.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### url

The URL to send push notifications to.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### token

Token unique to this task/session.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### authentication

Authentication details for push notifications.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[`SecurityScheme`\]

### TaskPushNotificationConfig

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Configuration for task push notifications.

#### Attributes

##### id

The task id.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### push\_notification\_config

The push notification configuration.

**Type:** `PushNotificationConfig`

### Message

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

A Message contains any content that is not an Artifact.

This can include things like agent thoughts, user context, instructions, errors, status, or metadata.

All content from a client comes in the form of a Message. Agents send Messages to communicate status or to provide instructions (whereas generated results are sent as Artifacts).

A Message can have multiple parts to denote different pieces of content. For example, a user request could include a textual description from a user and then multiple files used as context from the client.

#### Attributes

##### role

The role of the message.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['user', 'agent'\]

##### parts

The parts of the message.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`Part`\]

##### kind

Event type.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['message'\]

##### metadata

Metadata about the message.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]

##### message\_id

Identifier created by the message creator.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### context\_id

The context the message is associated with.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### task\_id

Identifier of task the message is related to.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### reference\_task\_ids

Array of task IDs this message references.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]\]

##### extensions

Array of extensions.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]\]

### TextPart

**Bases:** `_BasePart`

A part that contains text.

#### Attributes

##### kind

The kind of the part.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['text'\]

##### text

The text of the part.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

### FileWithBytes

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

File with base64 encoded data.

#### Attributes

##### bytes

The base64 encoded content of the file.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### mime\_type

Optional mime type for the file.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

### FileWithUri

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

File with URI reference.

#### Attributes

##### uri

The URI of the file.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### mime\_type

The mime type of the file.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

### FilePart

**Bases:** `_BasePart`

A part that contains a file.

#### Attributes

##### kind

The kind of the part.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['file'\]

##### file

The file content - either bytes or URI.

**Type:** `FileWithBytes` | `FileWithUri`

### DataPart

**Bases:** `_BasePart`

A part that contains structured data.

#### Attributes

##### kind

The kind of the part.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['data'\]

##### data

The data of the part.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

### TaskStatus

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Status and accompanying message for a task.

#### Attributes

##### state

The current state of the task.

**Type:** `TaskState`

##### message

Additional status updates for client.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[`Message`\]

##### timestamp

ISO datetime value of when the status was updated.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

### Task

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

A Task is a stateful entity that allows Clients and Remote Agents to achieve a specific outcome.

Clients and Remote Agents exchange Messages within a Task. Remote Agents generate results as Artifacts. A Task is always created by a Client and the status is always determined by the Remote Agent.

#### Attributes

##### id

Unique identifier for the task.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### context\_id

The context the task is associated with.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### kind

Event type.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['task'\]

##### status

Current status of the task.

**Type:** `TaskStatus`

##### history

Optional history of messages.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[`Message`\]\]

##### artifacts

Collection of artifacts created by the agent.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[`Artifact`\]\]

##### metadata

Extension metadata.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]

### TaskStatusUpdateEvent

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Sent by server during message/stream requests.

#### Attributes

##### task\_id

The id of the task.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### context\_id

The context the task is associated with.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### kind

Event type.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['status-update'\]

##### status

The status of the task.

**Type:** `TaskStatus`

##### final

Indicates the end of the event stream.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### metadata

Extension metadata.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]

### TaskArtifactUpdateEvent

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Sent by server during message/stream requests.

#### Attributes

##### task\_id

The id of the task.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### context\_id

The context the task is associated with.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### kind

Event type identification.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['artifact-update'\]

##### artifact

The artifact that was updated.

**Type:** `Artifact`

##### append

Whether to append to existing artifact (true) or replace (false).

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`bool`](https://docs.python.org/3/library/functions.html#bool)\]

##### last\_chunk

Indicates this is the final chunk of the artifact.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`bool`](https://docs.python.org/3/library/functions.html#bool)\]

##### metadata

Extension metadata.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]

### TaskIdParams

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Parameters for a task id.

#### Attributes

##### id

The unique identifier for the task.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### metadata

Optional metadata associated with the request.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]

### TaskQueryParams

**Bases:** `TaskIdParams`

Query parameters for a task.

#### Attributes

##### history\_length

Number of recent messages to be retrieved.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`int`](https://docs.python.org/3/library/functions.html#int)\]

### MessageSendConfiguration

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Configuration for the send message request.

#### Attributes

##### accepted\_output\_modes

Accepted output modalities by the client.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### blocking

If the server should treat the client as a blocking request.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`bool`](https://docs.python.org/3/library/functions.html#bool)\]

##### history\_length

Number of recent messages to be retrieved.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`int`](https://docs.python.org/3/library/functions.html#int)\]

##### push\_notification\_config

Where the server should send notifications when disconnected.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[`PushNotificationConfig`\]

### MessageSendParams

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Parameters for message/send method.

#### Attributes

##### configuration

Send message configuration.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[`MessageSendConfiguration`\]

##### message

The message being sent to the server.

**Type:** `Message`

##### metadata

Extension metadata.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]

### TaskSendParams

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Internal parameters for task execution within the framework.

Note: This is not part of the A2A protocol - it's used internally for broker/worker communication.

#### Attributes

##### id

The id of the task.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### context\_id

The context id for the task.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### message

The message to process.

**Type:** `Message`

##### history\_length

Number of recent messages to be retrieved.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`int`](https://docs.python.org/3/library/functions.html#int)\]

##### metadata

Extension metadata.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]

### ListTaskPushNotificationConfigParams

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Parameters for getting list of pushNotificationConfigurations associated with a Task.

#### Attributes

##### id

Task id.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### metadata

Extension metadata.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]

### DeleteTaskPushNotificationConfigParams

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Parameters for removing pushNotificationConfiguration associated with a Task.

#### Attributes

##### id

Task id.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### push\_notification\_config\_id

The push notification config id to delete.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### metadata

Extension metadata.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]

### JSONRPCMessage

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

A JSON RPC message.

#### Attributes

##### jsonrpc

The JSON RPC version.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['2.0'\]

##### id

The request id.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

### JSONRPCRequest

**Bases:** `JSONRPCMessage`, `Generic[Method, Params]`

A JSON RPC request.

#### Attributes

##### method

The method to call.

**Type:** `Method`

##### params

The parameters to pass to the method.

**Type:** `Params`

### JSONRPCError

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict), `Generic[CodeT, MessageT]`

A JSON RPC error.

#### Attributes

##### code

A number that indicates the error type that occurred.

**Type:** `CodeT`

##### message

A string providing a short description of the error.

**Type:** `MessageT`

##### data

A primitive or structured value containing additional information about the error.

**Type:** [`NotRequired`](https://docs.python.org/3/library/typing.html#typing.NotRequired)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

### JSONRPCResponse

**Bases:** `JSONRPCMessage`, `Generic[ResultT, ErrorT]`

A JSON RPC response.

### SecurityScheme

A security scheme for authentication.

**Default:** `Annotated[Union[HttpSecurityScheme, ApiKeySecurityScheme, OAuth2SecurityScheme, OpenIdConnectSecurityScheme], pydantic.Field(discriminator='type')]`

### Part

A fully formed piece of content exchanged between a client and a remote agent as part of a Message or an Artifact.

Each Part has its own content type and metadata.

**Default:** `Annotated[Union[TextPart, FilePart, DataPart], pydantic.Field(discriminator='kind')]`

### TaskState

The possible states of a task.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Literal['submitted', 'working', 'input-required', 'completed', 'canceled', 'failed', 'rejected', 'auth-required', 'unknown']`

### JSONParseError

A JSON RPC error for a parse error.

**Default:** `JSONRPCError[Literal[-32700], Literal['Invalid JSON payload']]`

### InvalidRequestError

A JSON RPC error for an invalid request.

**Default:** `JSONRPCError[Literal[-32600], Literal['Request payload validation error']]`

### MethodNotFoundError

A JSON RPC error for a method not found.

**Default:** `JSONRPCError[Literal[-32601], Literal['Method not found']]`

### InvalidParamsError

A JSON RPC error for invalid parameters.

**Default:** `JSONRPCError[Literal[-32602], Literal['Invalid parameters']]`

### InternalError

A JSON RPC error for an internal error.

**Default:** `JSONRPCError[Literal[-32603], Literal['Internal error']]`

### TaskNotFoundError

A JSON RPC error for a task not found.

**Default:** `JSONRPCError[Literal[-32001], Literal['Task not found']]`

### TaskNotCancelableError

A JSON RPC error for a task not cancelable.

**Default:** `JSONRPCError[Literal[-32002], Literal['Task not cancelable']]`

### PushNotificationNotSupportedError

A JSON RPC error for a push notification not supported.

**Default:** `JSONRPCError[Literal[-32003], Literal['Push notification not supported']]`

### UnsupportedOperationError

A JSON RPC error for an unsupported operation.

**Default:** `JSONRPCError[Literal[-32004], Literal['This operation is not supported']]`

### ContentTypeNotSupportedError

A JSON RPC error for incompatible content types.

**Default:** `JSONRPCError[Literal[-32005], Literal['Incompatible content types']]`

### InvalidAgentResponseError

A JSON RPC error for invalid agent response.

**Default:** `JSONRPCError[Literal[-32006], Literal['Invalid agent response']]`

### SendMessageRequest

A JSON RPC request to send a message.

**Default:** `JSONRPCRequest[Literal['message/send'], MessageSendParams]`

### SendMessageResponse

A JSON RPC response to send a message.

**Default:** `JSONRPCResponse[Union[Task, Message], JSONRPCError[Any, Any]]`

### StreamMessageRequest

A JSON RPC request to stream a message.

**Default:** `JSONRPCRequest[Literal['message/stream'], MessageSendParams]`

### StreamMessageResponse

A JSON RPC response to a StreamMessageRequest.

**Default:** `JSONRPCResponse[Union[Task, Message, TaskStatusUpdateEvent, TaskArtifactUpdateEvent], JSONRPCError[Any, Any]]`

### GetTaskRequest

A JSON RPC request to get a task.

**Default:** `JSONRPCRequest[Literal['tasks/get'], TaskQueryParams]`

### GetTaskResponse

A JSON RPC response to get a task.

**Default:** `JSONRPCResponse[Task, TaskNotFoundError]`

### CancelTaskRequest

A JSON RPC request to cancel a task.

**Default:** `JSONRPCRequest[Literal['tasks/cancel'], TaskIdParams]`

### CancelTaskResponse

A JSON RPC response to cancel a task.

**Default:** `JSONRPCResponse[Task, Union[TaskNotCancelableError, TaskNotFoundError]]`

### SetTaskPushNotificationRequest

A JSON RPC request to set a task push notification.

**Default:** `JSONRPCRequest[Literal['tasks/pushNotification/set'], TaskPushNotificationConfig]`

### SetTaskPushNotificationResponse

A JSON RPC response to set a task push notification.

**Default:** `JSONRPCResponse[TaskPushNotificationConfig, PushNotificationNotSupportedError]`

### GetTaskPushNotificationRequest

A JSON RPC request to get a task push notification.

**Default:** `JSONRPCRequest[Literal['tasks/pushNotification/get'], TaskIdParams]`

### GetTaskPushNotificationResponse

A JSON RPC response to get a task push notification.

**Default:** `JSONRPCResponse[TaskPushNotificationConfig, PushNotificationNotSupportedError]`

### ResubscribeTaskRequest

A JSON RPC request to resubscribe to a task.

**Default:** `JSONRPCRequest[Literal['tasks/resubscribe'], TaskIdParams]`

### ListTaskPushNotificationConfigRequest

A JSON RPC request to list task push notification configs.

**Default:** `JSONRPCRequest[Literal['tasks/pushNotificationConfig/list'], ListTaskPushNotificationConfigParams]`

### DeleteTaskPushNotificationConfigRequest

A JSON RPC request to delete a task push notification config.

**Default:** `JSONRPCRequest[Literal['tasks/pushNotificationConfig/delete'], DeleteTaskPushNotificationConfigParams]`

### A2ARequest

A JSON RPC request to the A2A server.

**Default:** `Annotated[Union[SendMessageRequest, StreamMessageRequest, GetTaskRequest, CancelTaskRequest, SetTaskPushNotificationRequest, GetTaskPushNotificationRequest, ResubscribeTaskRequest, ListTaskPushNotificationConfigRequest, DeleteTaskPushNotificationConfigRequest], Discriminator('method')]`

### A2AResponse

A JSON RPC response from the A2A server.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Union[SendMessageResponse, StreamMessageResponse, GetTaskResponse, CancelTaskResponse, SetTaskPushNotificationResponse, GetTaskPushNotificationResponse]`

### A2AClient

A client for the A2A protocol.

#### Methods

##### send\_message

`@async`

```python
def send_message(
    message: Message,
    metadata: dict[str, Any] | None = None,
    configuration: MessageSendConfiguration | None = None,
) -> SendMessageResponse
```

Send a message using the A2A protocol.

Returns a JSON-RPC response containing either a result (Task) or an error.

###### Returns

`SendMessageResponse`

### UnexpectedResponseError

**Bases:** [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception)

An error raised when an unexpected response is received from the server.

---

# [pydantic_ai.models.anthropic](https://pydantic.dev/docs/ai/api/models/anthropic/)

# pydantic\_ai.models.anthropic

## Setup

For details on how to set up authentication with this model, see [model configuration for Anthropic](/docs/ai/models/anthropic).

### AnthropicModelSettings

**Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings)

Settings used for an Anthropic model request.

#### Attributes

##### anthropic\_metadata

An object describing metadata about the request.

Contains `user_id`, an external identifier for the user who is associated with the request.

**Type:** `BetaMetadataParam`

##### anthropic\_thinking

Determine whether the model should generate a thinking block.

See [the Anthropic docs](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking) for more information.

**Type:** `BetaThinkingConfigParam`

##### anthropic\_cache\_tool\_definitions

Whether to add `cache_control` to the last tool definition.

When enabled, the last tool in the `tools` array will have `cache_control` set, allowing Anthropic to cache tool definitions and reduce costs. If `True`, uses TTL='5m'. You can also specify '5m' or '1h' directly. See [https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching) for more information.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['5m', '1h'\]

##### anthropic\_service\_tier

The service tier to use for the model request.

See [https://docs.anthropic.com/en/docs/build-with-claude/latency-and-throughput](https://docs.anthropic.com/en/docs/build-with-claude/latency-and-throughput) for more information.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['auto', 'standard\_only'\]

##### anthropic\_cache\_instructions

Whether to add `cache_control` to the last system prompt block.

When enabled, the last system prompt will have `cache_control` set, allowing Anthropic to cache system instructions and reduce costs. If `True`, uses TTL='5m'. You can also specify '5m' or '1h' directly. See [https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching) for more information.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['5m', '1h'\]

##### anthropic\_cache\_messages

Whether to add `cache_control` to the last message content block.

This is an alternative to `anthropic_cache` for Anthropic-compatible gateways and proxies that accept the Anthropic message format but don't support the top-level automatic caching parameter.

If `True`, uses TTL='5m'. You can also specify '5m' or '1h' directly. Cannot be combined with `anthropic_cache`.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['5m', '1h'\]

##### anthropic\_cache

Enable prompt caching for multi-turn conversations.

Passes a top-level `cache_control` parameter so the server automatically applies a cache breakpoint to the last cacheable block and moves it forward as conversations grow.

On Bedrock and Vertex, automatic caching is not yet supported, so this falls back to per-block caching on the last user message. If the last content block already has `cache_control` from an explicit `CachePoint`, it is preserved.

If `True`, uses TTL='5m'. You can also specify '5m' or '1h' directly.

This can be combined with explicit cache breakpoints (`anthropic_cache_instructions`, `anthropic_cache_tool_definitions`, `CachePoint`). The automatic breakpoint counts as 1 of Anthropic's 4 cache point slots; we automatically trim excess explicit breakpoints. See [https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#automatic-caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#automatic-caching) for more information.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['5m', '1h'\]

##### anthropic\_effort

The effort level for the model to use when generating a response.

See [the Anthropic docs](https://docs.anthropic.com/en/docs/build-with-claude/effort) for more information.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['low', 'medium', 'high', 'xhigh', 'max'\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### anthropic\_task\_budget

Task budget configuration for Claude Opus 4.7 beta requests.

Maps to `output_config.task_budget`. This setting is currently only supported on `claude-opus-4-7`, and Pydantic AI automatically enables Anthropic's required task-budget beta when it is present.

Omit `remaining` unless you are intentionally carrying a budget across compaction or other rewritten context.

**Type:** `AnthropicTaskBudget`

##### anthropic\_container

Container configuration for multi-turn conversations.

By default, if previous messages contain a container\_id (from a prior response), it will be reused automatically.

Set to `False` to force a fresh container (ignore any `container_id` from history). Set to a container id string (e.g. `'container_xxx'`) to explicitly reuse a container, or to a `BetaContainerParams` dict (e.g. `{'skills': [...]}` or `{'id': 'container_xxx', 'skills': [...]}`) when passing Skills to the Anthropic Skills beta.

**Type:** `BetaContainerParams` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\[[`False`](https://docs.python.org/3/library/constants.html#False)\]

##### anthropic\_code\_execution\_tool\_version

Which Anthropic code execution tool version to send for `CodeExecutionTool`.

Defaults to `'auto'`, which uses the default version from the model profile: `'20260120'` for Sonnet 4.5+ and Opus 4.5+, otherwise `'20250825'`. Set a concrete version to force that tool version; a `UserError` is raised if the selected model profile does not support that version.

**Type:** `AnthropicCodeExecutionToolVersion` | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['auto'\]

##### anthropic\_eager\_input\_streaming

Whether to enable eager input streaming on tool definitions.

When enabled, all tool definitions will have `eager_input_streaming` set to `True`, allowing Anthropic to stream tool call arguments incrementally instead of buffering the entire JSON before streaming. This reduces latency for tool calls with large inputs. See [https://platform.claude.com/docs/en/agents-and-tools/tool-use/fine-grained-tool-streaming](https://platform.claude.com/docs/en/agents-and-tools/tool-use/fine-grained-tool-streaming) for more information.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### anthropic\_betas

List of Anthropic beta features to enable for API requests.

Each item can be a known beta name (e.g. 'interleaved-thinking-2025-05-14') or a custom string. Merged with auto-added betas (e.g. builtin tools) and any betas from extra\_headers\['anthropic-beta'\]. See the Anthropic docs for available beta features.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`AnthropicBetaParam`\]

##### anthropic\_speed

The inference speed mode for this request.

`'fast'` enables high output-tokens-per-second inference for supported models (currently Claude Opus 4.6 only). On unsupported models or clients, `anthropic_speed='fast'` is ignored with a `UserWarning`. Fast mode is a research preview and only available on the direct Anthropic API (not Bedrock, Vertex, or Foundry); see [the Anthropic docs](https://platform.claude.com/docs/en/build-with-claude/fast-mode) for details. Note: switching between `'fast'` and `'standard'` invalidates the prompt cache.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['standard', 'fast'\]

##### anthropic\_context\_management

Context management configuration for automatic compaction.

When configured, Anthropic will automatically compact older context when the input token count exceeds the configured threshold. The compaction produces a summary that replaces the compacted messages.

See [the Anthropic docs](https://docs.anthropic.com/en/docs/build-with-claude/compaction) for more details.

**Type:** `BetaContextManagementConfigParam`

### AnthropicModel

**Bases:** `Model[AsyncAnthropicClient]`

A model that uses the Anthropic API.

Internally, this uses the [Anthropic Python client](https://github.com/anthropics/anthropic-sdk-python) to interact with the API.

Apart from `__init__`, all methods are private or match those of the base class.

#### Attributes

##### model\_name

The model name.

**Type:** `AnthropicModelName`

##### system

The model provider.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model_name: AnthropicModelName,
    provider: Literal['anthropic', 'gateway'] | Provider[AsyncAnthropicClient] = 'anthropic',
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
)
```

Initialize an Anthropic model.

###### Parameters

**`model_name`** : `AnthropicModelName`

The name of the Anthropic model to use. List of model names available [here](https://docs.anthropic.com/en/docs/about-claude/models).

**`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['anthropic', 'gateway'\] | `Provider`\[`AsyncAnthropicClient`\] _Default:_ `'anthropic'`

The provider to use for the Anthropic API. Can be either the string 'anthropic' or an instance of `Provider[AsyncAnthropicClient]`. Defaults to 'anthropic'.

**`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model profile to use. Defaults to a profile picked by the provider based on the model name. The default 'anthropic' provider will use the default `..profiles.anthropic.anthropic_model_profile`.

**`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Default model settings for this model instance.

##### supported\_builtin\_tools

`@classmethod`

```python
def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]
```

The set of builtin tool types this model can handle.

###### Returns

[`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\]

### AnthropicCompaction

**Bases:** `AbstractCapability[AgentDepsT]`

Compaction capability for Anthropic models.

Configures automatic context management via Anthropic's `context_management` API parameter. Compaction triggers server-side when input tokens exceed the configured threshold.

Example usage::

from pydantic\_ai import Agent from pydantic\_ai.models.anthropic import AnthropicCompaction

agent = Agent( 'anthropic:claude-sonnet-4-6', capabilities=\[AnthropicCompaction(token\_threshold=100\_000)\], )

#### Methods

##### \_\_init\_\_

```python
def __init__(
    token_threshold: int = 150000,
    instructions: str | None = None,
    pause_after_compaction: bool = False,
) -> None
```

Initialize the Anthropic compaction capability.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`token_threshold`** : [`int`](https://docs.python.org/3/library/functions.html#int) _Default:_ `150000`

Compact when input tokens exceed this threshold. Minimum 50,000.

**`instructions`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom instructions for the compaction summarization.

**`pause_after_compaction`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

If `True`, the response will stop after the compaction block with `stop_reason='compaction'`, allowing explicit handling.

### AnthropicStreamedResponse

**Bases:** `StreamedResponse`

Implementation of `StreamedResponse` for Anthropic models.

#### Attributes

##### model\_name

Get the model name of the response.

**Type:** `AnthropicModelName`

##### provider\_name

Get the provider name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_url

Get the provider base URL.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### timestamp

Get the timestamp of the response.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime)

### LatestAnthropicModelNames

Anthropic model names from the installed SDK.

**Default:** `ModelParam`

### AnthropicModelName

Possible Anthropic model names.

The installed Anthropic SDK exposes the current literal set and still allows arbitrary string model names. See [the Anthropic docs](https://docs.anthropic.com/en/docs/about-claude/models) for a full list.

**Default:** `LatestAnthropicModelNames`

### AnthropicTaskBudget

Anthropic task budget payload for `output_config.task_budget`.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `BetaTokenTaskBudgetParam`

---

# [pydantic_ai.models](https://pydantic.dev/docs/ai/api/models/base/)

# pydantic\_ai.models

Logic related to making requests to an LLM.

The aim here is to make a common interface for different LLMs, so that the rest of the code can be agnostic to the specific LLM being used.

### ModelRequestParameters

Configuration for an agent's request to a model, specifically related to tools and output handling.

#### Attributes

##### instruction\_parts

Structured instruction parts with metadata about their origin (static vs dynamic).

Static instructions (`dynamic=False`) come from literal strings passed to `Agent(instructions=...)`. Dynamic instructions (`dynamic=True`) come from `@agent.instructions` functions, `TemplateStr`, or toolset `get_instructions()` methods.

Models that support granular caching (e.g. Anthropic, Bedrock) use this to place cache boundaries at the static/dynamic instruction boundary.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`InstructionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.InstructionPart)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### thinking

Resolved thinking/reasoning configuration for this request.

`None` means the model should use its default behavior. Set by the base `Model.prepare_request()` from the unified `thinking` field in `ModelSettings`, after checking that the model's profile supports thinking.

**Type:** `ThinkingLevel` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

#### Methods

##### with\_default\_output\_mode

```python
def with_default_output_mode(
    output_mode: StructuredOutputMode,
) -> ModelRequestParameters
```

Set the default output mode if the current mode is 'auto', atomically updating allow\_text\_output.

No-op if the current output\_mode is not 'auto'. This ensures the two fields stay in sync -- output\_mode='tool' implies allow\_text\_output=False, while 'native' and 'prompted' imply allow\_text\_output=True.

###### Returns

`ModelRequestParameters`

### Model

**Bases:** `ABC`, `Generic[InterfaceClient]`

Abstract class for a model.

#### Attributes

##### provider

The provider for this model, if any.

**Type:** `Provider`\[`InterfaceClient`\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### settings

Get the model settings.

**Type:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### model\_name

The model name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### model\_id

The fully qualified model name in `'provider:model_name'` format.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### label

Human-friendly display label for the model.

Handles common patterns:

-   gpt-5 -> GPT 5
-   claude-sonnet-4-5 -> Claude Sonnet 4.5
-   gemini-2.5-pro -> Gemini 2.5 Pro
-   meta-llama/llama-3-70b -> Llama 3 70b (OpenRouter style)

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### profile

The model profile.

We use this to compute the intersection of the profile's supported\_builtin\_tools and the model's implemented tools, ensuring model.profile.supported\_builtin\_tools is the single source of truth for what builtin tools are actually usable.

**Type:** [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile)

##### system

The model provider, ex: openai.

Use to populate the `gen_ai.system` OpenTelemetry semantic convention attribute, so should use well-known values listed in [https://opentelemetry.io/docs/specs/semconv/attributes-registry/gen-ai/#gen-ai-system](https://opentelemetry.io/docs/specs/semconv/attributes-registry/gen-ai/#gen-ai-system) when applicable.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### base\_url

The base URL for the provider API, if available.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    settings: ModelSettings | None = None,
    profile: ModelProfileSpec | None = None,
) -> None
```

Initialize the model with optional settings and profile.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Model-specific settings that will be used as defaults for this model.

**`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model profile to use.

##### \_\_aenter\_\_

`@async`

```python
def __aenter__() -> Self
```

Enter the model context, delegating to the provider to manage its HTTP client lifecycle.

###### Returns

[`Self`](https://docs.python.org/3/library/typing.html#typing.Self)

##### \_\_aexit\_\_

`@async`

```python
def __aexit__(
    exc_type: type[BaseException] | None,
    exc_val: BaseException | None,
    exc_tb: TracebackType | None,
) -> bool | None
```

Exit the model context, closing the provider's HTTP client if it owns one.

###### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### request

`@abstractmethod`

`@async`

```python
def request(
    messages: list[ModelMessage],
    model_settings: ModelSettings | None,
    model_request_parameters: ModelRequestParameters,
) -> ModelResponse
```

Make a request to the model.

This is ultimately called by `pydantic_ai._agent_graph.ModelRequestNode._make_request(...)`.

###### Returns

[`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse)

##### count\_tokens

`@async`

```python
def count_tokens(
    messages: list[ModelMessage],
    model_settings: ModelSettings | None,
    model_request_parameters: ModelRequestParameters,
) -> RequestUsage
```

Make a request to the model for counting tokens.

###### Returns

[`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage)

##### compact\_messages

`@async`

```python
def compact_messages(
    request_context: ModelRequestContext,
    instructions: str | None = None,
) -> ModelResponse
```

Compact messages to reduce conversation context size.

This method is optional and only supported by specific providers (e.g. OpenAI Responses API). Providers that support compaction override this method with their implementation.

###### Returns

[`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse)

##### request\_stream

`@async`

```python
def request_stream(
    messages: list[ModelMessage],
    model_settings: ModelSettings | None,
    model_request_parameters: ModelRequestParameters,
    run_context: RunContext[Any] | None = None,
) -> AsyncIterator[StreamedResponse]
```

Make a request to the model and return a streaming response.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`StreamedResponse`\]

##### customize\_request\_parameters

```python
def customize_request_parameters(
    model_request_parameters: ModelRequestParameters,
) -> ModelRequestParameters
```

Customize the request parameters for the model.

This method can be overridden by subclasses to modify the request parameters before sending them to the model. In particular, this method can be used to make modifications to the generated tool JSON schemas if necessary for vendor/model-specific reasons.

###### Returns

`ModelRequestParameters`

##### prepare\_request

```python
def prepare_request(
    model_settings: ModelSettings | None,
    model_request_parameters: ModelRequestParameters,
) -> tuple[ModelSettings | None, ModelRequestParameters]
```

Prepare request inputs before they are passed to the provider.

This merges the given `model_settings` with the model's own `settings` attribute and ensures `customize_request_parameters` is applied to the resolved [`ModelRequestParameters`](/docs/ai/api/models/base/#pydantic_ai.models.ModelRequestParameters). Subclasses can override this method if they need to customize the preparation flow further, but most implementations should simply call `self.prepare_request(...)` at the start of their `request` (and related) methods.

###### Returns

[`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[[`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None), `ModelRequestParameters`\]

##### supported\_builtin\_tools

`@classmethod`

```python
def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]
```

Return the set of builtin tool types this model class can handle.

Subclasses should override this to reflect their actual capabilities. Default is empty set - subclasses must explicitly declare support.

###### Returns

[`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\]

### StreamedResponse

**Bases:** `ABC`

Streamed response from an LLM when calling a tool.

#### Attributes

##### model\_name

Get the model name of the response.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_name

Get the provider name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### provider\_url

Get the provider base URL.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### timestamp

Get the timestamp of the response.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime)

##### cancelled

Whether the stream has been cancelled via `cancel()`.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

#### Methods

##### \_\_aiter\_\_

```python
def __aiter__() -> AsyncIterator[ModelResponseStreamEvent]
```

Stream the response as an async iterable of [`ModelResponseStreamEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponseStreamEvent)s.

This proxies the `_event_iterator()` and emits all events, while also checking for matches on the result schema and emitting a [`FinalResultEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FinalResultEvent) if/when the first match is found.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`ModelResponseStreamEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponseStreamEvent)\]

##### cancel

`@async`

```python
def cancel() -> None
```

Cancel the stream, stopping token generation.

Sets `self._cancelled = True` before delegating to `close_stream()` so the flag is visible to any iterator that observes the transport error raised when the underlying connection is torn down, even if `close_stream()` itself raises.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

##### get\_stream\_cancel\_errors

```python
def get_stream_cancel_errors() -> tuple[type[BaseException], ...]
```

Return transport errors caused by `cancel()` tearing down the stream.

The default covers model classes whose SDKs iterate `httpx` responses directly (Anthropic, OpenAI, Groq, Mistral, Google GenAI, HuggingFace, and the custom Gemini client), since they let bare `httpx` errors propagate from chunk reads. Model classes that use other transports (for example gRPC or botocore) should override this method.

###### Returns

[`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[[`BaseException`](https://docs.python.org/3/library/exceptions.html#BaseException)\], ...\]

##### close\_stream

`@async`

```python
def close_stream() -> None
```

Close the underlying HTTP/gRPC connection.

Model classes must override this to stop token generation (and billing) on the remote side. Integrations that cannot support cancellation should leave the default implementation so `cancel()` fails clearly rather than silently reporting successful cancellation while generation continues.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

##### get

```python
def get() -> ModelResponse
```

Build a [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) from the data received from the stream so far.

###### Returns

[`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse)

##### usage

```python
def usage() -> RequestUsage
```

Get the usage of the response so far. This will not be the final usage until the stream is exhausted.

###### Returns

[`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage)

### check\_allow\_model\_requests

```python
def check_allow_model_requests() -> None
```

Check if model requests are allowed.

If you're defining your own models that have costs or latency associated with their use, you should call this in [`Model.request`](/docs/ai/api/models/base/#pydantic_ai.models.Model.request) and [`Model.request_stream`](/docs/ai/api/models/base/#pydantic_ai.models.Model.request_stream).

#### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

#### Raises

-   `RuntimeError` -- If model requests are not allowed.

### override\_allow\_model\_requests

```python
def override_allow_model_requests(allow_model_requests: bool) -> Iterator[None]
```

Context manager to temporarily override [`ALLOW_MODEL_REQUESTS`](/docs/ai/api/models/base/#pydantic_ai.models.ALLOW_MODEL_REQUESTS).

#### Returns

[`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]

#### Parameters

**`allow_model_requests`** : [`bool`](https://docs.python.org/3/library/functions.html#bool)

Whether to allow model requests within the context.

### KnownModelName

Known model names that can be used with the `model` parameter of [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent).

`KnownModelName` is provided as a concise way to specify a model.

**Default:** `TypeAliasType('KnownModelName', Literal['anthropic:claude-3-haiku-20240307', 'anthropic:claude-haiku-4-5-20251001', 'anthropic:claude-mythos-preview', 'anthropic:claude-haiku-4-5', 'anthropic:claude-opus-4-0', 'anthropic:claude-opus-4-1', 'anthropic:claude-opus-4-1-20250805', 'anthropic:claude-opus-4-20250514', 'anthropic:claude-opus-4-5-20251101', 'anthropic:claude-opus-4-5', 'anthropic:claude-opus-4-6', 'anthropic:claude-opus-4-7', 'anthropic:claude-sonnet-4-0', 'anthropic:claude-sonnet-4-20250514', 'anthropic:claude-sonnet-4-5-20250929', 'anthropic:claude-sonnet-4-5', 'anthropic:claude-sonnet-4-6', 'bedrock:amazon.titan-text-express-v1', 'bedrock:amazon.titan-text-lite-v1', 'bedrock:amazon.titan-tg1-large', 'bedrock:anthropic.claude-3-5-haiku-20241022-v1:0', 'bedrock:anthropic.claude-3-5-sonnet-20240620-v1:0', 'bedrock:anthropic.claude-3-5-sonnet-20241022-v2:0', 'bedrock:anthropic.claude-3-7-sonnet-20250219-v1:0', 'bedrock:anthropic.claude-3-haiku-20240307-v1:0', 'bedrock:anthropic.claude-3-opus-20240229-v1:0', 'bedrock:anthropic.claude-3-sonnet-20240229-v1:0', 'bedrock:anthropic.claude-haiku-4-5-20251001-v1:0', 'bedrock:anthropic.claude-instant-v1', 'bedrock:anthropic.claude-opus-4-20250514-v1:0', 'bedrock:anthropic.claude-sonnet-4-20250514-v1:0', 'bedrock:anthropic.claude-sonnet-4-5-20250929-v1:0', 'bedrock:anthropic.claude-sonnet-4-6', 'bedrock:anthropic.claude-v2:1', 'bedrock:anthropic.claude-v2', 'bedrock:cohere.command-light-text-v14', 'bedrock:cohere.command-r-plus-v1:0', 'bedrock:cohere.command-r-v1:0', 'bedrock:cohere.command-text-v14', 'bedrock:eu.anthropic.claude-haiku-4-5-20251001-v1:0', 'bedrock:eu.anthropic.claude-sonnet-4-20250514-v1:0', 'bedrock:eu.anthropic.claude-sonnet-4-5-20250929-v1:0', 'bedrock:eu.anthropic.claude-sonnet-4-6', 'bedrock:global.anthropic.claude-opus-4-5-20251101-v1:0', 'bedrock:meta.llama3-1-405b-instruct-v1:0', 'bedrock:meta.llama3-1-70b-instruct-v1:0', 'bedrock:meta.llama3-1-8b-instruct-v1:0', 'bedrock:meta.llama3-70b-instruct-v1:0', 'bedrock:meta.llama3-8b-instruct-v1:0', 'bedrock:mistral.mistral-7b-instruct-v0:2', 'bedrock:mistral.mistral-large-2402-v1:0', 'bedrock:mistral.mistral-large-2407-v1:0', 'bedrock:mistral.mixtral-8x7b-instruct-v0:1', 'bedrock:us.amazon.nova-2-lite-v1:0', 'bedrock:us.amazon.nova-lite-v1:0', 'bedrock:us.amazon.nova-micro-v1:0', 'bedrock:us.amazon.nova-pro-v1:0', 'bedrock:us.anthropic.claude-3-5-haiku-20241022-v1:0', 'bedrock:us.anthropic.claude-3-5-sonnet-20240620-v1:0', 'bedrock:us.anthropic.claude-3-5-sonnet-20241022-v2:0', 'bedrock:us.anthropic.claude-3-7-sonnet-20250219-v1:0', 'bedrock:us.anthropic.claude-3-haiku-20240307-v1:0', 'bedrock:us.anthropic.claude-3-opus-20240229-v1:0', 'bedrock:us.anthropic.claude-3-sonnet-20240229-v1:0', 'bedrock:us.anthropic.claude-haiku-4-5-20251001-v1:0', 'bedrock:us.anthropic.claude-opus-4-20250514-v1:0', 'bedrock:us.anthropic.claude-sonnet-4-20250514-v1:0', 'bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0', 'bedrock:us.anthropic.claude-sonnet-4-6', 'bedrock:us.meta.llama3-1-70b-instruct-v1:0', 'bedrock:us.meta.llama3-1-8b-instruct-v1:0', 'bedrock:us.meta.llama3-2-11b-instruct-v1:0', 'bedrock:us.meta.llama3-2-1b-instruct-v1:0', 'bedrock:us.meta.llama3-2-3b-instruct-v1:0', 'bedrock:us.meta.llama3-2-90b-instruct-v1:0', 'bedrock:us.meta.llama3-3-70b-instruct-v1:0', 'cerebras:gpt-oss-120b', 'cerebras:llama3.1-8b', 'cerebras:qwen-3-235b-a22b-instruct-2507', 'cerebras:zai-glm-4.7', 'cohere:c4ai-aya-expanse-32b', 'cohere:c4ai-aya-expanse-8b', 'cohere:command-nightly', 'cohere:command-r-08-2024', 'cohere:command-r-plus-08-2024', 'cohere:command-r7b-12-2024', 'deepseek:deepseek-chat', 'deepseek:deepseek-reasoner', 'deepseek:deepseek-v4-flash', 'deepseek:deepseek-v4-pro', 'gateway/anthropic:claude-3-haiku-20240307', 'gateway/anthropic:claude-haiku-4-5-20251001', 'gateway/anthropic:claude-mythos-preview', 'gateway/anthropic:claude-haiku-4-5', 'gateway/anthropic:claude-opus-4-0', 'gateway/anthropic:claude-opus-4-1', 'gateway/anthropic:claude-opus-4-1-20250805', 'gateway/anthropic:claude-opus-4-20250514', 'gateway/anthropic:claude-opus-4-5-20251101', 'gateway/anthropic:claude-opus-4-5', 'gateway/anthropic:claude-opus-4-6', 'gateway/anthropic:claude-opus-4-7', 'gateway/anthropic:claude-sonnet-4-0', 'gateway/anthropic:claude-sonnet-4-20250514', 'gateway/anthropic:claude-sonnet-4-5-20250929', 'gateway/anthropic:claude-sonnet-4-5', 'gateway/anthropic:claude-sonnet-4-6', 'gateway/bedrock:anthropic.claude-3-5-sonnet-20240620-v1:0', 'gateway/bedrock:anthropic.claude-3-haiku-20240307-v1:0', 'gateway/bedrock:eu.anthropic.claude-haiku-4-5-20251001-v1:0', 'gateway/bedrock:eu.anthropic.claude-sonnet-4-20250514-v1:0', 'gateway/bedrock:eu.anthropic.claude-sonnet-4-5-20250929-v1:0', 'gateway/bedrock:eu.anthropic.claude-sonnet-4-6', 'gateway/bedrock:global.anthropic.claude-opus-4-5-20251101-v1:0', 'gateway/google-vertex:gemini-2.5-flash-image', 'gateway/google-vertex:gemini-2.5-flash-lite-preview-09-2025', 'gateway/google-vertex:gemini-2.5-flash-lite', 'gateway/google-vertex:gemini-2.5-flash', 'gateway/google-vertex:gemini-2.5-pro', 'gateway/google-vertex:gemini-3-flash-preview', 'gateway/google-vertex:gemini-3-pro-image-preview', 'gateway/google-vertex:gemini-3.1-flash-image-preview', 'gateway/google-vertex:gemini-3.1-flash-lite-preview', 'gateway/google-vertex:gemini-3.1-pro-preview', 'gateway/groq:llama-3.1-8b-instant', 'gateway/groq:llama-3.3-70b-versatile', 'gateway/groq:meta-llama/llama-4-scout-17b-16e-instruct', 'gateway/groq:moonshotai/kimi-k2-instruct-0905', 'gateway/groq:openai/gpt-oss-120b', 'gateway/groq:openai/gpt-oss-20b', 'gateway/groq:openai/gpt-oss-safeguard-20b', 'gateway/openai:gpt-3.5-turbo-0125', 'gateway/openai:gpt-3.5-turbo-1106', 'gateway/openai:gpt-3.5-turbo-16k', 'gateway/openai:gpt-3.5-turbo', 'gateway/openai:gpt-4-0613', 'gateway/openai:gpt-4-turbo-2024-04-09', 'gateway/openai:gpt-4-turbo', 'gateway/openai:gpt-4.1-2025-04-14', 'gateway/openai:gpt-4.1-mini-2025-04-14', 'gateway/openai:gpt-4.1-mini', 'gateway/openai:gpt-4.1-nano-2025-04-14', 'gateway/openai:gpt-4.1-nano', 'gateway/openai:gpt-4.1', 'gateway/openai:gpt-4', 'gateway/openai:gpt-4o-2024-05-13', 'gateway/openai:gpt-4o-2024-08-06', 'gateway/openai:gpt-4o-2024-11-20', 'gateway/openai:gpt-4o-mini-2024-07-18', 'gateway/openai:gpt-4o-mini-search-preview-2025-03-11', 'gateway/openai:gpt-4o-mini-search-preview', 'gateway/openai:gpt-4o-mini', 'gateway/openai:gpt-4o-search-preview-2025-03-11', 'gateway/openai:gpt-4o-search-preview', 'gateway/openai:gpt-4o', 'gateway/openai:gpt-5-2025-08-07', 'gateway/openai:gpt-5-chat-latest', 'gateway/openai:gpt-5-mini-2025-08-07', 'gateway/openai:gpt-5-mini', 'gateway/openai:gpt-5-nano-2025-08-07', 'gateway/openai:gpt-5-nano', 'gateway/openai:gpt-5.1-2025-11-13', 'gateway/openai:gpt-5.1-chat-latest', 'gateway/openai:gpt-5.1', 'gateway/openai:gpt-5.2-2025-12-11', 'gateway/openai:gpt-5.2-chat-latest', 'gateway/openai:gpt-5.2', 'gateway/openai:gpt-5.4-mini-2026-03-17', 'gateway/openai:gpt-5.4-mini', 'gateway/openai:gpt-5.4-nano-2026-03-17', 'gateway/openai:gpt-5.4-nano', 'gateway/openai:gpt-5.4', 'gateway/openai:gpt-5', 'gateway/openai:o1-2024-12-17', 'gateway/openai:o1', 'gateway/openai:o3-2025-04-16', 'gateway/openai:o3-mini-2025-01-31', 'gateway/openai:o3-mini', 'gateway/openai:o3', 'gateway/openai:o4-mini-2025-04-16', 'gateway/openai:o4-mini', 'google-gla:gemini-2.0-flash-lite', 'google-gla:gemini-2.0-flash', 'google-gla:gemini-2.5-flash-image', 'google-gla:gemini-2.5-flash-lite-preview-09-2025', 'google-gla:gemini-2.5-flash-lite', 'google-gla:gemini-2.5-flash-preview-09-2025', 'google-gla:gemini-2.5-flash', 'google-gla:gemini-2.5-pro', 'google-gla:gemini-3-flash-preview', 'google-gla:gemini-3-pro-image-preview', 'google-gla:gemini-3-pro-preview', 'google-gla:gemini-3.1-flash-image-preview', 'google-gla:gemini-3.1-flash-lite-preview', 'google-gla:gemini-3.1-pro-preview', 'google-gla:gemini-flash-latest', 'google-gla:gemini-flash-lite-latest', 'google-vertex:gemini-2.0-flash-lite', 'google-vertex:gemini-2.0-flash', 'google-vertex:gemini-2.5-flash-image', 'google-vertex:gemini-2.5-flash-lite-preview-09-2025', 'google-vertex:gemini-2.5-flash-lite', 'google-vertex:gemini-2.5-flash-preview-09-2025', 'google-vertex:gemini-2.5-flash', 'google-vertex:gemini-2.5-pro', 'google-vertex:gemini-3-flash-preview', 'google-vertex:gemini-3-pro-image-preview', 'google-vertex:gemini-3-pro-preview', 'google-vertex:gemini-3.1-flash-image-preview', 'google-vertex:gemini-3.1-flash-lite-preview', 'google-vertex:gemini-3.1-pro-preview', 'google-vertex:gemini-flash-latest', 'google-vertex:gemini-flash-lite-latest', 'grok:grok-2-image-1212', 'grok:grok-2-vision-1212', 'grok:grok-3-fast', 'grok:grok-3-mini-fast', 'grok:grok-3-mini', 'grok:grok-3', 'grok:grok-4-0709', 'grok:grok-4-latest', 'grok:grok-4-1-fast-non-reasoning', 'grok:grok-4-1-fast-reasoning', 'grok:grok-4-1-fast', 'grok:grok-4-fast-non-reasoning', 'grok:grok-4-fast-reasoning', 'grok:grok-4-fast', 'grok:grok-4', 'grok:grok-code-fast-1', 'xai:grok-3', 'xai:grok-3-fast', 'xai:grok-3-fast-latest', 'xai:grok-3-latest', 'xai:grok-3-mini', 'xai:grok-3-mini-fast', 'xai:grok-3-mini-fast-latest', 'xai:grok-4', 'xai:grok-4-0709', 'xai:grok-4-1-fast', 'xai:grok-4-1-fast-non-reasoning', 'xai:grok-4-1-fast-non-reasoning-latest', 'xai:grok-4-1-fast-reasoning', 'xai:grok-4-1-fast-reasoning-latest', 'xai:grok-4-fast', 'xai:grok-4-fast-non-reasoning', 'xai:grok-4-fast-non-reasoning-latest', 'xai:grok-4-fast-reasoning', 'xai:grok-4-fast-reasoning-latest', 'xai:grok-4-latest', 'xai:grok-code-fast-1', 'groq:llama-3.1-8b-instant', 'groq:llama-3.3-70b-versatile', 'groq:meta-llama/llama-guard-4-12b', 'groq:openai/gpt-oss-120b', 'groq:openai/gpt-oss-20b', 'groq:whisper-large-v3', 'groq:whisper-large-v3-turbo', 'groq:meta-llama/llama-4-maverick-17b-128e-instruct', 'groq:meta-llama/llama-4-scout-17b-16e-instruct', 'groq:meta-llama/llama-prompt-guard-2-22m', 'groq:meta-llama/llama-prompt-guard-2-86m', 'groq:moonshotai/kimi-k2-instruct-0905', 'groq:openai/gpt-oss-safeguard-20b', 'groq:playai-tts', 'groq:playai-tts-arabic', 'groq:qwen/qwen-3-32b', 'heroku:claude-3-5-haiku', 'heroku:claude-3-5-sonnet-latest', 'heroku:claude-3-7-sonnet', 'heroku:claude-3-haiku', 'heroku:claude-4-5-haiku', 'heroku:claude-4-5-sonnet', 'heroku:claude-4-6-sonnet', 'heroku:claude-4-sonnet', 'heroku:claude-opus-4-5', 'heroku:claude-opus-4-6', 'heroku:deepseek-v3-2', 'heroku:glm-4-7', 'heroku:glm-4-7-flash', 'heroku:gpt-oss-120b', 'heroku:kimi-k2-5', 'heroku:kimi-k2-thinking', 'heroku:minimax-m2', 'heroku:minimax-m2-1', 'heroku:qwen3-235b', 'heroku:qwen3-coder-480b', 'heroku:nova-2-lite', 'heroku:nova-lite', 'heroku:nova-pro', 'huggingface:deepseek-ai/DeepSeek-R1', 'huggingface:meta-llama/Llama-3.3-70B-Instruct', 'huggingface:meta-llama/Llama-4-Maverick-17B-128E-Instruct', 'huggingface:meta-llama/Llama-4-Scout-17B-16E-Instruct', 'huggingface:Qwen/Qwen2.5-72B-Instruct', 'huggingface:Qwen/Qwen3-235B-A22B', 'huggingface:Qwen/Qwen3-32B', 'huggingface:Qwen/QwQ-32B', 'mistral:codestral-latest', 'mistral:mistral-large-latest', 'mistral:mistral-moderation-latest', 'mistral:mistral-small-latest', 'moonshotai:kimi-k2-0711-preview', 'moonshotai:kimi-latest', 'moonshotai:kimi-thinking-preview', 'moonshotai:moonshot-v1-128k-vision-preview', 'moonshotai:moonshot-v1-128k', 'moonshotai:moonshot-v1-32k-vision-preview', 'moonshotai:moonshot-v1-32k', 'moonshotai:moonshot-v1-8k-vision-preview', 'moonshotai:moonshot-v1-8k', 'openai:computer-use-preview-2025-03-11', 'openai:computer-use-preview', 'openai:gpt-3.5-turbo-0125', 'openai:gpt-3.5-turbo-0301', 'openai:gpt-3.5-turbo-0613', 'openai:gpt-3.5-turbo-1106', 'openai:gpt-3.5-turbo-16k-0613', 'openai:gpt-3.5-turbo-16k', 'openai:gpt-3.5-turbo', 'openai:gpt-4-0314', 'openai:gpt-4-0613', 'openai:gpt-4-turbo-2024-04-09', 'openai:gpt-4-turbo', 'openai:gpt-4.1-2025-04-14', 'openai:gpt-4.1-mini-2025-04-14', 'openai:gpt-4.1-mini', 'openai:gpt-4.1-nano-2025-04-14', 'openai:gpt-4.1-nano', 'openai:gpt-4.1', 'openai:gpt-4', 'openai:gpt-4o-2024-05-13', 'openai:gpt-4o-2024-08-06', 'openai:gpt-4o-2024-11-20', 'openai:gpt-4o-audio-preview-2024-12-17', 'openai:gpt-4o-audio-preview-2025-06-03', 'openai:gpt-4o-audio-preview', 'openai:gpt-4o-mini-2024-07-18', 'openai:gpt-4o-mini-audio-preview-2024-12-17', 'openai:gpt-4o-mini-audio-preview', 'openai:gpt-4o-mini-search-preview-2025-03-11', 'openai:gpt-4o-mini-search-preview', 'openai:gpt-4o-mini', 'openai:gpt-4o-search-preview-2025-03-11', 'openai:gpt-4o-search-preview', 'openai:gpt-4o', 'openai:gpt-5-2025-08-07', 'openai:gpt-5-chat-latest', 'openai:gpt-5-codex', 'openai:gpt-5-mini-2025-08-07', 'openai:gpt-5-mini', 'openai:gpt-5-nano-2025-08-07', 'openai:gpt-5-nano', 'openai:gpt-5-pro-2025-10-06', 'openai:gpt-5-pro', 'openai:gpt-5.1-2025-11-13', 'openai:gpt-5.1-chat-latest', 'openai:gpt-5.1-codex-max', 'openai:gpt-5.1-codex', 'openai:gpt-5.1', 'openai:gpt-5.2-2025-12-11', 'openai:gpt-5.2-chat-latest', 'openai:gpt-5.2-pro-2025-12-11', 'openai:gpt-5.2-pro', 'openai:gpt-5.2', 'openai:gpt-5.3-chat-latest', 'openai:gpt-5.4-mini-2026-03-17', 'openai:gpt-5.4-mini', 'openai:gpt-5.4-nano-2026-03-17', 'openai:gpt-5.4-nano', 'openai:gpt-5.4', 'openai:gpt-5', 'openai:o1-2024-12-17', 'openai:o1-pro-2025-03-19', 'openai:o1-pro', 'openai:o1', 'openai:o3-2025-04-16', 'openai:o3-deep-research-2025-06-26', 'openai:o3-deep-research', 'openai:o3-mini-2025-01-31', 'openai:o3-mini', 'openai:o3-pro-2025-06-10', 'openai:o3-pro', 'openai:o3', 'openai:o4-mini-2025-04-16', 'openai:o4-mini-deep-research-2025-06-26', 'openai:o4-mini-deep-research', 'openai:o4-mini', 'test'])`

### ALLOW\_MODEL\_REQUESTS

Whether to allow requests to models.

This global setting allows you to disable request to most models, e.g. to make sure you don't accidentally make costly requests to a model during tests.

The testing models [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) and [`FunctionModel`](/docs/ai/api/models/function/#pydantic_ai.models.function.FunctionModel) are no affected by this setting.

**Default:** `True`

---

# [pydantic_ai.models.bedrock](https://pydantic.dev/docs/ai/api/models/bedrock/)

# pydantic\_ai.models.bedrock

## Setup

For details on how to set up authentication with this model, see [model configuration for Bedrock](/docs/ai/models/bedrock).

### BedrockModelSettings

**Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings)

Settings for Bedrock models.

See [the Bedrock Converse API docs](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html#API_runtime_Converse_RequestSyntax) for a full list. See [the boto3 implementation](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/converse.html) of the Bedrock Converse API.

#### Attributes

##### bedrock\_guardrail\_config

Content moderation and safety settings for Bedrock API requests.

See more about it on [https://docs.aws.amazon.com/bedrock/latest/APIReference/API\_runtime\_GuardrailConfiguration.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_GuardrailConfiguration.html).

**Type:** `GuardrailConfigurationTypeDef`

##### bedrock\_performance\_configuration

Performance optimization settings for model inference.

See more about it on [https://docs.aws.amazon.com/bedrock/latest/APIReference/API\_runtime\_PerformanceConfiguration.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_PerformanceConfiguration.html).

**Type:** `PerformanceConfigurationTypeDef`

##### bedrock\_request\_metadata

Additional metadata to attach to Bedrock API requests.

See more about it on [https://docs.aws.amazon.com/bedrock/latest/APIReference/API\_runtime\_Converse.html#API\_runtime\_Converse\_RequestSyntax](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html#API_runtime_Converse_RequestSyntax).

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### bedrock\_additional\_model\_response\_fields\_paths

JSON paths to extract additional fields from model responses.

See more about it on [https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html).

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### bedrock\_prompt\_variables

Variables for substitution into prompt templates.

See more about it on [https://docs.aws.amazon.com/bedrock/latest/APIReference/API\_runtime\_PromptVariableValues.html](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_PromptVariableValues.html).

**Type:** [`Mapping`](https://docs.python.org/3/library/typing.html#typing.Mapping)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `PromptVariableValuesTypeDef`\]

##### bedrock\_additional\_model\_requests\_fields

Additional model-specific parameters to include in requests.

See more about it on [https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html).

**Type:** [`Mapping`](https://docs.python.org/3/library/typing.html#typing.Mapping)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

##### bedrock\_cache\_tool\_definitions

Whether to add a cache point after the last tool definition.

When enabled, the last tool in the `tools` array will include a `cachePoint`, allowing Bedrock to cache tool definitions and reduce costs for compatible models.

Set to `True` or `'5m'` for a 5-minute TTL (the default), or `'1h'` for a 1-hour TTL. See [https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html](https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html) for more information.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['5m', '1h'\]

##### bedrock\_cache\_instructions

Whether to add a cache point after the system prompt blocks.

When enabled, an extra `cachePoint` is appended to the system prompt so Bedrock can cache system instructions.

Set to `True` or `'5m'` for a 5-minute TTL (the default), or `'1h'` for a 1-hour TTL. See [https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html](https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html) for more information.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['5m', '1h'\]

##### bedrock\_cache\_messages

Convenience setting to enable caching for the last user message.

When enabled, this automatically adds a cache point to the last content block in the final user message, which is useful for caching conversation history or context in multi-turn conversations.

Set to `True` or `'5m'` for a 5-minute TTL (the default), or `'1h'` for a 1-hour TTL.

Note: Uses 1 of Bedrock's 4 available cache points per request. Any additional CachePoint markers in messages will be automatically limited to respect the 4-cache-point maximum. See [https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html](https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html) for more information.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['5m', '1h'\]

##### bedrock\_service\_tier

Setting for optimizing performance and cost.

Accepts `{'type': 'default' | 'flex' | 'priority' | 'reserved'}`. Takes precedence over the top-level [`service_tier`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings.service_tier), and is the only way to request `'reserved'` (which requires a pre-purchased capacity reservation).

See more about it on [https://docs.aws.amazon.com/bedrock/latest/userguide/service-tiers-inference.html](https://docs.aws.amazon.com/bedrock/latest/userguide/service-tiers-inference.html).

**Type:** `ServiceTierTypeDef`

##### bedrock\_inference\_profile

An [inference profile](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles.html) ARN to use as the `modelId` in API requests.

When set, this value is used as the `modelId` in `converse` and `converse_stream` API calls instead of the base `model_name`. This allows you to pass the base model name (e.g. `'anthropic.claude-sonnet-4-5-20250929-v1:0'`) as `model_name` for detecting model capabilities and token counting, while routing requests through an inference profile for cost tracking or cross-region inference.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

### BedrockConverseModel

**Bases:** `Model[BaseClient]`

A model that uses the Bedrock Converse API.

#### Attributes

##### model\_name

The model name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### system

The model provider.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model_name: BedrockModelName,
    provider: Literal['bedrock', 'gateway'] | Provider[BaseClient] = 'bedrock',
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
)
```

Initialize a Bedrock model.

###### Parameters

**`model_name`** : `BedrockModelName`

The name of the model to use.

**`model_name`** : `BedrockModelName`

The name of the Bedrock model to use. List of model names available [here](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html).

**`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['bedrock', 'gateway'\] | `Provider`\[`BaseClient`\] _Default:_ `'bedrock'`

The provider to use for authentication and API access. Can be either the string 'bedrock' or an instance of `Provider[BaseClient]`. If not provided, a new provider will be created using the other parameters.

**`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model profile to use. Defaults to a profile picked by the provider based on the model name.

**`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Model-specific settings that will be used as defaults for this model.

##### supported\_builtin\_tools

`@classmethod`

```python
def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]
```

The set of builtin tool types this model can handle.

###### Returns

[`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\]

##### count\_tokens

`@async`

```python
def count_tokens(
    messages: list[ModelMessage],
    model_settings: ModelSettings | None,
    model_request_parameters: ModelRequestParameters,
) -> usage.RequestUsage
```

Count the number of tokens, works with limited models.

Check the actual supported models on [https://docs.aws.amazon.com/bedrock/latest/userguide/count-tokens.html](https://docs.aws.amazon.com/bedrock/latest/userguide/count-tokens.html)

###### Returns

[`usage.RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage)

### BedrockStreamedResponse

**Bases:** `StreamedResponse`

Implementation of `StreamedResponse` for Bedrock models.

#### Attributes

##### model\_name

Get the model name of the response.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_name

Get the provider name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_url

Get the provider base URL.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

### LatestBedrockModelNames

Latest Bedrock models.

**Default:** `Literal['amazon.titan-tg1-large', 'amazon.titan-text-lite-v1', 'amazon.titan-text-express-v1', 'us.amazon.nova-2-lite-v1:0', 'us.amazon.nova-pro-v1:0', 'us.amazon.nova-lite-v1:0', 'us.amazon.nova-micro-v1:0', 'anthropic.claude-3-5-sonnet-20241022-v2:0', 'us.anthropic.claude-3-5-sonnet-20241022-v2:0', 'anthropic.claude-3-5-haiku-20241022-v1:0', 'us.anthropic.claude-3-5-haiku-20241022-v1:0', 'anthropic.claude-instant-v1', 'anthropic.claude-v2:1', 'anthropic.claude-v2', 'anthropic.claude-3-sonnet-20240229-v1:0', 'us.anthropic.claude-3-sonnet-20240229-v1:0', 'anthropic.claude-3-haiku-20240307-v1:0', 'us.anthropic.claude-3-haiku-20240307-v1:0', 'anthropic.claude-3-opus-20240229-v1:0', 'us.anthropic.claude-3-opus-20240229-v1:0', 'anthropic.claude-3-5-sonnet-20240620-v1:0', 'us.anthropic.claude-3-5-sonnet-20240620-v1:0', 'anthropic.claude-3-7-sonnet-20250219-v1:0', 'us.anthropic.claude-3-7-sonnet-20250219-v1:0', 'anthropic.claude-opus-4-20250514-v1:0', 'us.anthropic.claude-opus-4-20250514-v1:0', 'global.anthropic.claude-opus-4-5-20251101-v1:0', 'anthropic.claude-sonnet-4-20250514-v1:0', 'us.anthropic.claude-sonnet-4-20250514-v1:0', 'eu.anthropic.claude-sonnet-4-20250514-v1:0', 'anthropic.claude-sonnet-4-5-20250929-v1:0', 'us.anthropic.claude-sonnet-4-5-20250929-v1:0', 'eu.anthropic.claude-sonnet-4-5-20250929-v1:0', 'anthropic.claude-sonnet-4-6', 'us.anthropic.claude-sonnet-4-6', 'eu.anthropic.claude-sonnet-4-6', 'anthropic.claude-haiku-4-5-20251001-v1:0', 'us.anthropic.claude-haiku-4-5-20251001-v1:0', 'eu.anthropic.claude-haiku-4-5-20251001-v1:0', 'cohere.command-text-v14', 'cohere.command-r-v1:0', 'cohere.command-r-plus-v1:0', 'cohere.command-light-text-v14', 'meta.llama3-8b-instruct-v1:0', 'meta.llama3-70b-instruct-v1:0', 'meta.llama3-1-8b-instruct-v1:0', 'us.meta.llama3-1-8b-instruct-v1:0', 'meta.llama3-1-70b-instruct-v1:0', 'us.meta.llama3-1-70b-instruct-v1:0', 'meta.llama3-1-405b-instruct-v1:0', 'us.meta.llama3-2-11b-instruct-v1:0', 'us.meta.llama3-2-90b-instruct-v1:0', 'us.meta.llama3-2-1b-instruct-v1:0', 'us.meta.llama3-2-3b-instruct-v1:0', 'us.meta.llama3-3-70b-instruct-v1:0', 'mistral.mistral-7b-instruct-v0:2', 'mistral.mixtral-8x7b-instruct-v0:1', 'mistral.mistral-large-2402-v1:0', 'mistral.mistral-large-2407-v1:0']`

### BedrockModelName

Possible Bedrock model names.

Since Bedrock supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [the Bedrock docs](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html) for a full list.

**Default:** `str | LatestBedrockModelNames`

---

# [pydantic_ai.models.cerebras](https://pydantic.dev/docs/ai/api/models/cerebras/)

# pydantic\_ai.models.cerebras

## Setup

For details on how to set up authentication with this model, see [model configuration for Cerebras](/docs/ai/models/cerebras).

Cerebras model implementation using OpenAI-compatible API.

### CerebrasModelSettings

**Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings)

Settings used for a Cerebras model request.

ALL FIELDS MUST BE `cerebras_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS.

#### Attributes

##### cerebras\_disable\_reasoning

Disable reasoning for the model.

This setting is only supported on reasoning models: `zai-glm-4.6` and `gpt-oss-120b`.

See [the Cerebras docs](https://inference-docs.cerebras.ai/resources/openai#passing-non-standard-parameters) for more details.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

### CerebrasModel

**Bases:** `OpenAIChatModel`

A model that uses Cerebras's OpenAI-compatible API.

Cerebras provides ultra-fast inference powered by the Wafer-Scale Engine (WSE).

Apart from `__init__`, all methods are private or match those of the base class.

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model_name: CerebrasModelName,
    provider: Literal['cerebras'] | Provider[AsyncOpenAI] = 'cerebras',
    profile: ModelProfileSpec | None = None,
    settings: CerebrasModelSettings | None = None,
)
```

Initialize a Cerebras model.

###### Parameters

**`model_name`** : `CerebrasModelName`

The name of the Cerebras model to use.

**`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['cerebras'\] | `Provider`\[`AsyncOpenAI`\] _Default:_ `'cerebras'`

The provider to use. Defaults to 'cerebras'.

**`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model profile to use. Defaults to a profile based on the model name.

**`settings`** : `CerebrasModelSettings` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Model-specific settings that will be used as defaults for this model.

### CerebrasModelName

Possible Cerebras model names.

Since Cerebras supports a variety of models and the list changes frequently, we explicitly list known models but allow any name in the type hints.

See [https://inference-docs.cerebras.ai/models/overview](https://inference-docs.cerebras.ai/models/overview) for an up to date list of models.

**Default:** `str | LatestCerebrasModelNames`

---

# [pydantic_ai.models.cohere](https://pydantic.dev/docs/ai/api/models/cohere/)

# pydantic\_ai.models.cohere

## Setup

For details on how to set up authentication with this model, see [model configuration for Cohere](/docs/ai/models/cohere).

### CohereModelSettings

**Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings)

Settings used for a Cohere model request.

### CohereModel

**Bases:** `Model[AsyncClientV2]`

A model that uses the Cohere API.

Internally, this uses the [Cohere Python client](https://github.com/cohere-ai/cohere-python) to interact with the API.

Apart from `__init__`, all methods are private or match those of the base class.

#### Attributes

##### model\_name

The model name.

**Type:** `CohereModelName`

##### system

The model provider.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model_name: CohereModelName,
    provider: Literal['cohere'] | Provider[AsyncClientV2] = 'cohere',
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
)
```

Initialize an Cohere model.

###### Parameters

**`model_name`** : `CohereModelName`

The name of the Cohere model to use. List of model names available [here](https://docs.cohere.com/docs/models#command).

**`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['cohere'\] | `Provider`\[`AsyncClientV2`\] _Default:_ `'cohere'`

The provider to use for authentication and API access. Can be either the string 'cohere' or an instance of `Provider[AsyncClientV2]`. If not provided, a new provider will be created using the other parameters.

**`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model profile to use. Defaults to a profile picked by the provider based on the model name.

**`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Model-specific settings that will be used as defaults for this model.

### LatestCohereModelNames

Latest Cohere models.

**Default:** `Literal['c4ai-aya-expanse-32b', 'c4ai-aya-expanse-8b', 'command-nightly', 'command-r-08-2024', 'command-r-plus-08-2024', 'command-r7b-12-2024']`

### CohereModelName

Possible Cohere model names.

Since Cohere supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [Cohere's docs](https://docs.cohere.com/v2/docs/models) for a list of all available models.

**Default:** `str | LatestCohereModelNames`

---

# [pydantic_ai.models.fallback](https://pydantic.dev/docs/ai/api/models/fallback/)

# pydantic\_ai.models.fallback

### ResponseRejected

**Bases:** [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception)

Raised within a `FallbackExceptionGroup` when model responses are rejected by a response handler.

### FallbackModel

**Bases:** `Model`

A model that uses one or more fallback models upon failure.

Apart from `__init__`, all methods are private or match those of the base class.

#### Attributes

##### model\_name

The model name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### model\_id

The fully qualified model identifier, combining the wrapped models' IDs.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    default_model: Model | KnownModelName | str,
    fallback_models: Model | KnownModelName | str = (),
    fallback_on: FallbackOn = (ModelAPIError,),
)
```

Initialize a fallback model instance.

###### Parameters

**`default_model`** : `Model` | `KnownModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The name or instance of the default model to use.

**`fallback_models`** : `Model` | `KnownModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `()`

The names or instances of the fallback models to use upon failure.

**`fallback_on`** : `FallbackOn` _Default:_ `(ModelAPIError,)`

Conditions that trigger fallback to the next model. Accepts:

-   A tuple of exception types: `(ModelAPIError, RateLimitError)`
-   An exception handler (sync or async): `lambda exc: isinstance(exc, MyError)`
-   A response handler (sync or async): `def check(r: ModelResponse) -> bool`
-   A sequence mixing all of the above: `[ModelAPIError, exc_handler, response_handler]`

Handler type is auto-detected by inspecting type hints on the first parameter. If the first parameter is hinted as `ModelResponse`, it's a response handler. Otherwise (including untyped handlers and lambdas), it's an exception handler.

##### \_\_aenter\_\_

`@async`

```python
def __aenter__() -> FallbackModel
```

Enter all sub-models so their providers can manage HTTP client lifecycle.

###### Returns

`FallbackModel`

##### \_\_aexit\_\_

`@async`

```python
def __aexit__(
    exc_type: type[BaseException] | None,
    exc_val: BaseException | None,
    exc_tb: TracebackType | None,
) -> bool | None
```

Exit all sub-models, closing their providers' HTTP clients.

###### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### request

`@async`

```python
def request(
    messages: list[ModelMessage],
    model_settings: ModelSettings | None,
    model_request_parameters: ModelRequestParameters,
) -> ModelResponse
```

Try each model in sequence until one succeeds.

In case of failure, raise a FallbackExceptionGroup with all exceptions.

###### Returns

[`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse)

##### request\_stream

`@async`

```python
def request_stream(
    messages: list[ModelMessage],
    model_settings: ModelSettings | None,
    model_request_parameters: ModelRequestParameters,
    run_context: RunContext[Any] | None = None,
) -> AsyncIterator[StreamedResponse]
```

Try each model in sequence until one succeeds.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`StreamedResponse`\]

### ExceptionHandler

A sync or async callable that decides whether an exception should trigger fallback.

**Default:** `Callable[[Exception], Awaitable[bool]] | Callable[[Exception], bool]`

### ResponseHandler

A sync or async callable that decides whether a model response should trigger fallback.

**Default:** `Callable[[ModelResponse], Awaitable[bool]] | Callable[[ModelResponse], bool]`

### FallbackOn

The type of the `fallback_on` parameter to [`FallbackModel`](/docs/ai/api/models/fallback/#pydantic_ai.models.fallback.FallbackModel).

**Default:** `type[Exception] | tuple[type[Exception], ...] | ExceptionHandler | ResponseHandler | Sequence[type[Exception] | ExceptionHandler | ResponseHandler]`

---

# [pydantic_ai.models.function](https://pydantic.dev/docs/ai/api/models/function/)

# pydantic\_ai.models.function

A model controlled by a local function.

[`FunctionModel`](/docs/ai/api/models/function/#pydantic_ai.models.function.FunctionModel) is similar to [`TestModel`](/docs/ai/models/test), but allows greater control over the model's behavior.

Its primary use case is for more advanced unit testing than is possible with `TestModel`.

Here's a minimal example:

function\_model\_usage.py

```python
from pydantic_ai import Agent
from pydantic_ai import ModelMessage, ModelResponse, TextPart
from pydantic_ai.models.function import FunctionModel, AgentInfo

my_agent = Agent('openai:gpt-5.2')


async def model_function(
    messages: list[ModelMessage], info: AgentInfo
) -> ModelResponse:
    print(messages)
    """
    [
        ModelRequest(
            parts=[
                UserPromptPart(
                    content='Testing my agent...',
                    timestamp=datetime.datetime(...),
                )
            ],
            timestamp=datetime.datetime(...),
            run_id='...',
            conversation_id='...',
        )
    ]
    """
    print(info)
    """
    AgentInfo(
        function_tools=[],
        allow_text_output=True,
        output_tools=[],
        model_settings=None,
        model_request_parameters=ModelRequestParameters(
            function_tools=[], builtin_tools=[], output_tools=[]
        ),
        instructions=None,
    )
    """
    return ModelResponse(parts=[TextPart('hello world')])


async def test_my_agent():
    """Unit test for my_agent, to be run by pytest."""
    with my_agent.override(model=FunctionModel(model_function)):
        result = await my_agent.run('Testing my agent...')
        assert result.output == 'hello world'
```

See [Unit testing with `FunctionModel`](/docs/ai/guides/testing#unit-testing-with-functionmodel) for detailed documentation.

### FunctionModel

**Bases:** `Model`

A model controlled by a local function.

Apart from `__init__`, all methods are private or match those of the base class.

#### Attributes

##### model\_name

The model name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### system

The system / model provider.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    function: FunctionDef,
    model_name: str | None = None,
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
) -> None
def __init__(
    stream_function: StreamFunctionDef,
    model_name: str | None = None,
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
) -> None
def __init__(
    function: FunctionDef,
    stream_function: StreamFunctionDef,
    model_name: str | None = None,
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
) -> None
```

Initialize a `FunctionModel`.

Either `function` or `stream_function` must be provided, providing both is allowed.

###### Parameters

**`function`** : `FunctionDef` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The function to call for non-streamed requests.

**`stream_function`** : `StreamFunctionDef` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The function to call for streamed requests.

**`model_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The name of the model. If not provided, a name is generated from the function names.

**`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model profile to use.

**`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Model-specific settings that will be used as defaults for this model.

##### supported\_builtin\_tools

`@classmethod`

```python
def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]
```

FunctionModel supports all builtin tools for testing flexibility.

###### Returns

[`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\]

### AgentInfo

Information about an agent.

This is passed as the second to functions used within [`FunctionModel`](/docs/ai/api/models/function/#pydantic_ai.models.function.FunctionModel).

#### Attributes

##### function\_tools

The function tools available on this agent.

These are the tools registered via the [`tool`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool) and [`tool_plain`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool_plain) decorators.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition)\]

##### allow\_text\_output

Whether a plain text output is allowed.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### output\_tools

The tools that can called to produce the final output of the run.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition)\]

##### model\_settings

The model settings passed to the run call.

**Type:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### model\_request\_parameters

The model request parameters passed to the run call.

**Type:** `ModelRequestParameters`

##### instructions

The instructions passed to model.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

### DeltaToolCall

Incremental change to a tool call.

Used to describe a chunk when streaming structured responses.

#### Attributes

##### name

Incremental change to the name of the tool.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### json\_args

Incremental change to the arguments as JSON

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### tool\_call\_id

Incremental change to the tool call ID.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

### DeltaThinkingPart

Incremental change to a thinking part.

Used to describe a chunk when streaming thinking responses.

#### Attributes

##### content

Incremental change to the thinking content.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### signature

Incremental change to the thinking signature.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

### FunctionStreamedResponse

**Bases:** `StreamedResponse`

Implementation of `StreamedResponse` for [FunctionModel](/docs/ai/api/models/function/#pydantic_ai.models.function.FunctionModel).

#### Attributes

##### model\_name

Get the model name of the response.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_name

Get the provider name.

**Type:** [`None`](https://docs.python.org/3/library/constants.html#None)

##### provider\_url

Get the provider base URL.

**Type:** [`None`](https://docs.python.org/3/library/constants.html#None)

##### timestamp

Get the timestamp of the response.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime)

### DeltaToolCalls

A mapping of tool call IDs to incremental changes.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `dict[int, DeltaToolCall]`

### DeltaThinkingCalls

A mapping of thinking call IDs to incremental changes.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `dict[int, DeltaThinkingPart]`

### FunctionDef

A function used to generate a non-streamed response.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[list[ModelMessage], AgentInfo], ModelResponse | Awaitable[ModelResponse]]`

### StreamFunctionDef

A function used to generate a streamed response.

While this is defined as having return type of `AsyncIterator[str | DeltaToolCalls | DeltaThinkingCalls | BuiltinTools]`, it should really be considered as `AsyncIterator[str] | AsyncIterator[DeltaToolCalls] | AsyncIterator[DeltaThinkingCalls]`,

E.g. you need to yield all text, all `DeltaToolCalls`, all `DeltaThinkingCalls`, or all `BuiltinToolCallsReturns`, not mix them.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[list[ModelMessage], AgentInfo], AsyncIterator[str | DeltaToolCalls | DeltaThinkingCalls | BuiltinToolCallsReturns]]`

---

# [pydantic_ai.models.google](https://pydantic.dev/docs/ai/api/models/google/)

# pydantic\_ai.models.google

Interface that uses the [`google-genai`](https://pypi.org/project/google-genai/) package under the hood to access Google's Gemini models via both the Generative Language API and Vertex AI.

## Setup

For details on how to set up authentication with this model, see [model configuration for Google](/docs/ai/models/google).

### GoogleModelSettings

**Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings)

Settings used for a Gemini model request.

#### Attributes

##### google\_safety\_settings

The safety settings to use for the model.

See [https://ai.google.dev/gemini-api/docs/safety-settings](https://ai.google.dev/gemini-api/docs/safety-settings) for more information.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`SafetySettingDict`\]

##### google\_thinking\_config

The thinking configuration to use for the model.

See [https://ai.google.dev/gemini-api/docs/thinking](https://ai.google.dev/gemini-api/docs/thinking) for more information.

**Type:** `ThinkingConfigDict`

##### google\_labels

User-defined metadata to break down billed charges. Only supported by the Vertex AI API.

See the [Gemini API docs](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/add-labels-to-api-calls) for use cases and limitations.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### google\_video\_resolution

The video resolution to use for the model.

See [https://ai.google.dev/api/generate-content#MediaResolution](https://ai.google.dev/api/generate-content#MediaResolution) for more information.

**Type:** `MediaResolution`

##### google\_cached\_content

The name of the cached content to use for the model.

See [https://ai.google.dev/gemini-api/docs/caching](https://ai.google.dev/gemini-api/docs/caching) for more information.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### google\_logprobs

Include log probabilities in the response.

See [https://docs.cloud.google.com/vertex-ai/generative-ai/docs/multimodal/content-generation-parameters#log-probabilities-output-tokens](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/multimodal/content-generation-parameters#log-probabilities-output-tokens) for more information.

Note: Only supported for Vertex AI and non-streaming requests.

These will be included in `ModelResponse.provider_details['logprobs']`.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### google\_top\_logprobs

Include log probabilities of the top n tokens in the response.

See [https://docs.cloud.google.com/vertex-ai/generative-ai/docs/multimodal/content-generation-parameters#log-probabilities-output-tokens](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/multimodal/content-generation-parameters#log-probabilities-output-tokens) for more information.

Note: Only supported for Vertex AI and non-streaming requests.

These will be included in `ModelResponse.provider_details['logprobs']`.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

##### google\_vertex\_service\_tier

The service tier to use for the model request when using Vertex AI.

Controls routing for Provisioned Throughput, Flex PayGo, and Priority PayGo (e.g., `'pt_only'`, `'flex_only'`, `'priority_only'`).

See [`GoogleVertexServiceTier`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleVertexServiceTier) for all values, headers sent, and links to Google docs.

**Type:** `GoogleVertexServiceTier`

##### google\_service\_tier

Deprecated: use `service_tier` for Gemini API (GLA) or `google_vertex_service_tier` for Vertex AI.

**Type:** `GoogleServiceTier`

### GoogleModel

**Bases:** `Model[Client]`

A model that uses Gemini via `generativelanguage.googleapis.com` API.

This is implemented from scratch rather than using a dedicated SDK, good API documentation is available [here](https://ai.google.dev/api).

Apart from `__init__`, all methods are private or match those of the base class.

#### Attributes

##### model\_name

The model name.

**Type:** `GoogleModelName`

##### system

The model provider.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model_name: GoogleModelName,
    provider: Literal['google-gla', 'google-vertex', 'gateway'] | Provider[Client] = 'google-gla',
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
)
```

Initialize a Gemini model.

###### Parameters

**`model_name`** : `GoogleModelName`

The name of the model to use.

**`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['google-gla', 'google-vertex', 'gateway'\] | `Provider`\[`Client`\] _Default:_ `'google-gla'`

The provider to use for authentication and API access. Can be either the string 'google-gla' or 'google-vertex' or an instance of `Provider[google.genai.AsyncClient]`. Defaults to 'google-gla'.

**`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model profile to use. Defaults to a profile picked by the provider based on the model name.

**`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model settings to use. Defaults to None.

##### supported\_builtin\_tools

`@classmethod`

```python
def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]
```

Return the set of builtin tool types this model can handle.

###### Returns

[`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\]

### GeminiStreamedResponse

**Bases:** `StreamedResponse`

Implementation of `StreamedResponse` for the Gemini model.

#### Attributes

##### model\_name

Get the model name of the response.

**Type:** `GoogleModelName`

##### provider\_name

Get the provider name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_url

Get the provider base URL.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### timestamp

Get the timestamp of the response.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime)

### LatestGoogleModelNames

Latest Gemini models.

**Default:** `Literal['gemini-flash-latest', 'gemini-flash-lite-latest', 'gemini-2.0-flash', 'gemini-2.0-flash-lite', 'gemini-2.5-flash', 'gemini-2.5-flash-preview-09-2025', 'gemini-2.5-flash-image', 'gemini-2.5-flash-lite', 'gemini-2.5-flash-lite-preview-09-2025', 'gemini-2.5-pro', 'gemini-3-flash-preview', 'gemini-3-pro-image-preview', 'gemini-3-pro-preview', 'gemini-3.1-flash-image-preview', 'gemini-3.1-flash-lite-preview', 'gemini-3.1-pro-preview']`

### GoogleModelName

Possible Gemini model names.

Since Gemini supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [the Gemini API docs](https://ai.google.dev/gemini-api/docs/models/gemini#model-variations) for a full list.

**Default:** `str | LatestGoogleModelNames`

### GoogleVertexServiceTier

Values for the `google_vertex_service_tier` field on [`GoogleModelSettings`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModelSettings).

Controls Vertex AI HTTP headers for [Provisioned Throughput](https://cloud.google.com/vertex-ai/generative-ai/docs/provisioned-throughput/use-provisioned-throughput) (PT), [Flex PayGo](https://cloud.google.com/vertex-ai/generative-ai/docs/flex-paygo), and [Priority PayGo](https://cloud.google.com/vertex-ai/generative-ai/docs/priority-paygo).

-   `'pt_then_on_demand'` (**default**): PT when quota allows, then standard on-demand spillover. No headers sent.
-   `'pt_only'`: PT only (`X-Vertex-AI-LLM-Request-Type: dedicated`). No on-demand spillover; returns 429 when over quota.
-   `'pt_then_flex'`: PT when quota allows, then [Flex PayGo](https://cloud.google.com/vertex-ai/generative-ai/docs/flex-paygo) spillover (`X-Vertex-AI-LLM-Shared-Request-Type: flex`).
-   `'pt_then_priority'`: PT when quota allows, then [Priority PayGo](https://cloud.google.com/vertex-ai/generative-ai/docs/priority-paygo) spillover (`X-Vertex-AI-LLM-Shared-Request-Type: priority`).
-   `'on_demand'`: Standard on-demand only (`X-Vertex-AI-LLM-Request-Type: shared`). Bypasses PT for this request.
-   `'flex_only'`: [Flex PayGo](https://cloud.google.com/vertex-ai/generative-ai/docs/flex-paygo) only (`X-Vertex-AI-LLM-Request-Type: shared` and `X-Vertex-AI-LLM-Shared-Request-Type: flex`). Bypasses PT.
-   `'priority_only'`: [Priority PayGo](https://cloud.google.com/vertex-ai/generative-ai/docs/priority-paygo) only (`X-Vertex-AI-LLM-Request-Type: shared` and `X-Vertex-AI-LLM-Shared-Request-Type: priority`). Bypasses PT.

Not every model or region supports every value; see the linked Google docs.

**Default:** `Literal['pt_then_on_demand', 'pt_only', 'pt_then_flex', 'pt_then_priority', 'on_demand', 'flex_only', 'priority_only']`

### GoogleServiceTier

Deprecated alias for service tier values.

Use [`service_tier`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings.service_tier) for Gemini API (GLA) or [`google_vertex_service_tier`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModelSettings.google_vertex_service_tier) for Vertex AI.

**Default:** `Literal['pt_then_on_demand', 'pt_only', 'pt_then_flex', 'pt_then_priority', 'on_demand', 'flex_only', 'priority_only']`

---

# [pydantic_ai.models.groq](https://pydantic.dev/docs/ai/api/models/groq/)

# pydantic\_ai.models.groq

## Setup

For details on how to set up authentication with this model, see [model configuration for Groq](/docs/ai/models/groq).

### GroqModelSettings

**Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings)

Settings used for a Groq model request.

#### Attributes

##### groq\_reasoning\_format

The format of the reasoning output.

See [the Groq docs](https://console.groq.com/docs/reasoning#reasoning-format) for more details.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['hidden', 'raw', 'parsed'\]

### GroqModel

**Bases:** `Model[AsyncGroq]`

A model that uses the Groq API.

Internally, this uses the [Groq Python client](https://github.com/groq/groq-python) to interact with the API.

Apart from `__init__`, all methods are private or match those of the base class.

#### Attributes

##### model\_name

The model name.

**Type:** `GroqModelName`

##### system

The model provider.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model_name: GroqModelName,
    provider: Literal['groq', 'gateway'] | Provider[AsyncGroq] = 'groq',
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
)
```

Initialize a Groq model.

###### Parameters

**`model_name`** : `GroqModelName`

The name of the Groq model to use. List of model names available [here](https://console.groq.com/docs/models).

**`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['groq', 'gateway'\] | `Provider`\[`AsyncGroq`\] _Default:_ `'groq'`

The provider to use for authentication and API access. Can be either the string 'groq' or an instance of `Provider[AsyncGroq]`. If not provided, a new provider will be created using the other parameters.

**`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model profile to use. Defaults to a profile picked by the provider based on the model name.

**`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Model-specific settings that will be used as defaults for this model.

##### supported\_builtin\_tools

`@classmethod`

```python
def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]
```

Return the set of builtin tool types this model can handle.

###### Returns

[`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\]

### GroqStreamedResponse

**Bases:** `StreamedResponse`

Implementation of `StreamedResponse` for Groq models.

#### Attributes

##### model\_name

Get the model name of the response.

**Type:** `GroqModelName`

##### provider\_name

Get the provider name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_url

Get the provider base URL.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### timestamp

Get the timestamp of the response.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime)

### ProductionGroqModelNames

Production Groq models from [https://console.groq.com/docs/models#production-models](https://console.groq.com/docs/models#production-models).

**Default:** `Literal['llama-3.1-8b-instant', 'llama-3.3-70b-versatile', 'meta-llama/llama-guard-4-12b', 'openai/gpt-oss-120b', 'openai/gpt-oss-20b', 'whisper-large-v3', 'whisper-large-v3-turbo']`

### PreviewGroqModelNames

Preview Groq models from [https://console.groq.com/docs/models#preview-models](https://console.groq.com/docs/models#preview-models).

**Default:** `Literal['meta-llama/llama-4-maverick-17b-128e-instruct', 'meta-llama/llama-4-scout-17b-16e-instruct', 'meta-llama/llama-prompt-guard-2-22m', 'meta-llama/llama-prompt-guard-2-86m', 'moonshotai/kimi-k2-instruct-0905', 'openai/gpt-oss-safeguard-20b', 'playai-tts', 'playai-tts-arabic', 'qwen/qwen-3-32b']`

### GroqModelName

Possible Groq model names.

Since Groq supports a variety of models and the list changes frequently, we explicitly list the named models as of 2025-03-31 but allow any name in the type hints.

See [https://console.groq.com/docs/models](https://console.groq.com/docs/models) for an up to date list of models and more details.

**Default:** `str | ProductionGroqModelNames | PreviewGroqModelNames`

---

# [pydantic_ai.models.huggingface](https://pydantic.dev/docs/ai/api/models/huggingface/)

# pydantic\_ai.models.huggingface

## Setup

For details on how to set up authentication with this model, see [model configuration for Hugging Face](/docs/ai/models/huggingface).

### HuggingFaceModelSettings

**Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings)

Settings used for a Hugging Face model request.

### HuggingFaceModel

**Bases:** `Model[AsyncInferenceClient]`

A model that uses Hugging Face Inference Providers.

Internally, this uses the [HF Python client](https://github.com/huggingface/huggingface_hub) to interact with the API.

Apart from `__init__`, all methods are private or match those of the base class.

#### Attributes

##### base\_url

The base URL of the provider.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### model\_name

The model name.

**Type:** `HuggingFaceModelName`

##### system

The system / model provider.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model_name: str,
    provider: Literal['huggingface'] | Provider[AsyncInferenceClient] = 'huggingface',
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
)
```

Initialize a Hugging Face model.

###### Parameters

**`model_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The name of the Model to use. You can browse available models [here](https://huggingface.co/models?pipeline_tag=text-generation&inference_provider=all&sort=trending).

**`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['huggingface'\] | `Provider`\[`AsyncInferenceClient`\] _Default:_ `'huggingface'`

The provider to use for Hugging Face Inference Providers. Can be either the string 'huggingface' or an instance of `Provider[AsyncInferenceClient]`. If not provided, the other parameters will be used.

**`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model profile to use. Defaults to a profile picked by the provider based on the model name.

**`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Model-specific settings that will be used as defaults for this model.

### HuggingFaceStreamedResponse

**Bases:** `StreamedResponse`

Implementation of `StreamedResponse` for Hugging Face models.

#### Attributes

##### model\_name

Get the model name of the response.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_name

Get the provider name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_url

Get the provider base URL.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### timestamp

Get the timestamp of the response.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime)

### LatestHuggingFaceModelNames

Latest Hugging Face models.

**Default:** `Literal['deepseek-ai/DeepSeek-R1', 'meta-llama/Llama-3.3-70B-Instruct', 'meta-llama/Llama-4-Maverick-17B-128E-Instruct', 'meta-llama/Llama-4-Scout-17B-16E-Instruct', 'Qwen/QwQ-32B', 'Qwen/Qwen2.5-72B-Instruct', 'Qwen/Qwen3-235B-A22B', 'Qwen/Qwen3-32B']`

### HuggingFaceModelName

Possible Hugging Face model names.

You can browse available models [here](https://huggingface.co/models?pipeline_tag=text-generation&inference_provider=all&sort=trending).

**Default:** `str | LatestHuggingFaceModelNames`

---

# [pydantic_ai.models.instrumented](https://pydantic.dev/docs/ai/api/models/instrumented/)

# pydantic\_ai.models.instrumented

### InstrumentationSettings

Options for instrumenting models and agents with OpenTelemetry.

Used in:

-   `Agent(instrument=...)`
-   [`Agent.instrument_all()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.instrument_all)
-   [`InstrumentedModel`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentedModel)

See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info.

#### Methods

##### \_\_init\_\_

```python
def __init__(
    tracer_provider: TracerProvider | None = None,
    meter_provider: MeterProvider | None = None,
    include_binary_content: bool = True,
    include_content: bool = True,
    version: Literal[1, 2, 3, 4, 5] = DEFAULT_INSTRUMENTATION_VERSION,
    event_mode: Literal['attributes', 'logs'] = 'attributes',
    logger_provider: LoggerProvider | None = None,
    use_aggregated_usage_attribute_names: bool = False,
)
```

Create instrumentation options.

###### Parameters

**`tracer_provider`** : `TracerProvider` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The OpenTelemetry tracer provider to use. If not provided, the global tracer provider is used. Calling `logfire.configure()` sets the global tracer provider, so most users don't need this.

**`meter_provider`** : `MeterProvider` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The OpenTelemetry meter provider to use. If not provided, the global meter provider is used. Calling `logfire.configure()` sets the global meter provider, so most users don't need this.

**`include_binary_content`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to include binary content in the instrumentation events.

**`include_content`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to include prompts, completions, and tool call arguments and responses in the instrumentation events.

**`version`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\[1, 2, 3, 4, 5\] _Default:_ `DEFAULT_INSTRUMENTATION_VERSION`

Version of the data format. This is unrelated to the Pydantic AI package version. Version 1 is based on the legacy event-based OpenTelemetry GenAI spec and will be removed in a future release. The parameters `event_mode` and `logger_provider` are only relevant for version 1. Version 2 uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes:

-   `gen_ai.system_instructions` for instructions passed to the agent.
-   `gen_ai.input.messages` and `gen_ai.output.messages` on model request spans.
-   `pydantic_ai.all_messages` on agent run spans. Version 3 is the same as version 2, with additional support for thinking tokens. Version 4 is the same as version 3, with GenAI semantic conventions for multimodal content: URL-based media uses type='uri' with uri and mime\_type fields (and modality for image/audio/video). Inline binary content uses type='blob' with mime\_type and content fields (and modality for image/audio/video). [https://opentelemetry.io/docs/specs/semconv/gen-ai/non-normative/examples-llm-calls/#multimodal-inputs-example](https://opentelemetry.io/docs/specs/semconv/gen-ai/non-normative/examples-llm-calls/#multimodal-inputs-example) Version 5 is the same as version 4, but CallDeferred and ApprovalRequired exceptions no longer record an exception event or set the span status to ERROR -- the span is left as UNSET, since deferrals are control flow, not errors.

**`event_mode`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['attributes', 'logs'\] _Default:_ `'attributes'`

The mode for emitting events in version 1. If `'attributes'`, events are attached to the span as attributes. If `'logs'`, events are emitted as OpenTelemetry log-based events.

**`logger_provider`** : `LoggerProvider` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The OpenTelemetry logger provider to use. If not provided, the global logger provider is used. Calling `logfire.configure()` sets the global logger provider, so most users don't need this. This is only used if `event_mode='logs'` and `version=1`.

**`use_aggregated_usage_attribute_names`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether to use `gen_ai.aggregated_usage.*` attribute names for token usage on agent run spans instead of the standard `gen_ai.usage.*` names. Enable this to prevent double-counting in observability backends that aggregate span attributes across parent and child spans. Defaults to False. Note: `gen_ai.aggregated_usage.*` is a custom namespace, not part of the OpenTelemetry Semantic Conventions. It may be updated if OTel introduces an official convention.

##### messages\_to\_otel\_events

```python
def messages_to_otel_events(
    messages: list[ModelMessage],
    parameters: ModelRequestParameters | None = None,
) -> list[LogRecord]
```

Convert a list of model messages to OpenTelemetry events.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`LogRecord`\] -- A list of OpenTelemetry events.

###### Parameters

**`messages`** : [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\]

The messages to convert.

**`parameters`** : `ModelRequestParameters` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model request parameters.

### InstrumentedModel

**Bases:** `WrapperModel`

Model which wraps another model so that requests are instrumented with OpenTelemetry.

See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info.

#### Attributes

##### instrumentation\_settings

Instrumentation settings for this model.

**Type:** [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) **Default:** `options or InstrumentationSettings()`

### CostCalculationFailedWarning

**Bases:** [`Warning`](https://docs.python.org/3/library/exceptions.html#Warning)

Warning raised when cost calculation fails.

### instrument\_model

```python
def instrument_model(model: Model, instrument: InstrumentationSettings | bool) -> Model
```

Instrument a model with OpenTelemetry/logfire.

#### Returns

`Model`

---

# [pydantic_ai.models.mcp_sampling](https://pydantic.dev/docs/ai/api/models/mcp-sampling/)

# pydantic\_ai.models.mcp\_sampling

### MCPSamplingModelSettings

**Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings)

Settings used for an MCP Sampling model request.

#### Attributes

##### mcp\_model\_preferences

Model preferences to use for MCP Sampling.

**Type:** `ModelPreferences`

### MCPSamplingModel

**Bases:** `Model`

A model that uses MCP Sampling.

[MCP Sampling](https://modelcontextprotocol.io/docs/concepts/sampling) allows an MCP server to make requests to a model by calling back to the MCP client that connected to it.

#### Attributes

##### session

The MCP server session to use for sampling.

**Type:** `ServerSession`

##### default\_max\_tokens

Default max tokens to use if not set in [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings.max_tokens).

Max tokens is a required parameter for MCP Sampling, but optional on [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings), so this value is used as fallback.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `16384`

##### model\_name

The model name.

Since the model name isn't known until the request is made, this property always returns `'mcp-sampling'`.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### system

The system / model provider, returns `'MCP'`.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

---

# [pydantic_ai.models.mistral](https://pydantic.dev/docs/ai/api/models/mistral/)

# pydantic\_ai.models.mistral

## Setup

For details on how to set up authentication with this model, see [model configuration for Mistral](/docs/ai/models/mistral).

### MistralModelSettings

**Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings)

Settings used for a Mistral model request.

### MistralModel

**Bases:** `Model[Mistral]`

A model that uses Mistral.

Internally, this uses the [Mistral Python client](https://github.com/mistralai/client-python) to interact with the API.

[API Documentation](https://docs.mistral.ai/)

#### Attributes

##### model\_name

The model name.

**Type:** `MistralModelName`

##### system

The model provider.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model_name: MistralModelName,
    provider: Literal['mistral'] | Provider[Mistral] = 'mistral',
    profile: ModelProfileSpec | None = None,
    json_mode_schema_prompt: str = 'Answer in JSON Object, respect the format:\n```\n{schema}\n```\n',
    settings: ModelSettings | None = None,
)
```

Initialize a Mistral model.

###### Parameters

**`model_name`** : `MistralModelName`

The name of the model to use.

**`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['mistral'\] | `Provider`\[`Mistral`\] _Default:_ `'mistral'`

The provider to use for authentication and API access. Can be either the string 'mistral' or an instance of `Provider[Mistral]`. If not provided, a new provider will be created using the other parameters.

**`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model profile to use. Defaults to a profile picked by the provider based on the model name.

**`json_mode_schema_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `'Answer in JSON Object, respect the format:\n```\n{schema}\n```\n'`

The prompt to show when the model expects a JSON object as input.

**`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Model-specific settings that will be used as defaults for this model.

##### request

`@async`

```python
def request(
    messages: list[ModelMessage],
    model_settings: ModelSettings | None,
    model_request_parameters: ModelRequestParameters,
) -> ModelResponse
```

Make a non-streaming request to the model from Pydantic AI call.

###### Returns

[`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse)

##### request\_stream

`@async`

```python
def request_stream(
    messages: list[ModelMessage],
    model_settings: ModelSettings | None,
    model_request_parameters: ModelRequestParameters,
    run_context: RunContext[Any] | None = None,
) -> AsyncIterator[StreamedResponse]
```

Make a streaming request to the model from Pydantic AI call.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`StreamedResponse`\]

### MistralStreamedResponse

**Bases:** `StreamedResponse`

Implementation of `StreamedResponse` for Mistral models.

#### Attributes

##### model\_name

Get the model name of the response.

**Type:** `MistralModelName`

##### provider\_name

Get the provider name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_url

Get the provider base URL.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### timestamp

Get the timestamp of the response.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime)

### LatestMistralModelNames

Latest Mistral models.

**Default:** `Literal['mistral-large-latest', 'mistral-small-latest', 'codestral-latest', 'mistral-moderation-latest']`

### MistralModelName

Possible Mistral model names.

Since Mistral supports a variety of date-stamped models, we explicitly list the most popular models but allow any name in the type hints. Since [the Mistral docs](https://docs.mistral.ai/getting-started/models/models_overview/) for a full list.

**Default:** `str | LatestMistralModelNames`

---

# [pydantic_ai.models.ollama](https://pydantic.dev/docs/ai/api/models/ollama/)

# pydantic\_ai.models.ollama

## Setup

For details on how to set up authentication with this model, see [model configuration for Ollama](/docs/ai/models/ollama).

Ollama model implementation using OpenAI-compatible API.

### OllamaModel

**Bases:** `OpenAIChatModel`

A model that uses Ollama's OpenAI-compatible Chat Completions API.

Self-hosted Ollama (v0.5.0+) honors `response_format` with `json_schema` via `llama.cpp`'s grammar-constrained decoder, so `NativeOutput` produces schema-valid output at generation time.

Ollama Cloud currently accepts `response_format` with `json_schema` without error but does not enforce the schema upstream (see [pydantic-ai#4917](https://github.com/pydantic/pydantic-ai/issues/4917) and [ollama/ollama#12362](https://github.com/ollama/ollama/issues/12362)). When this model detects a Cloud path -- either a `base_url` on `ollama.com` or a model name ending in `-cloud` -- it disables `supports_json_schema_output` on the resolved profile. With that flag off, [`NativeOutput`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.NativeOutput) raises a clear [`UserError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UserError) so users pick a mode that actually works on Cloud ([`ToolOutput`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.ToolOutput) -- the default -- and [`PromptedOutput`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.PromptedOutput) are both verified to work).

Apart from `__init__`, all methods are inherited from the base class.

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model_name: str,
    provider: Literal['ollama'] | Provider[AsyncOpenAI] = 'ollama',
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
)
```

Initialize an Ollama model.

###### Parameters

**`model_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The name of the Ollama model to use (e.g. `'qwen3'`, `'llama3.2'`).

**`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['ollama'\] | `Provider`\[`AsyncOpenAI`\] _Default:_ `'ollama'`

The provider to use. Defaults to `'ollama'`.

**`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model profile to use. Defaults to a profile picked by the provider based on the model name, adjusted to disable `supports_json_schema_output` when the request routes through Ollama Cloud.

**`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Model-specific settings that will be used as defaults for this model.

---

# [pydantic_ai.models.openai](https://pydantic.dev/docs/ai/api/models/openai/)

# pydantic\_ai.models.openai

## Setup

For details on how to set up authentication with this model, see [model configuration for OpenAI](/docs/ai/models/openai).

### OpenAIChatModelSettings

**Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings)

Settings used for an OpenAI model request.

#### Attributes

##### openai\_reasoning\_effort

Constrains effort on reasoning for [reasoning models](https://platform.openai.com/docs/guides/reasoning).

Currently supported values are `low`, `medium`, and `high`. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

**Type:** `ReasoningEffort`

##### openai\_logprobs

Include log probabilities in the response.

For Chat models, these will be included in `ModelResponse.provider_details['logprobs']`. For Responses models, these will be included in the response output parts `TextPart.provider_details['logprobs']`.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### openai\_top\_logprobs

Include log probabilities of the top n tokens in the response.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

##### openai\_store

Whether or not to store the output of this request in OpenAI's systems.

If `False`, OpenAI will not store the request for its own internal review or training. See [OpenAI API reference](https://platform.openai.com/docs/api-reference/chat/create#chat-create-store).

When used with `OpenAIResponsesModel`, stored responses appear in OpenAI's dashboard and can be referenced via [`openai_previous_response_id`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings.openai_previous_response_id). Pair this with `openai_previous_response_id='auto'` to avoid storing duplicate copies of the conversation history across retries and subsequent requests within the same run.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### openai\_user

A unique identifier representing the end-user, which can help OpenAI monitor and detect abuse.

See [OpenAI's safety best practices](https://platform.openai.com/docs/guides/safety-best-practices#end-user-ids) for more details.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### openai\_service\_tier

The service tier to use for the model request.

Currently supported values are `auto`, `default`, `flex`, and `priority`. For more information, see [OpenAI's service tiers documentation](https://platform.openai.com/docs/api-reference/chat/object#chat/object-service_tier).

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['auto', 'default', 'flex', 'priority'\]

##### openai\_prediction

Enables [predictive outputs](https://platform.openai.com/docs/guides/predicted-outputs).

This feature is currently only supported for some OpenAI models.

**Type:** `ChatCompletionPredictionContentParam`

##### openai\_prompt\_cache\_key

Used by OpenAI to cache responses for similar requests to optimize your cache hit rates.

See the [OpenAI Prompt Caching documentation](https://platform.openai.com/docs/guides/prompt-caching#how-it-works) for more information.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### openai\_prompt\_cache\_retention

The retention policy for the prompt cache. Set to 24h to enable extended prompt caching, which keeps cached prefixes active for longer, up to a maximum of 24 hours.

See the [OpenAI Prompt Caching documentation](https://platform.openai.com/docs/guides/prompt-caching#how-it-works) for more information.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['in\_memory', '24h'\]

##### openai\_continuous\_usage\_stats

When True, enables continuous usage statistics in streaming responses.

When enabled, the API returns cumulative usage data with each chunk rather than only at the end. This setting correctly handles the cumulative nature of these stats by using only the final usage values rather than summing all intermediate values.

See [OpenAI's streaming documentation](https://platform.openai.com/docs/api-reference/chat/create#stream_options) for more information.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

### OpenAIModelSettings

**Bases:** `OpenAIChatModelSettings`

Deprecated alias for `OpenAIChatModelSettings`.

### OpenAIResponsesModelSettings

**Bases:** `OpenAIChatModelSettings`

Settings used for an OpenAI Responses model request.

ALL FIELDS MUST BE `openai_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS.

#### Attributes

##### openai\_builtin\_tools

The provided OpenAI built-in tools to use.

See [OpenAI's built-in tools](https://platform.openai.com/docs/guides/tools?api-mode=responses) for more details.

**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`FileSearchToolParam` | `WebSearchToolParam` | `ComputerToolParam`\]

##### openai\_reasoning\_generate\_summary

Deprecated alias for `openai_reasoning_summary`.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['detailed', 'concise'\]

##### openai\_reasoning\_summary

A summary of the reasoning performed by the model.

This can be useful for debugging and understanding the model's reasoning process. One of `concise`, `detailed`, or `auto`.

Check the [OpenAI Reasoning documentation](https://platform.openai.com/docs/guides/reasoning?api-mode=responses#reasoning-summaries) for more details.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['detailed', 'concise', 'auto'\]

##### openai\_send\_reasoning\_ids

Whether to send the unique IDs of reasoning, text, and function call parts from the message history to the model. Enabled by default for reasoning models.

This can result in errors like `"Item 'rs_123' of type 'reasoning' was provided without its required following item."` if the message history you're sending does not match exactly what was received from the Responses API in a previous response, for example if you're using a [history processor](/docs/ai/core-concepts/message-history#processing-message-history). In that case, you'll want to disable this.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### openai\_truncation

The truncation strategy to use for the model response.

It can be either:

-   `disabled` (default): If a model response will exceed the context window size for a model, the request will fail with a 400 error.
-   `auto`: If the context of this response and previous ones exceeds the model's context window size, the model will truncate the response to fit the context window by dropping input items in the middle of the conversation.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['disabled', 'auto'\]

##### openai\_text\_verbosity

Constrains the verbosity of the model's text response.

Lower values will result in more concise responses, while higher values will result in more verbose responses. Currently supported values are `low`, `medium`, and `high`.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['low', 'medium', 'high'\]

##### openai\_previous\_response\_id

Reference a prior OpenAI response to continue a conversation server-side, omitting already-stored messages from the input.

-   `'auto'`: chain to the most recent `provider_response_id` in the message history. If the history contains no such response, no `previous_response_id` is sent.
-   A concrete response ID string: use it as the seed for the first request in the run (e.g. to continue from a prior turn). On subsequent in-run requests (retries, tool-call continuations), the most recent `provider_response_id` from the message history takes precedence so the chain extends correctly without re-sending messages that are already server-side.

In both cases, messages that precede the chosen response in the history are omitted from the input, since OpenAI reconstructs them from server-side state.

Requires the referenced response to have been stored (see `openai_store`, which defaults to `True` on OpenAI's side). Not compatible with Zero Data Retention.

See the [OpenAI Responses API documentation](https://platform.openai.com/docs/guides/reasoning#keeping-reasoning-items-in-context) for more information.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['auto'\] | [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### openai\_conversation\_id

Reference an OpenAI conversation to continue durable conversation state server-side.

-   `'auto'`: use the most recent OpenAI conversation ID from `ModelResponse.provider_details['conversation_id']` in the message history with the same Pydantic AI `conversation_id`, when available. If the history contains no such response, no `conversation` is sent.
-   A concrete conversation ID string: use it as the OpenAI Responses API `conversation` parameter.

When a matching conversation ID is found in message history, messages that precede that response are omitted from the input, since OpenAI reconstructs them from the server-side conversation.

Not compatible with [`openai_previous_response_id`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings.openai_previous_response_id).

See the [OpenAI conversation state documentation](https://platform.openai.com/docs/guides/conversation-state) for more information.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['auto'\] | [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### openai\_include\_code\_execution\_outputs

Whether to include the code execution results in the response.

Corresponds to the `code_interpreter_call.outputs` value of the `include` parameter in the Responses API.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### openai\_include\_web\_search\_sources

Whether to include the web search results in the response.

Corresponds to the `web_search_call.action.sources` value of the `include` parameter in the Responses API.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### openai\_include\_file\_search\_results

Whether to include the file search results in the response.

Corresponds to the `file_search_call.results` value of the `include` parameter in the Responses API.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### openai\_include\_raw\_annotations

Whether to include the raw annotations in `TextPart.provider_details`.

When enabled, any annotations (e.g., citations from web search) will be available in the `provider_details['annotations']` field of text parts. This is opt-in since there may be overlap with native annotation support once added via [https://github.com/pydantic/pydantic-ai/issues/3126](https://github.com/pydantic/pydantic-ai/issues/3126).

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### openai\_context\_management

Context management configuration for the request.

This enables OpenAI's server-side automatic compaction inside the regular `/responses` call, as opposed to the standalone `/responses/compact` endpoint. See [OpenAI's compaction guide](https://developers.openai.com/api/docs/guides/compaction) for details.

The [`OpenAICompaction`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAICompaction) capability sets this automatically in its default (stateful) mode.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`ContextManagement`\]

### OpenAIChatModel

**Bases:** `Model[AsyncOpenAI]`

A model that uses the OpenAI API.

Internally, this uses the [OpenAI Python client](https://github.com/openai/openai-python) to interact with the API.

Apart from `__init__`, all methods are private or match those of the base class.

#### Attributes

##### model\_name

The model name.

**Type:** `OpenAIModelName`

##### system

The model provider.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### profile

The model profile.

WebSearchTool is only supported if openai\_chat\_supports\_web\_search is True.

**Type:** [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model_name: OpenAIModelName,
    provider: OpenAIChatCompatibleProvider | Literal['openai', 'openai-chat', 'gateway'] | Provider[AsyncOpenAI] = 'openai',
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
) -> None
def __init__(
    model_name: OpenAIModelName,
    provider: OpenAIChatCompatibleProvider | Literal['openai', 'openai-chat', 'gateway'] | Provider[AsyncOpenAI] = 'openai',
    profile: ModelProfileSpec | None = None,
    system_prompt_role: OpenAISystemPromptRole | None = None,
    settings: ModelSettings | None = None,
) -> None
```

Initialize an OpenAI model.

###### Parameters

**`model_name`** : `OpenAIModelName`

The name of the OpenAI model to use. List of model names available [here](https://github.com/openai/openai-python/blob/v1.54.3/src/openai/types/chat_model.py#L7) (Unfortunately, despite being ask to do so, OpenAI do not provide `.inv` files for their API).

**`provider`** : `OpenAIChatCompatibleProvider` | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['openai', 'openai-chat', 'gateway'\] | `Provider`\[`AsyncOpenAI`\] _Default:_ `'openai'`

The provider to use. Defaults to `'openai'`.

**`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model profile to use. Defaults to a profile picked by the provider based on the model name.

**`system_prompt_role`** : `OpenAISystemPromptRole` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The role to use for the system prompt message. If not provided, defaults to `'system'`. In the future, this may be inferred from the model name.

**`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Default model settings for this model instance.

##### supported\_builtin\_tools

`@classmethod`

```python
def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]
```

Return the set of builtin tool types this model can handle.

###### Returns

[`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\]

### OpenAIModel

**Bases:** `OpenAIChatModel`

Deprecated alias for `OpenAIChatModel`.

### OpenAIResponsesModel

**Bases:** `Model[AsyncOpenAI]`

A model that uses the OpenAI Responses API.

The [OpenAI Responses API](https://platform.openai.com/docs/api-reference/responses) is the new API for OpenAI models.

If you are interested in the differences between the Responses API and the Chat Completions API, see the [OpenAI API docs](https://platform.openai.com/docs/guides/responses-vs-chat-completions).

#### Attributes

##### model\_name

The model name.

**Type:** `OpenAIModelName`

##### system

The model provider.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model_name: OpenAIModelName,
    provider: OpenAIResponsesCompatibleProvider | Literal['openai', 'gateway'] | Provider[AsyncOpenAI] = 'openai',
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
)
```

Initialize an OpenAI Responses model.

###### Parameters

**`model_name`** : `OpenAIModelName`

The name of the OpenAI model to use.

**`provider`** : `OpenAIResponsesCompatibleProvider` | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['openai', 'gateway'\] | `Provider`\[`AsyncOpenAI`\] _Default:_ `'openai'`

The provider to use. Defaults to `'openai'`.

**`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model profile to use. Defaults to a profile picked by the provider based on the model name.

**`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Default model settings for this model instance.

##### supported\_builtin\_tools

`@classmethod`

```python
def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]
```

Return the set of builtin tool types this model can handle.

###### Returns

[`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\]

##### compact\_messages

`@async`

```python
def compact_messages(
    request_context: ModelRequestContext,
    instructions: str | None = None,
) -> ModelResponse
```

Compact messages using the OpenAI Responses compaction endpoint.

This calls OpenAI's `responses.compact` API to produce an encrypted compaction that summarizes the conversation history. The returned `ModelResponse` contains a single `CompactionPart` that must be round-tripped in subsequent requests.

###### Returns

[`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) -- A `ModelResponse` with a single `CompactionPart` containing the encrypted compaction data.

###### Parameters

**`request_context`** : `ModelRequestContext`

The model request context containing messages, settings, and parameters.

**`instructions`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional custom instructions for the compaction summarization. If provided, these override the agent-level instructions.

### OpenAIStreamedResponse

**Bases:** `StreamedResponse`

Implementation of `StreamedResponse` for OpenAI models.

#### Attributes

##### model\_name

Get the model name of the response.

**Type:** `OpenAIModelName`

##### provider\_name

Get the provider name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_url

Get the provider base URL.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### timestamp

Get the timestamp of the response.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime)

### OpenAIResponsesStreamedResponse

**Bases:** `StreamedResponse`

Implementation of `StreamedResponse` for OpenAI Responses API.

#### Attributes

##### model\_name

Get the model name of the response.

**Type:** `OpenAIModelName`

##### provider\_name

Get the provider name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_url

Get the provider base URL.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### timestamp

Get the timestamp of the response.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime)

### OpenAICompaction

**Bases:** `AbstractCapability[AgentDepsT]`

Compaction capability for OpenAI Responses API.

Automatically compacts conversation history to keep long-running agent runs within manageable context limits. Two modes are supported, selected by the `stateless` flag:

-   **Stateful mode** (default, `stateless=False`): configures [OpenAI's server-side auto-compaction](https://developers.openai.com/api/docs/guides/compaction) via the `context_management` field on the regular `/responses` request. The server triggers compaction when input tokens cross a threshold, and the compacted item is returned alongside the normal response. Compatible with [`openai_previous_response_id='auto'`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings.openai_previous_response_id) and server-side conversation state.
    
    Configurable with `token_threshold` (`compact_threshold` on the API). If omitted, OpenAI picks a server-side default.
    
-   **Stateless mode** (`stateless=True`): calls the stateless `/responses/compact` endpoint from a `before_model_request` hook when your trigger condition is met. Use this in [ZDR](https://openai.com/enterprise-privacy/) environments where OpenAI must not retain conversation data, when you set `openai_store=False`, or when you need explicit out-of-band control over when compaction runs.
    
    Requires either `message_count_threshold` or a custom `trigger` callable.
    

If `stateless` is not set, it is inferred from which parameters you provide: passing any stateless-only parameter (`message_count_threshold` or `trigger`) implies `stateless=True`; otherwise stateful mode is used.

Example usage::

from pydantic\_ai import Agent from pydantic\_ai.models.openai import OpenAICompaction

# Stateful mode with OpenAI's server-side default threshold:

agent = Agent( 'openai-responses:gpt-5.2', capabilities=\[OpenAICompaction()\], )

# Stateful mode with a custom token threshold:

agent = Agent( 'openai-responses:gpt-5.2', capabilities=\[OpenAICompaction(token\_threshold=100\_000)\], )

# Stateless mode for ZDR environments or explicit control:

agent = Agent( 'openai-responses:gpt-5.2', capabilities=\[OpenAICompaction(message\_count\_threshold=20)\], )

#### Methods

##### \_\_init\_\_

```python
def __init__(
    stateless: bool | None = None,
    token_threshold: int | None = None,
    message_count_threshold: int | None = None,
    trigger: Callable[[list[ModelMessage]], bool] | None = None,
    instructions: str | None = None,
) -> None
```

Initialize the OpenAI compaction capability.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`stateless`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Select the compaction mode explicitly. If `None` (the default), the mode is inferred from the other parameters: passing any stateless-only parameter (`message_count_threshold` or `trigger`) implies `stateless=True`; otherwise stateful mode is used.

**`token_threshold`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Stateful-mode only. Input token threshold at which OpenAI's server-side compaction is triggered. Corresponds to `compact_threshold` in the `context_management` API field. If `None`, OpenAI picks a server-side default.

**`message_count_threshold`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Stateless-mode only. Compact when the message count exceeds this threshold.

**`trigger`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\]\], [`bool`](https://docs.python.org/3/library/functions.html#bool)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Stateless-mode only. Custom callable that decides whether to compact based on the current messages. Takes precedence over `message_count_threshold`.

**`instructions`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Deprecated. OpenAI's `/compact` endpoint treats `instructions` as a system/developer message inserted into the compaction model's context, not as a directive for how to summarize the conversation. This does not match [`AnthropicCompaction.instructions`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicCompaction) semantics, so the field is deprecated and will be removed in a future version.

### DEPRECATED\_OPENAI\_MODELS

Models that are deprecated or don't exist but are still present in the OpenAI SDK's type definitions.

**Type:** [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] **Default:** `frozenset({'chatgpt-4o-latest', 'codex-mini-latest', 'gpt-4-0125-preview', 'gpt-4-1106-preview', 'gpt-4-turbo-preview', 'gpt-4-32k', 'gpt-4-32k-0314', 'gpt-4-32k-0613', 'gpt-4-vision-preview', 'gpt-4o-audio-preview-2024-10-01', 'gpt-5.1-mini', 'o1-mini', 'o1-mini-2024-09-12', 'o1-preview', 'o1-preview-2024-09-12'})`

### OpenAIModelName

Possible OpenAI model names.

Since OpenAI supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [the OpenAI docs](https://platform.openai.com/docs/models) for a full list.

Using this more broad type for the model name instead of the ChatModel definition allows this model to be used more easily with other model types (ie, Ollama, Deepseek).

**Default:** `str | AllModels`

### MCP\_SERVER\_TOOL\_CONNECTOR\_URI\_SCHEME

Prefix for OpenAI connector IDs. OpenAI supports either a URL or a connector ID when passing MCP configuration to a model, by using that prefix like `x-openai-connector:<connector-id>` in a URL, you can pass a connector ID to a model.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['x-openai-connector'\] **Default:** `'x-openai-connector'`

---

# [pydantic_ai.models.openrouter](https://pydantic.dev/docs/ai/api/models/openrouter/)

# pydantic\_ai.models.openrouter

## Setup

For details on how to set up authentication with this model, see [model configuration for OpenRouter](/docs/ai/models/openrouter).

### OpenRouterProviderConfig

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Represents the 'Provider' object from the OpenRouter API.

#### Attributes

##### order

List of provider slugs to try in order (e.g. \["anthropic", "openai"\]). [See details](https://openrouter.ai/docs/features/provider-routing#ordering-specific-providers)

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`OpenRouterProviderName`\]

##### allow\_fallbacks

Whether to allow backup providers when the primary is unavailable. [See details](https://openrouter.ai/docs/features/provider-routing#disabling-fallbacks)

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### require\_parameters

Only use providers that support all parameters in your request.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### data\_collection

Control whether to use providers that may store data. [See details](https://openrouter.ai/docs/features/provider-routing#requiring-providers-to-comply-with-data-policies)

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['allow', 'deny'\]

##### zdr

Restrict routing to only ZDR (Zero Data Retention) endpoints. [See details](https://openrouter.ai/docs/features/provider-routing#zero-data-retention-enforcement)

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### only

List of provider slugs to allow for this request. [See details](https://openrouter.ai/docs/features/provider-routing#allowing-only-specific-providers)

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`OpenRouterProviderName`\]

##### ignore

List of provider slugs to skip for this request. [See details](https://openrouter.ai/docs/features/provider-routing#ignoring-providers)

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### quantizations

List of quantization levels to filter by (e.g. \["int4", "int8"\]). [See details](https://openrouter.ai/docs/features/provider-routing#quantization)

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['int4', 'int8', 'fp4', 'fp6', 'fp8', 'fp16', 'bf16', 'fp32', 'unknown'\]\]

##### sort

Sort providers by price or throughput. (e.g. "price" or "throughput"). [See details](https://openrouter.ai/docs/features/provider-routing#provider-sorting)

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['price', 'throughput', 'latency'\]

##### max\_price

The maximum pricing you want to pay for this request. [See details](https://openrouter.ai/docs/features/provider-routing#max-price)

**Type:** `_OpenRouterMaxPrice`

### OpenRouterReasoning

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Configuration for reasoning tokens in OpenRouter requests.

Reasoning tokens allow models to show their step-by-step thinking process. You can configure this using either OpenAI-style effort levels or Anthropic-style token limits, but not both simultaneously.

#### Attributes

##### effort

OpenAI-style reasoning effort level. Cannot be used with max\_tokens.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['xhigh', 'high', 'medium', 'low', 'minimal', 'none'\]

##### max\_tokens

Anthropic-style specific token limit for reasoning. Cannot be used with effort.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

##### exclude

Whether to exclude reasoning tokens from the response. Default is False. All models support this.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### enabled

Whether to enable reasoning with default parameters. Default is inferred from effort or max\_tokens.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

### OpenRouterUsageConfig

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Configuration for OpenRouter usage.

### OpenRouterModelSettings

**Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings)

Settings used for an OpenRouter model request.

#### Attributes

##### openrouter\_models

A list of fallback models.

These models will be tried, in order, if the main model returns an error. [See details](https://openrouter.ai/docs/features/model-routing#the-models-parameter)

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### openrouter\_provider

OpenRouter routes requests to the best available providers for your model. By default, requests are load balanced across the top providers to maximize uptime.

You can customize how your requests are routed using the provider object. [See more](https://openrouter.ai/docs/features/provider-routing)

**Type:** `OpenRouterProviderConfig`

##### openrouter\_preset

Presets allow you to separate your LLM configuration from your code.

Create and manage presets through the OpenRouter web application to control provider routing, model selection, system prompts, and other parameters, then reference them in OpenRouter API requests. [See more](https://openrouter.ai/docs/features/presets)

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### openrouter\_transforms

To help with prompts that exceed the maximum context size of a model.

Transforms work by removing or truncating messages from the middle of the prompt, until the prompt fits within the model's context window. [See more](https://openrouter.ai/docs/features/message-transforms)

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`OpenRouterTransforms`\]

##### openrouter\_reasoning

To control the reasoning tokens in the request.

The reasoning config object consolidates settings for controlling reasoning strength across different models. [See more](https://openrouter.ai/docs/use-cases/reasoning-tokens)

**Type:** `OpenRouterReasoning`

##### openrouter\_usage

To control the usage of the model.

The usage config object consolidates settings for enabling detailed usage information. [See more](https://openrouter.ai/docs/use-cases/usage-accounting)

**Type:** `OpenRouterUsageConfig`

### OpenRouterModel

**Bases:** `OpenAIChatModel`

Extends OpenAIModel to capture extra metadata for Openrouter.

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model_name: str,
    provider: Literal['openrouter'] | Provider[AsyncOpenAI] = 'openrouter',
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
)
```

Initialize an OpenRouter model.

###### Parameters

**`model_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The name of the model to use.

**`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['openrouter'\] | `Provider`\[`AsyncOpenAI`\] _Default:_ `'openrouter'`

The provider to use for authentication and API access. If not provided, a new provider will be created with the default settings.

**`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model profile to use. Defaults to a profile picked by the provider based on the model name.

**`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Model-specific settings that will be used as defaults for this model.

##### supported\_builtin\_tools

`@classmethod`

```python
def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]
```

Return the set of builtin tool types this model can handle.

OpenRouter supports web search via its plugins system.

###### Returns

[`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\]

### OpenRouterStreamedResponse

**Bases:** `OpenAIStreamedResponse`

Implementation of `StreamedResponse` for OpenRouter models.

### KnownOpenRouterProviders

Known providers in the OpenRouter marketplace

**Default:** `Literal['z-ai', 'cerebras', 'venice', 'moonshotai', 'morph', 'stealth', 'wandb', 'klusterai', 'openai', 'sambanova', 'amazon-bedrock', 'mistral', 'nextbit', 'atoma', 'ai21', 'minimax', 'baseten', 'anthropic', 'featherless', 'groq', 'lambda', 'azure', 'ncompass', 'deepseek', 'hyperbolic', 'crusoe', 'cohere', 'mancer', 'avian', 'perplexity', 'novita', 'siliconflow', 'switchpoint', 'xai', 'inflection', 'fireworks', 'deepinfra', 'inference-net', 'inception', 'atlas-cloud', 'nvidia', 'alibaba', 'friendli', 'infermatic', 'targon', 'ubicloud', 'aion-labs', 'liquid', 'nineteen', 'cloudflare', 'nebius', 'chutes', 'enfer', 'crofai', 'open-inference', 'phala', 'gmicloud', 'meta', 'relace', 'parasail', 'together', 'google-ai-studio', 'google-vertex']`

### OpenRouterProviderName

Possible OpenRouter provider names.

Since OpenRouter is constantly updating their list of providers, we explicitly list some known providers but allow any name in the type hints. See [the OpenRouter API](https://openrouter.ai/docs/api-reference/list-available-providers) for a full list.

**Default:** `str | KnownOpenRouterProviders`

### OpenRouterTransforms

Available messages transforms for OpenRouter models with limited token windows.

Currently only supports 'middle-out', but is expected to grow in the future.

**Default:** `Literal['middle-out']`

---

# [pydantic_ai.models.outlines](https://pydantic.dev/docs/ai/api/models/outlines/)

# pydantic\_ai.models.outlines

## Setup

For details on how to set up this model, see [model configuration for Outlines](/docs/ai/models/outlines).

### OutlinesModel

**Bases:** `Model`

A model that relies on the Outlines library to run non API-based models.

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model: OutlinesBaseModel | OutlinesAsyncBaseModel,
    provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines',
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
)
```

Initialize an Outlines model.

###### Parameters

**`model`** : `OutlinesBaseModel` | `OutlinesAsyncBaseModel`

The Outlines model used for the model.

**`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['outlines'\] | `Provider`\[`OutlinesBaseModel`\] _Default:_ `'outlines'`

The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, the other parameters will be used.

**`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model profile to use. Defaults to a profile picked by the provider.

**`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Default model settings for this model instance.

##### from\_transformers

`@classmethod`

```python
def from_transformers(
    cls,
    hf_model: transformers.modeling_utils.PreTrainedModel,
    hf_tokenizer_or_processor: transformers.PreTrainedTokenizer | transformers.processing_utils.ProcessorMixin,
    provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines',
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
)
```

Create an Outlines model from a Hugging Face model and tokenizer.

###### Parameters

**`hf_model`** : `transformers.modeling_utils.PreTrainedModel`

The Hugging Face PreTrainedModel or any model that is compatible with the `transformers` API.

**`hf_tokenizer_or_processor`** : `transformers.PreTrainedTokenizer` | `transformers.processing_utils.ProcessorMixin`

Either a HuggingFace `PreTrainedTokenizer` or any tokenizer that is compatible with the `transformers` API, or a HuggingFace processor inheriting from `ProcessorMixin`. If a tokenizer is provided, a regular model will be used, while if you provide a processor, it will be a multimodal model.

**`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['outlines'\] | `Provider`\[`OutlinesBaseModel`\] _Default:_ `'outlines'`

The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, the other parameters will be used.

**`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model profile to use. Defaults to a profile picked by the provider.

**`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Default model settings for this model instance.

##### from\_llamacpp

`@classmethod`

```python
def from_llamacpp(
    cls,
    llama_model: llama_cpp.Llama,
    provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines',
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
)
```

Create an Outlines model from a LlamaCpp model.

###### Parameters

**`llama_model`** : `llama_cpp.Llama`

The llama\_cpp.Llama model to use.

**`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['outlines'\] | `Provider`\[`OutlinesBaseModel`\] _Default:_ `'outlines'`

The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, the other parameters will be used.

**`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model profile to use. Defaults to a profile picked by the provider.

**`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Default model settings for this model instance.

##### from\_mlxlm

`@classmethod`

```python
def from_mlxlm(
    cls,
    mlx_model: nn.Module,
    mlx_tokenizer: transformers.PreTrainedTokenizer,
    provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines',
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
)
```

Create an Outlines model from a MLXLM model.

###### Parameters

**`mlx_model`** : `nn.Module`

The nn.Module model to use.

**`mlx_tokenizer`** : `transformers.PreTrainedTokenizer`

The PreTrainedTokenizer to use.

**`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['outlines'\] | `Provider`\[`OutlinesBaseModel`\] _Default:_ `'outlines'`

The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, the other parameters will be used.

**`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model profile to use. Defaults to a profile picked by the provider.

**`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Default model settings for this model instance.

##### from\_sglang

`@classmethod`

```python
def from_sglang(
    cls,
    base_url: str,
    api_key: str | None = None,
    model_name: str | None = None,
    provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines',
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
)
```

Create an Outlines model to send requests to an SGLang server.

###### Parameters

**`base_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The url of the SGLang server.

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The API key to use for authenticating requests to the SGLang server.

**`model_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The name of the model to use.

**`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['outlines'\] | `Provider`\[`OutlinesBaseModel`\] _Default:_ `'outlines'`

The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, the other parameters will be used.

**`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model profile to use. Defaults to a profile picked by the provider.

**`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Default model settings for this model instance.

##### from\_vllm\_offline

`@classmethod`

```python
def from_vllm_offline(
    cls,
    vllm_model: Any,
    provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines',
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
)
```

Create an Outlines model from a vLLM offline inference model.

###### Parameters

**`vllm_model`** : [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

The vllm.LLM local model to use.

**`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['outlines'\] | `Provider`\[`OutlinesBaseModel`\] _Default:_ `'outlines'`

The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, the other parameters will be used.

**`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model profile to use. Defaults to a profile picked by the provider.

**`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Default model settings for this model instance.

##### format\_inference\_kwargs

```python
def format_inference_kwargs(model_settings: ModelSettings | None) -> dict[str, Any]
```

Format the model settings for the inference kwargs.

###### Returns

[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

### OutlinesStreamedResponse

**Bases:** `StreamedResponse`

Implementation of `StreamedResponse` for Outlines models.

#### Attributes

##### model\_name

Get the model name of the response.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_name

Get the provider name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_url

Get the provider base URL.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### timestamp

Get the timestamp of the response.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime)

---

# [pydantic_ai.models.test](https://pydantic.dev/docs/ai/api/models/test/)

# pydantic\_ai.models.test

Utility model for quickly testing apps built with Pydantic AI.

Here's a minimal example:

test\_model\_usage.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

my_agent = Agent('openai:gpt-5.2', instructions='...')


async def test_my_agent():
    """Unit test for my_agent, to be run by pytest."""
    m = TestModel()
    with my_agent.override(model=m):
        result = await my_agent.run('Testing my agent...')
        assert result.output == 'success (no tool calls)'
    assert m.last_model_request_parameters.function_tools == []
```

See [Unit testing with `TestModel`](/docs/ai/guides/testing#unit-testing-with-testmodel) for detailed documentation.

### TestModel

**Bases:** `Model`

A model specifically for testing purposes.

This will (by default) call all tools in the agent, then return a tool response if possible, otherwise a plain response.

How useful this model is will vary significantly.

Apart from `__init__` derived by the `dataclass` decorator, all methods are private or match those of the base class.

#### Attributes

##### call\_tools

List of tools to call. If `'all'`, all tools will be called.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['all'\] **Default:** `call_tools`

##### custom\_output\_text

If set, this text is returned as the final output.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `custom_output_text`

##### custom\_output\_args

If set, these args will be passed to the output tool.

**Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `custom_output_args`

##### seed

Seed for generating random data.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `seed`

##### last\_model\_request\_parameters

The last ModelRequestParameters passed to the model in a request.

The ModelRequestParameters contains information about the function and output tools available during request handling.

This is set when a request is made, so will reflect the function tools from the last step of the last run.

**Type:** `ModelRequestParameters` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### model\_name

The model name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### system

The model provider.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    call_tools: list[str] | Literal['all'] = 'all',
    custom_output_text: str | None = None,
    custom_output_args: Any | None = None,
    seed: int = 0,
    model_name: str = 'test',
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
)
```

Initialize TestModel with optional settings and profile.

##### supported\_builtin\_tools

`@classmethod`

```python
def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]
```

TestModel supports all builtin tools for testing flexibility.

###### Returns

[`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\]

### TestStreamedResponse

**Bases:** `StreamedResponse`

A structured response that streams test data.

#### Attributes

##### model\_name

Get the model name of the response.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_name

Get the provider name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_url

Get the provider base URL.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### timestamp

Get the timestamp of the response.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime)

---

# [pydantic_ai.models.wrapper](https://pydantic.dev/docs/ai/api/models/wrapper/)

# pydantic\_ai.models.wrapper

### CompletedStreamedResponse

**Bases:** `StreamedResponse`

A `StreamedResponse` that wraps an already-completed `ModelResponse`.

Used by durable execution integrations (Temporal, Prefect, DBOS) where the actual stream is consumed within a task/activity and only the final response is returned.

### WrapperModel

**Bases:** `Model`

Model which wraps another model.

Does nothing on its own, used as a base class.

#### Attributes

##### wrapped

The underlying model being wrapped.

**Type:** `Model` **Default:** `infer_model(wrapped)`

##### settings

Get the settings from the wrapped model.

**Type:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None)

---

# [pydantic_ai.models.xai](https://pydantic.dev/docs/ai/api/models/xai/)

# pydantic\_ai.models.xai

## Setup

For details on how to set up authentication with this model, see [model configuration for xAI](/docs/ai/models/xai).

xAI model implementation using [xAI SDK](https://github.com/xai-org/xai-sdk-python).

### XaiModelSettings

**Bases:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings)

Settings specific to xAI models.

See [xAI SDK documentation](https://docs.x.ai/docs) for more details on these parameters.

#### Attributes

##### xai\_logprobs

Whether to return log probabilities of the output tokens or not.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### xai\_top\_logprobs

An integer between 0 and 20 specifying the number of most likely tokens to return at each position.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

##### xai\_user

A unique identifier representing your end-user, which can help xAI to monitor and detect abuse.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### xai\_store\_messages

Whether to store messages on xAI's servers for conversation continuity.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### xai\_previous\_response\_id

The ID of the previous response to continue the conversation.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### xai\_include\_encrypted\_content

Whether to include the encrypted content in the response.

Corresponds to the `use_encrypted_content` value of the model settings in the Responses API.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### xai\_include\_code\_execution\_output

Whether to include the code execution results in the response.

Corresponds to the `code_interpreter_call.outputs` value of the `include` parameter in the Responses API.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### xai\_include\_web\_search\_output

Whether to include the web search results in the response.

Corresponds to the `web_search_call.action.sources` value of the `include` parameter in the Responses API.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### xai\_include\_inline\_citations

Whether to include inline citations in the response.

Corresponds to the `inline_citations` option in the xAI `include` parameter.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### xai\_include\_mcp\_output

Whether to include the MCP results in the response.

Corresponds to the `mcp_call.outputs` value of the `include` parameter in the Responses API.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### xai\_include\_x\_search\_output

Whether to include the X search results in the response.

Corresponds to the `x_search_call.outputs` value of the `include` parameter in the Responses API.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### xai\_include\_collections\_search\_output

Whether to include the collections search results in the response.

Corresponds to the `collections_search_call.outputs` value of the `include` parameter in the Responses API.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### xai\_reasoning\_effort

Reasoning effort level for Grok reasoning models.

See [https://docs.x.ai](https://docs.x.ai) for details.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['low', 'high'\]

### XSearch

**Bases:** `BuiltinOrLocalTool[AgentDepsT]`

X (Twitter) search capability for xAI models.

Uses the xAI model's native x\_search builtin tool. Only works with xAI models.

#### Attributes

##### allowed\_x\_handles

If provided, only posts from these X handles will be included (max 10). Requires builtin support.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `allowed_x_handles`

##### excluded\_x\_handles

If provided, posts from these X handles will be excluded (max 10). Requires builtin support.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `excluded_x_handles`

##### from\_date

If provided, only posts created on or after this datetime will be included.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `from_date`

##### to\_date

If provided, only posts created on or before this datetime will be included.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `to_date`

##### enable\_image\_understanding

Enable image analysis from X posts. Defaults to `False`.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `enable_image_understanding`

##### enable\_video\_understanding

Enable video analysis from X content. Defaults to `False`.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `enable_video_understanding`

##### include\_output

Include raw X search results in the response as [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart). Defaults to `False`.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `include_output`

### XaiModel

**Bases:** `Model[AsyncClient]`

A model that uses the xAI SDK to interact with xAI models.

#### Attributes

##### model\_name

The model name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### system

The model provider.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model_name: XaiModelName,
    provider: Literal['xai'] | Provider[AsyncClient] = 'xai',
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
)
```

Initialize the xAI model.

###### Parameters

**`model_name`** : `XaiModelName`

The name of the xAI model to use (e.g., "grok-4-1-fast-non-reasoning")

**`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['xai'\] | `Provider`\[`AsyncClient`\] _Default:_ `'xai'`

The provider to use for API calls. Defaults to `'xai'`.

**`profile`** : `ModelProfileSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model profile specification. Defaults to a profile picked by the provider based on the model name.

**`settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model settings.

##### supported\_builtin\_tools

`@classmethod`

```python
def supported_builtin_tools(cls) -> frozenset[type]
```

Return the set of builtin tool types this model can handle.

###### Returns

[`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\]

##### request

`@async`

```python
def request(
    messages: list[ModelMessage],
    model_settings: ModelSettings | None,
    model_request_parameters: ModelRequestParameters,
) -> ModelResponse
```

Make a request to the xAI model.

###### Returns

[`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse)

##### request\_stream

`@async`

```python
def request_stream(
    messages: list[ModelMessage],
    model_settings: ModelSettings | None,
    model_request_parameters: ModelRequestParameters,
    run_context: RunContext[Any] | None = None,
) -> AsyncIterator[StreamedResponse]
```

Make a streaming request to the xAI model.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`StreamedResponse`\]

### XaiStreamedResponse

**Bases:** `StreamedResponse`

Implementation of `StreamedResponse` for xAI SDK.

#### Attributes

##### system

The model provider system name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_url

Get the provider base URL.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### model\_name

Get the model name of the response.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_name

The model provider.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### timestamp

Get the timestamp of the response.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime)

### XAI\_EFFORT\_MAP

Maps unified thinking values to xAI reasoning\_effort. xAI only supports 'low' and 'high'.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`ThinkingLevel`, [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['low', 'high'\]\] **Default:** `{True: 'high', 'minimal': 'low', 'low': 'low', 'medium': 'high', 'high': 'high', 'xhigh': 'high'}`

### XaiModelName

Possible xAI model names.

**Default:** `str | ChatModel`

---

# [pydantic_ai.ag_ui](https://pydantic.dev/docs/ai/api/pydantic-ai/ag_ui/)

# pydantic\_ai.ag\_ui

Provides an AG-UI protocol adapter for the Pydantic AI agent.

This package provides seamless integration between pydantic-ai agents and ag-ui for building interactive AI applications with streaming event-based communication.

### AGUIApp

**Bases:** `Generic[AgentDepsT, OutputDataT]`, `Starlette`

ASGI application for running Pydantic AI agents with AG-UI protocol support.

#### Methods

##### \_\_init\_\_

```python
def __init__(
    agent: AbstractAgent[AgentDepsT, OutputDataT],
    ag_ui_version: str = DEFAULT_AG_UI_VERSION,
    preserve_file_data: bool = False,
    output_type: OutputSpec[Any] | None = None,
    message_history: Sequence[ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: Model | KnownModelName | str | None = None,
    deps: AgentDepsT = None,
    model_settings: ModelSettings | None = None,
    usage_limits: UsageLimits | None = None,
    usage: RunUsage | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AbstractBuiltinTool] | None = None,
    on_complete: OnCompleteFunc[Any] | None = None,
    debug: bool = False,
    routes: Sequence[BaseRoute] | None = None,
    middleware: Sequence[Middleware] | None = None,
    exception_handlers: Mapping[Any, ExceptionHandler] | None = None,
    on_startup: Sequence[Callable[[], Any]] | None = None,
    on_shutdown: Sequence[Callable[[], Any]] | None = None,
    lifespan: Lifespan[Self] | None = None,
) -> None
```

An ASGI application that handles every request by running the agent and streaming the response.

Note that the `deps` will be the same for each request, with the exception of the frontend state that's injected into the `state` field of a `deps` object that implements the [`StateHandler`](/docs/ai/api/ui/base/#pydantic_ai.ui.StateHandler) protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use `AGUIAdapter.run_stream()` or `AGUIAdapter.dispatch_request()` instead.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`agent`** : `AbstractAgent`\[`AgentDepsT`, `OutputDataT`\]

The agent to run.

**`ag_ui_version`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `DEFAULT_AG_UI_VERSION`

AG-UI protocol version controlling thinking/reasoning event format.

**`preserve_file_data`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether to preserve agent-generated files and uploaded files in AG-UI message conversion. See [`AGUIAdapter.preserve_file_data`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.AGUIAdapter.preserve_file_data).

**`output_type`** : `OutputSpec`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : `Model` | `KnownModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : [`UsageLimits`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.UsageLimits) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`AbstractBuiltinTool`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`on_complete`** : `OnCompleteFunc`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can access `all_messages()` and other result data.

**`debug`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Boolean indicating if debug tracebacks should be returned on errors.

**`routes`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`BaseRoute`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A list of routes to serve incoming HTTP and WebSocket requests.

**`middleware`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`Middleware`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. `ServerErrorMiddleware` is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. `ExceptionMiddleware` is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints.

**`exception_handlers`** : [`Mapping`](https://docs.python.org/3/library/typing.html#typing.Mapping)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any), `ExceptionHandler`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form `handler(request, exc) -> response` and may be either standard functions, or async functions.

**`on_startup`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[\], [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions.

**`on_shutdown`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[\], [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions.

**`lifespan`** : `Lifespan`\[[`Self`](https://docs.python.org/3/library/typing.html#typing.Self)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the `on_startup` and `on_shutdown` handlers. Use one or the other, not both.

### StateHandler

**Bases:** [`Protocol`](https://docs.python.org/3/library/typing.html#typing.Protocol)

Protocol for state handlers in agent runs. Requires the class to be a dataclass with a `state` field.

#### Attributes

##### state

Get the current state of the agent run.

**Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

### StateDeps

**Bases:** `Generic[StateT]`

Dependency type that holds state.

This class is used to manage the state of an agent run. It allows setting the state of the agent run with a specific type of state model, which must be a subclass of `BaseModel`.

The state is set using the `state` setter by the `Adapter` when the run starts.

Implements the `StateHandler` protocol.

### handle\_ag\_ui\_request

`@async`

```python
def handle_ag_ui_request(
    agent: AbstractAgent[AgentDepsT, Any],
    request: Request,
    ag_ui_version: str = DEFAULT_AG_UI_VERSION,
    preserve_file_data: bool = False,
    output_type: OutputSpec[Any] | None = None,
    message_history: Sequence[ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: Model | KnownModelName | str | None = None,
    deps: AgentDepsT = None,
    model_settings: ModelSettings | None = None,
    usage_limits: UsageLimits | None = None,
    usage: RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    on_complete: OnCompleteFunc[BaseEvent] | None = None,
    manage_system_prompt: Literal['server', 'client'] = 'server',
    allowed_file_url_schemes: frozenset[str] = frozenset({'http', 'https'}),
) -> Response
```

Handle an AG-UI request by running the agent and returning a streaming response.

#### Returns

`Response` -- A streaming Starlette response with AG-UI protocol events.

#### Parameters

**`agent`** : `AbstractAgent`\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

The agent to run.

**`request`** : `Request`

The Starlette request (e.g. from FastAPI) containing the AG-UI run input.

**`ag_ui_version`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `DEFAULT_AG_UI_VERSION`

AG-UI protocol version controlling thinking/reasoning event format.

**`preserve_file_data`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether to preserve agent-generated files and uploaded files in AG-UI message conversion. See [`AGUIAdapter.preserve_file_data`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.AGUIAdapter.preserve_file_data).

**`output_type`** : `OutputSpec`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : `Model` | `KnownModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : [`UsageLimits`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.UsageLimits) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`on_complete`** : `OnCompleteFunc`\[`BaseEvent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can access `all_messages()` and other result data.

**`manage_system_prompt`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['server', 'client'\] _Default:_ `'server'`

Who owns the system prompt. See [`UIAdapter.manage_system_prompt`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.manage_system_prompt).

**`allowed_file_url_schemes`** : [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] _Default:_ `frozenset({'http', 'https'})`

URL schemes allowed for file URL parts from the client. See [`UIAdapter.allowed_file_url_schemes`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.allowed_file_url_schemes).

### run\_ag\_ui

```python
def run_ag_ui(
    agent: AbstractAgent[AgentDepsT, Any],
    run_input: RunAgentInput,
    accept: str = SSE_CONTENT_TYPE,
    ag_ui_version: str = DEFAULT_AG_UI_VERSION,
    preserve_file_data: bool = False,
    output_type: OutputSpec[Any] | None = None,
    message_history: Sequence[ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: Model | KnownModelName | str | None = None,
    deps: AgentDepsT = None,
    model_settings: ModelSettings | None = None,
    usage_limits: UsageLimits | None = None,
    usage: RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    on_complete: OnCompleteFunc[BaseEvent] | None = None,
    manage_system_prompt: Literal['server', 'client'] = 'server',
    allowed_file_url_schemes: frozenset[str] = frozenset({'http', 'https'}),
) -> AsyncIterator[str]
```

Run the agent with the AG-UI run input and stream AG-UI protocol events.

#### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

#### Parameters

**`agent`** : `AbstractAgent`\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

The agent to run.

**`run_input`** : `RunAgentInput`

The AG-UI run input containing thread\_id, run\_id, messages, etc.

**`accept`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `SSE_CONTENT_TYPE`

The accept header value for the run.

**`ag_ui_version`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `DEFAULT_AG_UI_VERSION`

AG-UI protocol version controlling thinking/reasoning event format.

**`preserve_file_data`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether to preserve agent-generated files and uploaded files in AG-UI message conversion. See [`AGUIAdapter.preserve_file_data`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.AGUIAdapter.preserve_file_data).

**`output_type`** : `OutputSpec`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : `Model` | `KnownModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : [`UsageLimits`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.UsageLimits) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`on_complete`** : `OnCompleteFunc`\[`BaseEvent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can access `all_messages()` and other result data.

**`manage_system_prompt`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['server', 'client'\] _Default:_ `'server'`

Who owns the system prompt. See [`UIAdapter.manage_system_prompt`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.manage_system_prompt).

**`allowed_file_url_schemes`** : [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] _Default:_ `frozenset({'http', 'https'})`

URL schemes allowed for file URL parts from the client. See [`UIAdapter.allowed_file_url_schemes`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.allowed_file_url_schemes).

### SSE\_CONTENT\_TYPE

Content type header value for Server-Sent Events (SSE).

**Default:** `'text/event-stream'`

### OnCompleteFunc

Callback function type that receives the `AgentRunResult` of the completed run. Can be sync, async, or an async generator of protocol-specific events.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[AgentRunResult[Any]], None] | Callable[[AgentRunResult[Any]], Awaitable[None]] | Callable[[AgentRunResult[Any]], AsyncIterator[EventT]]`

---

# [pydantic_ai.agent](https://pydantic.dev/docs/ai/api/pydantic-ai/agent/)

# pydantic\_ai.agent

### Agent

**Bases:** `AbstractAgent[AgentDepsT, OutputDataT]`

Class for defining "agents" - a way to have a specific type of "conversation" with an LLM.

Agents are generic in the dependency type they take [`AgentDepsT`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentDepsT) and the output type they return, [`OutputDataT`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.OutputDataT).

By default, if neither generic parameter is customised, agents have type `Agent[None, str]`.

Minimal usage example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

#### Attributes

##### end\_strategy

The strategy for handling multiple tool calls when a final result is found.

-   `'early'` (default): Output tools are executed first. Once a valid final result is found, remaining function and output tool calls are skipped
-   `'graceful'`: Output tools are executed first. Once a valid final result is found, remaining output tool calls are skipped, but function tools are still executed
-   `'exhaustive'`: Output tools are executed first, then all function tools are executed. The first valid output tool result becomes the final output

**Type:** [`EndStrategy`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.EndStrategy) **Default:** `end_strategy`

##### model\_settings

Optional model request settings to use for this agent's runs, by default.

Can be a static `ModelSettings` dict or a callable that takes a [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns `ModelSettings`. Callables are called before each model request, allowing dynamic per-step settings.

Note, if `model_settings` is also provided at run time, those settings will be merged on top of the agent-level settings, with the run-level argument taking priority.

**Type:** `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `model_settings`

##### instrument

Options to automatically instrument with OpenTelemetry.

**Type:** [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) | [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `instrument`

##### model

The default model configured for this agent.

**Type:** [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### name

The name of the agent, used for logging.

If `None`, we try to infer the agent name from the call frame when the agent is first run.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### description

A human-readable description of the agent.

If the description is a TemplateStr, returns the raw template source. The rendered description is available at runtime via OTel span attributes.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### deps\_type

The type of dependencies used by the agent.

**Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)

##### output\_type

The type of data output by agent runs, used to validate the data returned by the model, defaults to `str`.

**Type:** `OutputSpec`\[`OutputDataT`\]

##### event\_stream\_handler

Optional handler for events from the model's streaming response and the agent's execution of tools.

**Type:** `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### root\_capability

The root capability of the agent, containing all registered capabilities.

**Type:** `CombinedCapability`\[`AgentDepsT`\]

##### toolsets

All toolsets registered on the agent, including a function toolset holding tools that were registered on the agent directly.

Output tools are not included.

**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\]

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model: models.Model | models.KnownModelName | str | None = None,
    output_type: OutputSpec[OutputDataT] = str,
    instructions: AgentInstructions[AgentDepsT] = None,
    system_prompt: str | Sequence[str] = (),
    deps_type: type[AgentDepsT] = NoneType,
    name: str | None = None,
    description: TemplateStr[AgentDepsT] | str | None = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    retries: int | None = None,
    tool_retries: int | None = None,
    validation_context: Any | Callable[[RunContext[AgentDepsT]], Any] = None,
    output_retries: int | None = None,
    tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = (),
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] = (),
    prepare_tools: ToolsPrepareFunc[AgentDepsT] | None = None,
    prepare_output_tools: ToolsPrepareFunc[AgentDepsT] | None = None,
    toolsets: Sequence[AgentToolset[AgentDepsT]] | None = None,
    defer_model_check: bool = False,
    end_strategy: EndStrategy = 'early',
    instrument: InstrumentationSettings | bool | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    history_processors: Sequence[HistoryProcessor[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    tool_timeout: float | None = None,
    max_concurrency: _concurrency.AnyConcurrencyLimit = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
) -> None
def __init__(
    model: models.Model | models.KnownModelName | str | None = None,
    output_type: OutputSpec[OutputDataT] = str,
    instructions: AgentInstructions[AgentDepsT] = None,
    system_prompt: str | Sequence[str] = (),
    deps_type: type[AgentDepsT] = NoneType,
    name: str | None = None,
    description: TemplateStr[AgentDepsT] | str | None = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    retries: int | None = None,
    tool_retries: int | None = None,
    validation_context: Any | Callable[[RunContext[AgentDepsT]], Any] = None,
    output_retries: int | None = None,
    tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = (),
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] = (),
    prepare_tools: ToolsPrepareFunc[AgentDepsT] | None = None,
    prepare_output_tools: ToolsPrepareFunc[AgentDepsT] | None = None,
    mcp_servers: Sequence[MCPServer] = (),
    defer_model_check: bool = False,
    end_strategy: EndStrategy = 'early',
    instrument: InstrumentationSettings | bool | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    history_processors: Sequence[HistoryProcessor[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    tool_timeout: float | None = None,
    max_concurrency: _concurrency.AnyConcurrencyLimit = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
) -> None
```

Create an agent.

###### Parameters

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The default model to use for this agent, if not provided, you must provide the model when calling it. We allow `str` here since the actual list of allowed models changes frequently.

**`output_type`** : `OutputSpec`\[`OutputDataT`\] _Default:_ `str`

The type of the output data, used to validate the data returned by the model, defaults to `str`.

**`instructions`** : `AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Instructions to use for this agent, you can also register instructions via a function with [`instructions`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.instructions) or pass additional, temporary, instructions when executing a run.

**`system_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] _Default:_ `()`

Static system prompts to use for this agent, you can also register system prompts via a function with [`system_prompt`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.system_prompt).

**`deps_type`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`AgentDepsT`\] _Default:_ `NoneType`

The type used for dependency injection, this parameter exists solely to allow you to fully parameterize the agent, and therefore get the best out of static type checking. If you're not using deps, but want type checking to pass, you can set `deps=None` to satisfy Pyright or add a type hint `: Agent[None, <return type>]`.

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The name of the agent, used for logging. If `None`, we try to infer the agent name from the call frame when the agent is first run.

**`description`** : `TemplateStr`\[`AgentDepsT`\] | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A human-readable description of the agent, attached to the agent run span as `gen_ai.agent.description` when instrumentation is enabled.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model request settings to use for this agent's runs, by default. Can be a static `ModelSettings` dict or a callable that takes a [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns `ModelSettings`. Callables are called before each model request, allowing dynamic per-step settings.

**`retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Deprecated alias for `tool_retries`. In 1.x this also still cascades to `output_retries` (when `output_retries` is unset) for backward compatibility, with a `DeprecationWarning`. In v2 it will be removed and the cascade will go away -- pass `output_retries` explicitly if you depend on the cascade. For model request retries, see the [HTTP Request Retries](/docs/ai/advanced-features/retries) documentation.

**`tool_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The default number of retries to allow for tool calls before raising an error. Defaults to 1.

**`validation_context`** : [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)\[`AgentDepsT`\]\], [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] _Default:_ `None`

Pydantic [validation context](https://docs.pydantic.dev/latest/concepts/validators/#validation-context) used to validate tool arguments and outputs.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Maximum number of retries for output validation. Defaults to 1. On the text path this is a global budget shared across all output-validation retries in a run; on the tool path this is the default per-tool `max_retries` for each output tool, overridable via [`ToolOutput(max_retries=...)`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.ToolOutput.max_retries). Can also be overridden per run via `agent.run(output_retries=...)` (and friends).

**`tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[`AgentDepsT`\] | `ToolFuncEither`\[`AgentDepsT`, ...\]\] _Default:_ `()`

Tools to register with the agent, you can also register tools via the decorators [`@agent.tool`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool) and [`@agent.tool_plain`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool_plain).

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] _Default:_ `()`

The builtin tools that the agent will use. This depends on the model, as some models may not support certain tools. If the model doesn't support the builtin tools, an error will be raised.

**`prepare_tools`** : `ToolsPrepareFunc`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom function to prepare the tool definition of all tools for each step, except output tools. This is useful if you want to customize the definition of multiple tools or you want to register a subset of tools for a given step. See [`ToolsPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolsPrepareFunc)

**`prepare_output_tools`** : `ToolsPrepareFunc`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom function to prepare the tool definition of all output tools for each step. This is useful if you want to customize the definition of multiple output tools or you want to register a subset of output tools for a given step. See [`ToolsPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolsPrepareFunc)

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`AgentToolset`\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Toolsets to register with the agent, including MCP servers and functions which take a run context and return a toolset. See [`ToolsetFunc`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.ToolsetFunc) for more information.

**`defer_model_check`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

by default, if you provide a [named](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) model, it's evaluated to create a [`Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) instance immediately, which checks for the necessary environment variables. Set this to `false` to defer the evaluation until the first run. Useful if you want to [override the model](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.override) for testing.

**`end_strategy`** : [`EndStrategy`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.EndStrategy) _Default:_ `'early'`

Strategy for handling tool calls that are requested alongside a final result. See [`EndStrategy`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.EndStrategy) for more information.

**`instrument`** : [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) | [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Set to True to automatically instrument with OpenTelemetry, which will use Logfire if it's configured. Set to an instance of [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) to customize. If this isn't set, then the last value set by [`Agent.instrument_all()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.instrument_all) will be used, which defaults to False. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to store with each run. Provide a dictionary of primitives, or a callable returning one computed from the [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) on each run. Metadata is resolved when a run starts and recomputed after a successful run finishes so it can reflect the final state. Resolved metadata can be read after the run completes via [`AgentRun.metadata`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun), [`AgentRunResult.metadata`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult), and [`StreamedRunResult.metadata`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult), and is attached to the agent run span when instrumentation is enabled.

**`history_processors`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`HistoryProcessor`\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional list of callables to process the message history before sending it to the model. Each processor takes a list of messages and returns a modified list of messages. Processors can be sync or async and are applied in sequence.

**`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional handler for events from the model's streaming response and the agent's execution of tools.

**`tool_timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Default timeout in seconds for tool execution. If a tool takes longer than this, the tool is considered to have failed and a retry prompt is returned to the model (counting towards the retry limit). Individual tools can override this with their own timeout. Defaults to None (no timeout).

**`max_concurrency`** : `_concurrency.AnyConcurrencyLimit` _Default:_ `None`

Optional limit on concurrent agent runs. Can be an integer for simple limiting, a [`ConcurrencyLimit`](/docs/ai/api/pydantic-ai/concurrency/#pydantic_ai.ConcurrencyLimit) for advanced configuration with backpressure, a [`ConcurrencyLimiter`](/docs/ai/api/pydantic-ai/concurrency/#pydantic_ai.ConcurrencyLimiter) for sharing limits across multiple agents, or None (default) for no limiting. When the limit is reached, additional calls to `run()` or `iter()` will wait until a slot becomes available.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional list of [capabilities](https://ai.pydantic.dev/capabilities/) to configure the agent with, including functions which take a run context and return a capability. See [`CapabilityFunc`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.CapabilityFunc) for more information. Custom capabilities can be created by subclassing [`AbstractCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability).

##### from\_spec

`@classmethod`

```python
def from_spec(
    cls,
    spec: dict[str, Any] | AgentSpec,
    custom_capability_types: Sequence[type[AbstractCapability[Any]]] = (),
    model: models.Model | models.KnownModelName | str | None = None,
    output_type: OutputSpec[Any] = str,
    instructions: AgentInstructions[Any] = None,
    system_prompt: str | Sequence[str] = (),
    name: str | None = None,
    description: TemplateStr[Any] | str | None = None,
    model_settings: ModelSettings | None = None,
    retries: int | None = None,
    validation_context: Any = None,
    output_retries: int | None = None,
    tools: Sequence[Tool[Any] | ToolFuncEither[Any, ...]] = (),
    builtin_tools: Sequence[AgentBuiltinTool[Any]] = (),
    prepare_tools: ToolsPrepareFunc[Any] | None = None,
    prepare_output_tools: ToolsPrepareFunc[Any] | None = None,
    toolsets: Sequence[AgentToolset[Any]] | None = None,
    defer_model_check: bool = False,
    end_strategy: EndStrategy | None = None,
    instrument: InstrumentationSettings | bool | None = None,
    metadata: AgentMetadata[Any] | None = None,
    history_processors: Sequence[HistoryProcessor[Any]] | None = None,
    event_stream_handler: EventStreamHandler[Any] | None = None,
    tool_timeout: float | None = None,
    max_concurrency: _concurrency.AnyConcurrencyLimit = None,
    capabilities: Sequence[AgentCapability[Any]] | None = None,
) -> Agent[None, str]
def from_spec(
    cls,
    spec: dict[str, Any] | AgentSpec,
    deps_type: type[T],
    custom_capability_types: Sequence[type[AbstractCapability[Any]]] = (),
    model: models.Model | models.KnownModelName | str | None = None,
    output_type: OutputSpec[Any] = str,
    instructions: AgentInstructions[Any] = None,
    system_prompt: str | Sequence[str] = (),
    name: str | None = None,
    description: TemplateStr[Any] | str | None = None,
    model_settings: ModelSettings | None = None,
    retries: int | None = None,
    validation_context: Any = None,
    output_retries: int | None = None,
    tools: Sequence[Tool[Any] | ToolFuncEither[Any, ...]] = (),
    builtin_tools: Sequence[AgentBuiltinTool[Any]] = (),
    prepare_tools: ToolsPrepareFunc[Any] | None = None,
    prepare_output_tools: ToolsPrepareFunc[Any] | None = None,
    toolsets: Sequence[AgentToolset[Any]] | None = None,
    defer_model_check: bool = False,
    end_strategy: EndStrategy | None = None,
    instrument: InstrumentationSettings | bool | None = None,
    metadata: AgentMetadata[Any] | None = None,
    history_processors: Sequence[HistoryProcessor[Any]] | None = None,
    event_stream_handler: EventStreamHandler[Any] | None = None,
    tool_timeout: float | None = None,
    max_concurrency: _concurrency.AnyConcurrencyLimit = None,
    capabilities: Sequence[AgentCapability[Any]] | None = None,
) -> Agent[T, str]
```

Construct an Agent from a spec dict or `AgentSpec`.

This allows defining agents declaratively in YAML/JSON/dict form. Keyword arguments supplement the spec: scalar spec fields (like `name`, `retries`) are used as defaults that explicit arguments override, while `capabilities` from both sources are merged.

###### Returns

[`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- A new Agent instance.

###### Parameters

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec`

The agent specification, either a dict or an `AgentSpec` instance.

**`deps_type`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] _Default:_ `type(None)`

The type of the dependencies for the agent. When provided, template strings in capabilities (e.g. `"Hello {{name}}"`) are compiled and validated against this type.

**`custom_capability_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractCapability`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]\] _Default:_ `()`

Additional capability classes to make available beyond the built-in defaults.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the model from the spec.

**`output_type`** : `OutputSpec`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] _Default:_ `str`

The type of the output data, defaults to `str`.

**`instructions`** : `AgentInstructions`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] _Default:_ `None`

Instructions for the agent.

**`system_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] _Default:_ `()`

Static system prompts.

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The agent name, overrides spec `name` if provided.

**`description`** : `TemplateStr`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The agent description, overrides spec `description` if provided.

**`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Model request settings.

**`retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Default retries for tool calls and output validation, overrides spec `retries` if provided.

**`validation_context`** : [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) _Default:_ `None`

Pydantic validation context for tool arguments and outputs.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Max retries for output validation, overrides spec `output_retries` if provided.

**`tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `ToolFuncEither`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any), ...\]\] _Default:_ `()`

Tools to register with the agent.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] _Default:_ `()`

Builtin tools for the agent.

**`prepare_tools`** : `ToolsPrepareFunc`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom function to prepare tool definitions.

**`prepare_output_tools`** : `ToolsPrepareFunc`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom function to prepare output tool definitions.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`AgentToolset`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Toolsets to register with the agent.

**`defer_model_check`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Defer model evaluation until first run.

**`end_strategy`** : [`EndStrategy`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.EndStrategy) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Strategy for tool calls alongside a final result, overrides spec `end_strategy` if provided.

**`instrument`** : [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) | [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Instrumentation settings, overrides spec `instrument` if provided.

**`metadata`** : `AgentMetadata`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Metadata to store with each run, overrides spec `metadata` if provided.

**`history_processors`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`HistoryProcessor`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Processors for message history.

**`event_stream_handler`** : `EventStreamHandler`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Handler for streaming events.

**`tool_timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Default timeout for tool execution, overrides spec `tool_timeout` if provided.

**`max_concurrency`** : `_concurrency.AnyConcurrencyLimit` _Default:_ `None`

Limit on concurrent agent runs.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Additional capabilities merged with those from the spec.

##### from\_file

`@classmethod`

```python
def from_file(
    cls,
    path: Path | str,
    fmt: Literal['yaml', 'json'] | None = None,
    custom_capability_types: Sequence[type[AbstractCapability[Any]]] = (),
    model: models.Model | models.KnownModelName | str | None = None,
    output_type: OutputSpec[Any] = str,
    instructions: AgentInstructions[Any] = None,
    system_prompt: str | Sequence[str] = (),
    name: str | None = None,
    description: TemplateStr[Any] | str | None = None,
    model_settings: ModelSettings | None = None,
    retries: int | None = None,
    validation_context: Any = None,
    output_retries: int | None = None,
    tools: Sequence[Tool[Any] | ToolFuncEither[Any, ...]] = (),
    builtin_tools: Sequence[AgentBuiltinTool[Any]] = (),
    prepare_tools: ToolsPrepareFunc[Any] | None = None,
    prepare_output_tools: ToolsPrepareFunc[Any] | None = None,
    toolsets: Sequence[AgentToolset[Any]] | None = None,
    defer_model_check: bool = False,
    end_strategy: EndStrategy | None = None,
    instrument: InstrumentationSettings | bool | None = None,
    metadata: AgentMetadata[Any] | None = None,
    history_processors: Sequence[HistoryProcessor[Any]] | None = None,
    event_stream_handler: EventStreamHandler[Any] | None = None,
    tool_timeout: float | None = None,
    max_concurrency: _concurrency.AnyConcurrencyLimit = None,
    capabilities: Sequence[AgentCapability[Any]] | None = None,
) -> Agent[None, str]
def from_file(
    cls,
    path: Path | str,
    fmt: Literal['yaml', 'json'] | None = None,
    deps_type: type[T],
    custom_capability_types: Sequence[type[AbstractCapability[Any]]] = (),
    model: models.Model | models.KnownModelName | str | None = None,
    output_type: OutputSpec[Any] = str,
    instructions: AgentInstructions[Any] = None,
    system_prompt: str | Sequence[str] = (),
    name: str | None = None,
    description: TemplateStr[Any] | str | None = None,
    model_settings: ModelSettings | None = None,
    retries: int | None = None,
    validation_context: Any = None,
    output_retries: int | None = None,
    tools: Sequence[Tool[Any] | ToolFuncEither[Any, ...]] = (),
    builtin_tools: Sequence[AgentBuiltinTool[Any]] = (),
    prepare_tools: ToolsPrepareFunc[Any] | None = None,
    prepare_output_tools: ToolsPrepareFunc[Any] | None = None,
    toolsets: Sequence[AgentToolset[Any]] | None = None,
    defer_model_check: bool = False,
    end_strategy: EndStrategy | None = None,
    instrument: InstrumentationSettings | bool | None = None,
    metadata: AgentMetadata[Any] | None = None,
    history_processors: Sequence[HistoryProcessor[Any]] | None = None,
    event_stream_handler: EventStreamHandler[Any] | None = None,
    tool_timeout: float | None = None,
    max_concurrency: _concurrency.AnyConcurrencyLimit = None,
    capabilities: Sequence[AgentCapability[Any]] | None = None,
) -> Agent[T, str]
```

Construct an Agent from a YAML or JSON spec file.

This is a convenience method equivalent to `Agent.from_spec(AgentSpec.from_file(path), ...)`.

The file format is inferred from the extension (`.yaml`/`.yml` or `.json`) unless overridden with the `fmt` argument.

All other arguments are forwarded to [`from_spec`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.from_spec).

###### Returns

[`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

##### instrument\_all

`@staticmethod`

```python
def instrument_all(instrument: InstrumentationSettings | bool = True) -> None
```

Set the instrumentation options for all agents where `instrument` is not set.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

##### iter

`@async`

```python
def iter(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]
def iter(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]
```

A contextmanager which can be used to iterate over the agent graph's nodes as they are executed.

This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools.

The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed.

For more details, see the documentation of `AgentRun`.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

async def main():
    nodes = []
    async with agent.iter('What is the capital of France?') as agent_run:
        async for node in agent_run:
            nodes.append(node)
    print(nodes)
    '''
    [
        UserPromptNode(
            user_prompt='What is the capital of France?',
            instructions_functions=[],
            system_prompts=(),
            system_prompt_functions=[],
            system_prompt_dynamic_functions={},
        ),
        ModelRequestNode(
            request=ModelRequest(
                parts=[
                    UserPromptPart(
                        content='What is the capital of France?',
                        timestamp=datetime.datetime(...),
                    )
                ],
                timestamp=datetime.datetime(...),
                run_id='...',
                conversation_id='...',
            )
        ),
        CallToolsNode(
            model_response=ModelResponse(
                parts=[TextPart(content='The capital of France is Paris.')],
                usage=RequestUsage(input_tokens=56, output_tokens=7),
                model_name='gpt-5.2',
                timestamp=datetime.datetime(...),
                run_id='...',
                conversation_id='...',
            )
        ),
        End(data=FinalResult(output='The capital of France is Paris.')),
    ]
    '''
    print(agent_run.result.output)
    #> The capital of France is Paris.
```

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`AgentRun`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun)\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- The result of the run.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request, or a callable that receives [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns settings. Callables are called before each model request, allowing dynamic per-step settings.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run. At run time, spec values are additive.

##### override

```python
def override(
    name: str | _utils.Unset = _utils.UNSET,
    deps: AgentDepsT | _utils.Unset = _utils.UNSET,
    model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET,
    tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | _utils.Unset = _utils.UNSET,
    instructions: AgentInstructions[AgentDepsT] | _utils.Unset = _utils.UNSET,
    metadata: AgentMetadata[AgentDepsT] | _utils.Unset = _utils.UNSET,
    model_settings: AgentModelSettings[AgentDepsT] | _utils.Unset = _utils.UNSET,
    output_retries: int | _utils.Unset = _utils.UNSET,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> Iterator[None]
```

Context manager to temporarily override agent configuration.

This is particularly useful when testing. You can find an example of this [here](/docs/ai/guides/testing#overriding-model-via-pytest-fixtures).

###### Returns

[`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]

###### Parameters

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET`

The name to use instead of the name passed to the agent constructor and agent run.

**`deps`** : `AgentDepsT` | `_utils.Unset` _Default:_ `_utils.UNSET`

The dependencies to use instead of the dependencies passed to the agent run.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET`

The model to use instead of the model passed to the agent run.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The toolsets to use instead of the toolsets passed to the agent constructor and agent run.

**`tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[`AgentDepsT`\] | `ToolFuncEither`\[`AgentDepsT`, ...\]\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The tools to use instead of the tools registered with the agent.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The builtin tools to use instead of the agent's configured builtin tools. Per-run `builtin_tools` are still added to these.

**`instructions`** : `AgentInstructions`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The instructions to use instead of the instructions registered with the agent. Note: this also replaces capability-contributed instructions (e.g. from [`get_instructions`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_instructions)).

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The metadata to use instead of the metadata passed to the agent constructor. When set, any per-run `metadata` argument is ignored.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The model settings to use instead of the model settings passed to the agent constructor. When set, any per-run `model_settings` argument is ignored.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | `_utils.Unset` _Default:_ `_utils.UNSET`

The output-retry budget to use instead of the agent-level `output_retries`. When set, any per-run `output_retries` argument is ignored.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec providing defaults for override. Explicit params take precedence over spec values. When the spec includes `capabilities`, they replace (not merge with) the agent's existing capabilities. To add capabilities without replacing, pass `spec` to `run()` or `iter()` instead.

##### instructions

```python
def instructions(
    func: Callable[[RunContext[AgentDepsT]], str | None],
) -> Callable[[RunContext[AgentDepsT]], str | None]
def instructions(
    func: Callable[[RunContext[AgentDepsT]], Awaitable[str | None]],
) -> Callable[[RunContext[AgentDepsT]], Awaitable[str | None]]
def instructions(func: Callable[[], str | None]) -> Callable[[], str | None]
def instructions(
    func: Callable[[], Awaitable[str | None]],
) -> Callable[[], Awaitable[str | None]]
def instructions(

) -> Callable[[_system_prompt.SystemPromptFunc[AgentDepsT]], _system_prompt.SystemPromptFunc[AgentDepsT]]
```

Decorator to register an instructions function.

Optionally takes [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as its only argument. Can decorate a sync or async functions.

The decorator can be used bare (`agent.instructions`).

Overloads for every possible signature of `instructions` are included so the decorator doesn't obscure the type of the function.

Example:

```python
from pydantic_ai import Agent, RunContext

agent = Agent('test', deps_type=str)

@agent.instructions
def simple_instructions() -> str:
    return 'foobar'

@agent.instructions
async def async_instructions(ctx: RunContext[str]) -> str:
    return f'{ctx.deps} is the best'
```

###### Returns

[`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`_system_prompt.SystemPromptFunc`\[`AgentDepsT`\]\], `_system_prompt.SystemPromptFunc`\[`AgentDepsT`\]\] | `_system_prompt.SystemPromptFunc`\[`AgentDepsT`\]

##### system\_prompt\_parts

`@async`

```python
def system_prompt_parts(
    deps: AgentDepsT = None,
    model: models.Model | models.KnownModelName | str | None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    prompt: str | Sequence[_messages.UserContent] | None = None,
    usage: _usage.RunUsage | None = None,
    model_settings: ModelSettings | None = None,
) -> list[_messages.SystemPromptPart]
```

Resolve the agent's configured system prompts into `SystemPromptPart`s.

See [`AbstractAgent.system_prompt_parts`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.system_prompt_parts).

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.SystemPromptPart`\]

##### system\_prompt

```python
def system_prompt(
    func: Callable[[RunContext[AgentDepsT]], str | None],
) -> Callable[[RunContext[AgentDepsT]], str | None]
def system_prompt(
    func: Callable[[RunContext[AgentDepsT]], Awaitable[str | None]],
) -> Callable[[RunContext[AgentDepsT]], Awaitable[str | None]]
def system_prompt(func: Callable[[], str | None]) -> Callable[[], str | None]
def system_prompt(
    func: Callable[[], Awaitable[str | None]],
) -> Callable[[], Awaitable[str | None]]
def system_prompt(
    dynamic: bool = False,
) -> Callable[[_system_prompt.SystemPromptFunc[AgentDepsT]], _system_prompt.SystemPromptFunc[AgentDepsT]]
```

Decorator to register a system prompt function.

Optionally takes [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as its only argument. Can decorate a sync or async functions.

The decorator can be used either bare (`agent.system_prompt`) or as a function call (`agent.system_prompt(...)`), see the examples below.

Overloads for every possible signature of `system_prompt` are included so the decorator doesn't obscure the type of the function, see `tests/typed_agent.py` for tests.

Example:

```python
from pydantic_ai import Agent, RunContext

agent = Agent('test', deps_type=str)

@agent.system_prompt
def simple_system_prompt() -> str:
    return 'foobar'

@agent.system_prompt(dynamic=True)
async def async_system_prompt(ctx: RunContext[str]) -> str:
    return f'{ctx.deps} is the best'
```

###### Returns

[`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`_system_prompt.SystemPromptFunc`\[`AgentDepsT`\]\], `_system_prompt.SystemPromptFunc`\[`AgentDepsT`\]\] | `_system_prompt.SystemPromptFunc`\[`AgentDepsT`\]

###### Parameters

**`func`** : `_system_prompt.SystemPromptFunc`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The function to decorate

**`dynamic`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

If True, the system prompt will be reevaluated even when `messages_history` is provided, see [`SystemPromptPart.dynamic_ref`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.SystemPromptPart.dynamic_ref)

##### output\_validator

```python
def output_validator(
    func: Callable[[RunContext[AgentDepsT], OutputDataT], OutputDataT],
) -> Callable[[RunContext[AgentDepsT], OutputDataT], OutputDataT]
def output_validator(
    func: Callable[[RunContext[AgentDepsT], OutputDataT], Awaitable[OutputDataT]],
) -> Callable[[RunContext[AgentDepsT], OutputDataT], Awaitable[OutputDataT]]
def output_validator(
    func: Callable[[OutputDataT], OutputDataT],
) -> Callable[[OutputDataT], OutputDataT]
def output_validator(
    func: Callable[[OutputDataT], Awaitable[OutputDataT]],
) -> Callable[[OutputDataT], Awaitable[OutputDataT]]
```

Decorator to register an output validator function.

Optionally takes [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as its first argument. Can decorate a sync or async functions.

Overloads for every possible signature of `output_validator` are included so the decorator doesn't obscure the type of the function, see `tests/typed_agent.py` for tests.

Example:

```python
from pydantic_ai import Agent, ModelRetry, RunContext

agent = Agent('test', deps_type=str)

@agent.output_validator
def output_validator_simple(data: str) -> str:
    if 'wrong' in data:
        raise ModelRetry('wrong response')
    return data

@agent.output_validator
async def output_validator_deps(ctx: RunContext[str], data: str) -> str:
    if ctx.deps in data:
        raise ModelRetry('wrong response')
    return data

result = agent.run_sync('foobar', deps='spam')
print(result.output)
#> success (no tool calls)
```

###### Returns

`_output.OutputValidatorFunc`\[`AgentDepsT`, `OutputDataT`\]

##### tool

```python
def tool(
    func: ToolFuncContext[AgentDepsT, ToolParams],
) -> ToolFuncContext[AgentDepsT, ToolParams]
def tool(
    name: str | None = None,
    description: str | None = None,
    retries: int | None = None,
    prepare: ToolPrepareFunc[AgentDepsT] | None = None,
    args_validator: ArgsValidatorFunc[AgentDepsT, ToolParams] | None = None,
    docstring_format: DocstringFormat = 'auto',
    require_parameter_descriptions: bool = False,
    schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema,
    strict: bool | None = None,
    sequential: bool = False,
    requires_approval: bool = False,
    metadata: dict[str, Any] | None = None,
    timeout: float | None = None,
    defer_loading: bool = False,
    include_return_schema: bool | None = None,
) -> Callable[[ToolFuncContext[AgentDepsT, ToolParams]], ToolFuncContext[AgentDepsT, ToolParams]]
```

Decorator to register a tool function which takes [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as its first argument.

Can decorate a sync or async functions.

The docstring is inspected to extract both the tool description and description of each parameter, [learn more](/docs/ai/tools-toolsets/tools#function-tools-and-schema).

We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@agent.tool` is obscured.

Example:

```python
from pydantic_ai import Agent, RunContext

agent = Agent('test', deps_type=int)

@agent.tool
def foobar(ctx: RunContext[int], x: int) -> int:
    return ctx.deps + x

@agent.tool(retries=2)
async def spam(ctx: RunContext[str], y: float) -> float:
    return ctx.deps + y

result = agent.run_sync('foobar', deps=1)
print(result.output)
#> {"foobar":1,"spam":1.0}
```

###### Returns

[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

###### Parameters

**`func`** : `ToolFuncContext`\[`AgentDepsT`, `ToolParams`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The tool function to register.

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The name of the tool, defaults to the function name.

**`description`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The description of the tool, defaults to the function docstring.

**`retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1.

**`prepare`** : `ToolPrepareFunc`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolPrepareFunc).

**`args_validator`** : `ArgsValidatorFunc`\[`AgentDepsT`, `ToolParams`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

custom method to validate tool arguments after schema validation has passed, before execution. The validator receives the already-validated and type-converted parameters, with `RunContext` as the first argument. Should raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) on validation failure, return `None` on success. See [`ArgsValidatorFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ArgsValidatorFunc).

**`docstring_format`** : `DocstringFormat` _Default:_ `'auto'`

The format of the docstring, see [`DocstringFormat`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DocstringFormat). Defaults to `'auto'`, such that the format is inferred from the structure of the docstring.

**`require_parameter_descriptions`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

If True, raise an error if a parameter description is missing. Defaults to False.

**`schema_generator`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`GenerateJsonSchema`\] _Default:_ `GenerateToolJsonSchema`

The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`.

**`strict`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) for more info.

**`sequential`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether the function requires a sequential/serial execution environment. Defaults to False.

**`requires_approval`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) for more info.

**`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization.

**`timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Overrides the agent-level `tool_timeout` if set. Defaults to None (no timeout).

**`defer_loading`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether to hide this tool until it's discovered via tool search. Defaults to False. See [Tool Search](/docs/ai/tools-toolsets/tools-advanced#tool-search) for more info.

**`include_return_schema`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether to include the return schema in the tool definition sent to the model. If `None`, defaults to `False` unless the [`IncludeToolReturnSchemas`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.IncludeToolReturnSchemas) capability is used.

##### tool\_plain

```python
def tool_plain(func: ToolFuncPlain[ToolParams]) -> ToolFuncPlain[ToolParams]
def tool_plain(
    name: str | None = None,
    description: str | None = None,
    retries: int | None = None,
    prepare: ToolPrepareFunc[AgentDepsT] | None = None,
    args_validator: ArgsValidatorFunc[AgentDepsT, ToolParams] | None = None,
    docstring_format: DocstringFormat = 'auto',
    require_parameter_descriptions: bool = False,
    schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema,
    strict: bool | None = None,
    sequential: bool = False,
    requires_approval: bool = False,
    metadata: dict[str, Any] | None = None,
    timeout: float | None = None,
    defer_loading: bool = False,
    include_return_schema: bool | None = None,
) -> Callable[[ToolFuncPlain[ToolParams]], ToolFuncPlain[ToolParams]]
```

Decorator to register a tool function which DOES NOT take `RunContext` as an argument.

Can decorate a sync or async functions.

The docstring is inspected to extract both the tool description and description of each parameter, [learn more](/docs/ai/tools-toolsets/tools#function-tools-and-schema).

We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@agent.tool` is obscured.

Example:

```python
from pydantic_ai import Agent, RunContext

agent = Agent('test')

@agent.tool
def foobar(ctx: RunContext[int]) -> int:
    return 123

@agent.tool(retries=2)
async def spam(ctx: RunContext[str]) -> float:
    return 3.14

result = agent.run_sync('foobar', deps=1)
print(result.output)
#> {"foobar":123,"spam":3.14}
```

###### Returns

[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

###### Parameters

**`func`** : `ToolFuncPlain`\[`ToolParams`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The tool function to register.

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The name of the tool, defaults to the function name.

**`description`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The description of the tool, defaults to the function docstring.

**`retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1.

**`prepare`** : `ToolPrepareFunc`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolPrepareFunc).

**`args_validator`** : `ArgsValidatorFunc`\[`AgentDepsT`, `ToolParams`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

custom method to validate tool arguments after schema validation has passed, before execution. The validator receives the already-validated and type-converted parameters, with [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as the first argument -- even though the tool function itself does not take `RunContext` when using `tool_plain`. Should raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) on validation failure, return `None` on success. See [`ArgsValidatorFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ArgsValidatorFunc).

**`docstring_format`** : `DocstringFormat` _Default:_ `'auto'`

The format of the docstring, see [`DocstringFormat`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DocstringFormat). Defaults to `'auto'`, such that the format is inferred from the structure of the docstring.

**`require_parameter_descriptions`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

If True, raise an error if a parameter description is missing. Defaults to False.

**`schema_generator`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`GenerateJsonSchema`\] _Default:_ `GenerateToolJsonSchema`

The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`.

**`strict`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) for more info.

**`sequential`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether the function requires a sequential/serial execution environment. Defaults to False.

**`requires_approval`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) for more info.

**`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization.

**`timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Overrides the agent-level `tool_timeout` if set. Defaults to None (no timeout).

**`defer_loading`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether to hide this tool until it's discovered via tool search. Defaults to False. See [Tool Search](/docs/ai/tools-toolsets/tools-advanced#tool-search) for more info.

**`include_return_schema`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether to include the return schema in the tool definition sent to the model. If `None`, defaults to `False` unless the [`IncludeToolReturnSchemas`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.IncludeToolReturnSchemas) capability is used.

##### toolset

```python
def toolset(func: ToolsetFunc[AgentDepsT]) -> ToolsetFunc[AgentDepsT]
def toolset(
    per_run_step: bool = True,
    id: str | None = None,
) -> Callable[[ToolsetFunc[AgentDepsT]], ToolsetFunc[AgentDepsT]]
```

Decorator to register a toolset function which takes [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as its only argument.

Can decorate a sync or async functions.

The decorator can be used bare (`agent.toolset`).

Example:

```python
from pydantic_ai import AbstractToolset, Agent, FunctionToolset, RunContext

agent = Agent('test', deps_type=str)

@agent.toolset
async def simple_toolset(ctx: RunContext[str]) -> AbstractToolset[str]:
    return FunctionToolset()
```

###### Returns

[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

###### Parameters

**`func`** : [`ToolsetFunc`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.ToolsetFunc)\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The toolset function to register.

**`per_run_step`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to re-evaluate the toolset for each run step. Defaults to True.

**`id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An optional unique ID for the dynamic toolset. Required for use with durable execution environments like Temporal, where the ID identifies the toolset's activities within the workflow.

##### \_\_aenter\_\_

`@async`

```python
def __aenter__() -> Self
```

Enter the agent context.

This will start all [`MCPServerStdio`s](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerStdio) registered as `toolsets` so they are ready to be used, and enter the model so the provider's HTTP client will be closed cleanly on exit.

This is a no-op if the agent has already been entered.

###### Returns

[`Self`](https://docs.python.org/3/library/typing.html#typing.Self)

##### set\_mcp\_sampling\_model

```python
def set_mcp_sampling_model(
    model: models.Model | models.KnownModelName | str | None = None,
) -> None
```

Set the sampling model on all MCP servers registered with the agent.

If no sampling model is provided, the agent's model will be used.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

##### to\_web

```python
def to_web(
    models: ModelsParam = None,
    builtin_tools: list[AbstractBuiltinTool] | None = None,
    deps: AgentDepsT = None,
    model_settings: ModelSettings | None = None,
    instructions: str | None = None,
    html_source: str | Path | None = None,
) -> Starlette
```

Create a Starlette app that serves a web chat UI for this agent.

This method returns a pre-configured Starlette application that provides a web-based chat interface for interacting with the agent. By default, the UI is fetched from a CDN and cached on first use.

The returned Starlette application can be mounted into a FastAPI app or run directly with any ASGI server (uvicorn, hypercorn, etc.).

Note that the `deps` and `model_settings` will be the same for each request. To provide different `deps` for each request use the lower-level adapters directly.

###### Returns

`Starlette` -- A configured Starlette application ready to be served (e.g., with uvicorn)

###### Parameters

**`models`** : `ModelsParam` _Default:_ `None`

Additional models to make available in the UI. Can be:

-   A sequence of model names/instances (e.g., `['openai:gpt-5', 'anthropic:claude-sonnet-4-6']`)
-   A dict mapping display labels to model names/instances (e.g., `{'GPT 5': 'openai:gpt-5', 'Claude': 'anthropic:claude-sonnet-4-6'}`) The agent's model is always included. Builtin tool support is automatically determined from each model's profile.

**`builtin_tools`** : [`list`](https://docs.python.org/3/glossary.html#term-list)\[`AbstractBuiltinTool`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Additional builtin tools to make available in the UI. The agent's configured builtin tools are always included. Tool labels in the UI are derived from the tool's `label` property.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for all requests.

**`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for all model requests.

**`instructions`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional extra instructions to pass to each agent run.

**`html_source`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `Path` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Path or URL for the chat UI HTML. Can be:

-   None (default): Fetches from CDN and caches locally
-   A Path instance: Reads from the local file
-   A URL string (http:// or https://): Fetches from the URL
-   A file path string: Reads from the local file

##### run\_mcp\_servers

`@async`

`@deprecated`

```python
def run_mcp_servers(
    model: models.Model | models.KnownModelName | str | None = None,
) -> AsyncIterator[None]
```

Run [`MCPServerStdio`s](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerStdio) so they can be used by the agent.

Deprecated: use [`async with agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__aenter__) instead. If you need to set a sampling model on all MCP servers, use [`agent.set_mcp_sampling_model()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.set_mcp_sampling_model).

Returns: a context manager to start and shutdown the servers.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]

### AbstractAgent

**Bases:** `Generic[AgentDepsT, OutputDataT]`, `ABC`

Abstract superclass for [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent), [`WrapperAgent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.WrapperAgent), and your own custom agent implementations.

#### Attributes

##### model

The default model configured for this agent.

**Type:** [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### name

The name of the agent, used for logging.

If `None`, we try to infer the agent name from the call frame when the agent is first run.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### description

A human-readable description of the agent.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### deps\_type

The type of dependencies used by the agent.

**Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)

##### output\_type

The type of data output by agent runs, used to validate the data returned by the model, defaults to `str`.

**Type:** `OutputSpec`\[`OutputDataT`\]

##### event\_stream\_handler

Optional handler for events from the model's streaming response and the agent's execution of tools.

**Type:** `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### root\_capability

The root capability of the agent, containing all registered capabilities.

**Type:** `CombinedCapability`\[`AgentDepsT`\]

##### toolsets

All toolsets registered on the agent.

Output tools are not included.

**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\]

#### Methods

##### system\_prompt\_parts

`@async`

```python
def system_prompt_parts(
    deps: AgentDepsT = None,
    model: models.Model | models.KnownModelName | str | None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    prompt: str | Sequence[_messages.UserContent] | None = None,
    usage: _usage.RunUsage | None = None,
    model_settings: ModelSettings | None = None,
) -> list[_messages.SystemPromptPart]
```

Resolve the agent's configured system prompts into `SystemPromptPart`s.

Returns a list suitable for prepending to a `ModelRequest`. Static strings and runners decorated with [`@agent.system_prompt`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.system_prompt) are evaluated using a minimal `RunContext` built from the provided kwargs -- useful when reconstructing a `message_history` that should carry the agent's configured system prompt (e.g. in UI adapters or after history compaction).

Dynamic runners produce parts with `dynamic_ref` set so they can continue to be re-evaluated by the standard agent graph path on subsequent turns.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.SystemPromptPart`\]

###### Parameters

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies for dynamic system prompt functions.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for `RunContext.model`. Falls back to the agent's configured model; required only if the agent has no model set.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional message history to expose as `RunContext.messages`.

**`prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional user prompt to expose as `RunContext.prompt`.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to expose as `RunContext.usage`.

**`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to expose as `RunContext.model_settings`.

##### output\_json\_schema

```python
def output_json_schema(
    output_type: OutputSpec[OutputDataT | RunOutputDataT] | None = None,
) -> JsonSchema
```

The output return JSON schema.

###### Returns

`JsonSchema`

##### run

`@async`

```python
def run(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentRunResult[OutputDataT]
def run(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentRunResult[RunOutputDataT]
```

Run the agent with a user prompt in async mode.

This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph to completion. The result of the run is returned.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

async def main():
    agent_run = await agent.run('What is the capital of France?')
    print(agent_run.output)
    #> The capital of France is Paris.
```

###### Returns

[`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of the run.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request, or a callable that receives [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns settings. Callables are called before each model request, allowing dynamic per-step settings.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run. At run time, spec values are additive.

##### run\_sync

```python
def run_sync(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentRunResult[OutputDataT]
def run_sync(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentRunResult[RunOutputDataT]
```

Synchronously run the agent with a user prompt.

This is a convenience method that wraps [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

result_sync = agent.run_sync('What is the capital of Italy?')
print(result_sync.output)
#> The capital of Italy is Rome.
```

###### Returns

[`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of the run.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request, or a callable that receives [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns settings. Callables are called before each model request, allowing dynamic per-step settings.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run. At run time, spec values are additive.

##### run\_stream

`@async`

```python
def run_stream(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AbstractAsyncContextManager[result.StreamedRunResult[AgentDepsT, OutputDataT]]
def run_stream(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AbstractAsyncContextManager[result.StreamedRunResult[AgentDepsT, RunOutputDataT]]
```

Run the agent with a user prompt in async streaming mode.

This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph until the model produces output matching the `output_type`, for example text or structured data. At this point, a streaming run result object is yielded from which you can stream the output as it comes in, and -- once this output has completed streaming -- get the complete output, message history, and usage.

As this method will consider the first output matching the `output_type` to be the final output, it will stop running the agent graph and will not execute any tool calls made by the model after this "final" output. If you want to always run the agent graph to completion and stream events and output at the same time, use [`agent.run()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) with an `event_stream_handler` or [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.iter) instead.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

async def main():
    async with agent.run_stream('What is the capital of the UK?') as response:
        print(await response.get_output())
        #> The capital of the UK is London.
```

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`result.StreamedRunResult`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult)\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- The result of the run.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request, or a callable that receives [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns settings. Callables are called before each model request, allowing dynamic per-step settings.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Note that it does _not_ receive any events after the final result is found.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run. At run time, spec values are additive.

##### run\_stream\_sync

```python
def run_stream_sync(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> result.StreamedRunResultSync[AgentDepsT, OutputDataT]
def run_stream_sync(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AbstractBuiltinTool] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> result.StreamedRunResultSync[AgentDepsT, RunOutputDataT]
```

Run the agent with a user prompt in sync streaming mode.

This is a convenience method that wraps [`run_stream()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream) with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop.

This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph until the model produces output matching the `output_type`, for example text or structured data. At this point, a streaming run result object is yielded from which you can stream the output as it comes in, and -- once this output has completed streaming -- get the complete output, message history, and usage.

As this method will consider the first output matching the `output_type` to be the final output, it will stop running the agent graph and will not execute any tool calls made by the model after this "final" output. If you want to always run the agent graph to completion and stream events and output at the same time, use [`agent.run()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) with an `event_stream_handler` or [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.iter) instead.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

def main():
    response = agent.run_stream_sync('What is the capital of the UK?')
    print(response.get_output())
    #> The capital of the UK is London.
```

###### Returns

[`result.StreamedRunResultSync`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResultSync)\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of the run.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request, or a callable that receives [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns settings. Callables are called before each model request, allowing dynamic per-step settings.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Note that it does _not_ receive any events after the final result is found.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run. At run time, spec values are additive.

##### run\_stream\_events

```python
def run_stream_events(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentEventStream[OutputDataT]
def run_stream_events(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentEventStream[RunOutputDataT]
```

Run the agent with a user prompt in async mode and stream events from the run.

This is a convenience method that wraps [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) and uses the `event_stream_handler` kwarg to get a stream of events from the run.

Example:

```python
from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent

agent = Agent('openai:gpt-5.2')

async def main():
    events: list[AgentStreamEvent | AgentRunResultEvent] = []
    async with agent.run_stream_events('What is the capital of France?') as stream:
        async for event in stream:
            events.append(event)
    print(events)
    '''
    [
        PartStartEvent(index=0, part=TextPart(content='The capital of ')),
        FinalResultEvent(tool_name=None, tool_call_id=None),
        PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')),
        PartEndEvent(
            index=0, part=TextPart(content='The capital of France is Paris. ')
        ),
        AgentRunResultEvent(
            result=AgentRunResult(output='The capital of France is Paris. ')
        ),
    ]
    '''
```

Arguments are the same as for [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), except that `event_stream_handler` is now allowed.

###### Returns

`AgentEventStream`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- An `AgentEventStream` async context manager yielding stream events `AgentStreamEvent` and finally a `AgentEventStream`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- `AgentRunResultEvent` with the final run result.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request, or a callable that receives [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns settings. Callables are called before each model request, allowing dynamic per-step settings.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run. At run time, spec values are additive.

##### iter

`@abstractmethod`

`@async`

```python
def iter(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]
def iter(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]
```

A contextmanager which can be used to iterate over the agent graph's nodes as they are executed.

This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools.

The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed.

For more details, see the documentation of `AgentRun`.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

async def main():
    nodes = []
    async with agent.iter('What is the capital of France?') as agent_run:
        async for node in agent_run:
            nodes.append(node)
    print(nodes)
    '''
    [
        UserPromptNode(
            user_prompt='What is the capital of France?',
            instructions_functions=[],
            system_prompts=(),
            system_prompt_functions=[],
            system_prompt_dynamic_functions={},
        ),
        ModelRequestNode(
            request=ModelRequest(
                parts=[
                    UserPromptPart(
                        content='What is the capital of France?',
                        timestamp=datetime.datetime(...),
                    )
                ],
                timestamp=datetime.datetime(...),
                run_id='...',
                conversation_id='...',
            )
        ),
        CallToolsNode(
            model_response=ModelResponse(
                parts=[TextPart(content='The capital of France is Paris.')],
                usage=RequestUsage(input_tokens=56, output_tokens=7),
                model_name='gpt-5.2',
                timestamp=datetime.datetime(...),
                run_id='...',
                conversation_id='...',
            )
        ),
        End(data=FinalResult(output='The capital of France is Paris.')),
    ]
    '''
    print(agent_run.result.output)
    #> The capital of France is Paris.
```

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`AgentRun`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun)\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- The result of the run.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request, or a callable that receives [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns settings. Callables are called before each model request, allowing dynamic per-step settings.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run. At run time, spec values are additive.

##### override

`@abstractmethod`

```python
def override(
    name: str | _utils.Unset = _utils.UNSET,
    deps: AgentDepsT | _utils.Unset = _utils.UNSET,
    model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET,
    tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | _utils.Unset = _utils.UNSET,
    instructions: _instructions.AgentInstructions[AgentDepsT] | _utils.Unset = _utils.UNSET,
    model_settings: AgentModelSettings[AgentDepsT] | _utils.Unset = _utils.UNSET,
    output_retries: int | _utils.Unset = _utils.UNSET,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> Iterator[None]
```

Context manager to temporarily override agent configuration.

This is particularly useful when testing. You can find an example of this [here](/docs/ai/guides/testing#overriding-model-via-pytest-fixtures).

###### Returns

[`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]

###### Parameters

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET`

The name to use instead of the name passed to the agent constructor and agent run.

**`deps`** : `AgentDepsT` | `_utils.Unset` _Default:_ `_utils.UNSET`

The dependencies to use instead of the dependencies passed to the agent run.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET`

The model to use instead of the model passed to the agent run.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The toolsets to use instead of the toolsets passed to the agent constructor and agent run.

**`tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[`AgentDepsT`\] | `ToolFuncEither`\[`AgentDepsT`, ...\]\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The tools to use instead of the tools registered with the agent.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The builtin tools to use instead of the agent's configured builtin tools. Per-run `builtin_tools` are still added to these.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The instructions to use instead of the instructions registered with the agent.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The model settings to use instead of the model settings passed to the agent constructor. When set, any per-run `model_settings` argument is ignored.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | `_utils.Unset` _Default:_ `_utils.UNSET`

The output-retry budget to use instead of the agent-level `output_retries`. When set, any per-run `output_retries` argument is ignored.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec providing defaults for override.

##### parallel\_tool\_call\_execution\_mode

`@staticmethod`

```python
def parallel_tool_call_execution_mode(
    mode: tool_manager.ParallelExecutionMode = 'parallel',
) -> Iterator[None]
```

Set the parallel execution mode during the context.

###### Returns

[`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]

###### Parameters

**`mode`** : `tool_manager.ParallelExecutionMode` _Default:_ `'parallel'`

The execution mode for tool calls:

-   'parallel': Run tool calls in parallel, yielding events as they complete (default).
-   'sequential': Run tool calls one at a time in order.
-   'parallel\_ordered\_events': Run tool calls in parallel, but events are emitted in order, after all calls complete.

##### sequential\_tool\_calls

`@deprecated`

`@staticmethod`

```python
def sequential_tool_calls() -> Iterator[None]
```

Run tool calls sequentially during the context.

###### Returns

[`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]

##### using\_thread\_executor

`@staticmethod`

```python
def using_thread_executor(executor: Executor) -> Iterator[None]
```

Use a custom executor for running sync functions in threads during the context.

By default, sync tool functions and other sync callbacks are run in threads using `anyio.to_thread.run_sync`, which creates ephemeral threads. In long-running servers (e.g. FastAPI), this can lead to thread accumulation under sustained load.

This context manager lets you provide a bounded [`ThreadPoolExecutor`](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor) (or any [`Executor`](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.Executor)) to control thread lifecycle:

```python
from concurrent.futures import ThreadPoolExecutor
from contextlib import asynccontextmanager

from pydantic_ai import Agent

@asynccontextmanager
async def lifespan(app):
    executor = ThreadPoolExecutor(max_workers=16)
    with Agent.using_thread_executor(executor):
        yield
    executor.shutdown(wait=True)
```

For per-agent configuration, use the [`ThreadExecutor`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ThreadExecutor) capability instead.

###### Returns

[`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]

###### Parameters

**`executor`** : `Executor`

The executor to use for running sync functions.

##### is\_model\_request\_node

`@staticmethod`

```python
def is_model_request_node(
    node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]],
) -> TypeIs[_agent_graph.ModelRequestNode[T, S]]
```

Check if the node is a `ModelRequestNode`, narrowing the type if it is.

This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`.

###### Returns

[`TypeIs`](https://docs.python.org/3/library/typing.html#typing.TypeIs)\[`_agent_graph.ModelRequestNode`\[`T`, `S`\]\]

##### is\_call\_tools\_node

`@staticmethod`

```python
def is_call_tools_node(
    node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]],
) -> TypeIs[_agent_graph.CallToolsNode[T, S]]
```

Check if the node is a `CallToolsNode`, narrowing the type if it is.

This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`.

###### Returns

[`TypeIs`](https://docs.python.org/3/library/typing.html#typing.TypeIs)\[`_agent_graph.CallToolsNode`\[`T`, `S`\]\]

##### is\_user\_prompt\_node

`@staticmethod`

```python
def is_user_prompt_node(
    node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]],
) -> TypeIs[_agent_graph.UserPromptNode[T, S]]
```

Check if the node is a `UserPromptNode`, narrowing the type if it is.

This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`.

###### Returns

[`TypeIs`](https://docs.python.org/3/library/typing.html#typing.TypeIs)\[`_agent_graph.UserPromptNode`\[`T`, `S`\]\]

##### is\_end\_node

`@staticmethod`

```python
def is_end_node(
    node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]],
) -> TypeIs[End[result.FinalResult[S]]]
```

Check if the node is a `End`, narrowing the type if it is.

This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`.

###### Returns

[`TypeIs`](https://docs.python.org/3/library/typing.html#typing.TypeIs)\[`End`\[`result.FinalResult`\[`S`\]\]\]

##### to\_ag\_ui

```python
def to_ag_ui(
    output_type: OutputSpec[OutputDataT] | None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    deps: AgentDepsT = None,
    model_settings: ModelSettings | None = None,
    usage_limits: UsageLimits | None = None,
    usage: RunUsage | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    debug: bool = False,
    routes: Sequence[BaseRoute] | None = None,
    middleware: Sequence[Middleware] | None = None,
    exception_handlers: Mapping[Any, ExceptionHandler] | None = None,
    on_startup: Sequence[Callable[[], Any]] | None = None,
    on_shutdown: Sequence[Callable[[], Any]] | None = None,
    lifespan: Lifespan[AGUIApp[AgentDepsT, OutputDataT]] | None = None,
) -> AGUIApp[AgentDepsT, OutputDataT]
```

Returns an ASGI application that handles every AG-UI request by running the agent.

Note that the `deps` will be the same for each request, with the exception of the AG-UI state that's injected into the `state` field of a `deps` object that implements the [`StateHandler`](/docs/ai/api/ui/base/#pydantic_ai.ui.StateHandler) protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use [`pydantic_ai.ag_ui.run_ag_ui`](/docs/ai/api/pydantic-ai/ag_ui/#pydantic_ai.ag_ui.run_ag_ui) or [`pydantic_ai.ag_ui.handle_ag_ui_request`](/docs/ai/api/pydantic-ai/ag_ui/#pydantic_ai.ag_ui.handle_ag_ui_request) instead.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')
app = agent.to_ag_ui()
```

The `app` is an ASGI application that can be used with any ASGI server.

To run the application, you can use the following command:

Terminal

```bash
uvicorn app:app --host 0.0.0.0 --port 8000
```

See [AG-UI docs](/docs/ai/integrations/ui/ag-ui) for more information.

###### Returns

`AGUIApp`\[`AgentDepsT`, `OutputDataT`\] -- An ASGI application for running Pydantic AI agents with AG-UI protocol support.

###### Parameters

**`output_type`** : `OutputSpec`\[`OutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : [`UsageLimits`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.UsageLimits) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`debug`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Boolean indicating if debug tracebacks should be returned on errors.

**`routes`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`BaseRoute`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A list of routes to serve incoming HTTP and WebSocket requests.

**`middleware`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`Middleware`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. `ServerErrorMiddleware` is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. `ExceptionMiddleware` is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints.

**`exception_handlers`** : [`Mapping`](https://docs.python.org/3/library/typing.html#typing.Mapping)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any), `ExceptionHandler`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form `handler(request, exc) -> response` and may be either standard functions, or async functions.

**`on_startup`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[\], [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions.

**`on_shutdown`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[\], [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions.

**`lifespan`** : `Lifespan`\[`AGUIApp`\[`AgentDepsT`, `OutputDataT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the `on_startup` and `on_shutdown` handlers. Use one or the other, not both.

##### to\_a2a

```python
def to_a2a(
    storage: Storage | None = None,
    broker: Broker | None = None,
    name: str | None = None,
    url: str = 'http://localhost:8000',
    version: str = '1.0.0',
    description: str | None = None,
    provider: AgentProvider | None = None,
    skills: list[Skill] | None = None,
    debug: bool = False,
    routes: Sequence[Route] | None = None,
    middleware: Sequence[Middleware] | None = None,
    exception_handlers: dict[Any, ExceptionHandler] | None = None,
    lifespan: Lifespan[FastA2A] | None = None,
) -> FastA2A
```

Convert the agent to a FastA2A application.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')
app = agent.to_a2a()
```

The `app` is an ASGI application that can be used with any ASGI server.

To run the application, you can use the following command:

Terminal

```bash
uvicorn app:app --host 0.0.0.0 --port 8000
```

###### Returns

`FastA2A`

##### to\_cli

`@async`

```python
def to_cli(
    deps: AgentDepsT = None,
    prog_name: str = 'pydantic-ai',
    message_history: Sequence[_messages.ModelMessage] | None = None,
    model_settings: ModelSettings | None = None,
    usage_limits: _usage.UsageLimits | None = None,
) -> None
```

Run the agent in a CLI chat interface.

Example:

agent\_to\_cli.py

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', instructions='You always respond in Italian.')

async def main():
    await agent.to_cli()
```

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`deps`** : `AgentDepsT` _Default:_ `None`

The dependencies to pass to the agent.

**`prog_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `'pydantic-ai'`

The name of the program to use for the CLI. Defaults to 'pydantic-ai'.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

##### to\_cli\_sync

```python
def to_cli_sync(
    deps: AgentDepsT = None,
    prog_name: str = 'pydantic-ai',
    message_history: Sequence[_messages.ModelMessage] | None = None,
    model_settings: ModelSettings | None = None,
    usage_limits: _usage.UsageLimits | None = None,
) -> None
```

Run the agent in a CLI chat interface with the non-async interface.

agent\_to\_cli\_sync.py

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', instructions='You always respond in Italian.')
agent.to_cli_sync()
agent.to_cli_sync(prog_name='assistant')
```

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`deps`** : `AgentDepsT` _Default:_ `None`

The dependencies to pass to the agent.

**`prog_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `'pydantic-ai'`

The name of the program to use for the CLI. Defaults to 'pydantic-ai'.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

### WrapperAgent

**Bases:** `AbstractAgent[AgentDepsT, OutputDataT]`

Agent which wraps another agent.

Does nothing on its own, used as a base class.

#### Methods

##### iter

`@async`

```python
def iter(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]
def iter(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]
```

A contextmanager which can be used to iterate over the agent graph's nodes as they are executed.

This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools.

The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed.

For more details, see the documentation of `AgentRun`.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

async def main():
    nodes = []
    async with agent.iter('What is the capital of France?') as agent_run:
        async for node in agent_run:
            nodes.append(node)
    print(nodes)
    '''
    [
        UserPromptNode(
            user_prompt='What is the capital of France?',
            instructions_functions=[],
            system_prompts=(),
            system_prompt_functions=[],
            system_prompt_dynamic_functions={},
        ),
        ModelRequestNode(
            request=ModelRequest(
                parts=[
                    UserPromptPart(
                        content='What is the capital of France?',
                        timestamp=datetime.datetime(...),
                    )
                ],
                timestamp=datetime.datetime(...),
                run_id='...',
                conversation_id='...',
            )
        ),
        CallToolsNode(
            model_response=ModelResponse(
                parts=[TextPart(content='The capital of France is Paris.')],
                usage=RequestUsage(input_tokens=56, output_tokens=7),
                model_name='gpt-5.2',
                timestamp=datetime.datetime(...),
                run_id='...',
                conversation_id='...',
            )
        ),
        End(data=FinalResult(output='The capital of France is Paris.')),
    ]
    '''
    print(agent_run.result.output)
    #> The capital of France is Paris.
```

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`AgentRun`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun)\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- The result of the run.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run.

##### override

```python
def override(
    name: str | _utils.Unset = _utils.UNSET,
    deps: AgentDepsT | _utils.Unset = _utils.UNSET,
    model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET,
    tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | _utils.Unset = _utils.UNSET,
    instructions: _instructions.AgentInstructions[AgentDepsT] | _utils.Unset = _utils.UNSET,
    model_settings: AgentModelSettings[AgentDepsT] | _utils.Unset = _utils.UNSET,
    output_retries: int | _utils.Unset = _utils.UNSET,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> Iterator[None]
```

Context manager to temporarily override agent configuration.

This is particularly useful when testing. You can find an example of this [here](/docs/ai/guides/testing#overriding-model-via-pytest-fixtures).

###### Returns

[`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]

###### Parameters

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET`

The name to use instead of the name passed to the agent constructor and agent run.

**`deps`** : `AgentDepsT` | `_utils.Unset` _Default:_ `_utils.UNSET`

The dependencies to use instead of the dependencies passed to the agent run.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET`

The model to use instead of the model passed to the agent run.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The toolsets to use instead of the toolsets passed to the agent constructor and agent run.

**`tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[`AgentDepsT`\] | `ToolFuncEither`\[`AgentDepsT`, ...\]\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The tools to use instead of the tools registered with the agent.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The builtin tools to use instead of the agent's configured builtin tools. Per-run `builtin_tools` are still added to these.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The instructions to use instead of the instructions registered with the agent.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The model settings to use instead of the model settings passed to the agent constructor. When set, any per-run `model_settings` argument is ignored.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | `_utils.Unset` _Default:_ `_utils.UNSET`

The output-retry budget to use instead of the agent-level `output_retries`. When set, any per-run `output_retries` argument is ignored.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply as overrides.

### AgentRun

**Bases:** `Generic[AgentDepsT, OutputDataT]`

A stateful, async-iterable run of an [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent).

You generally obtain an `AgentRun` instance by calling `async with my_agent.iter(...) as agent_run:`.

Once you have an instance, you can use it to iterate through the run's nodes as they execute. When an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) is reached, the run finishes and [`result`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun.result) becomes available.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

async def main():
    nodes = []
    # Iterate through the run, recording each node along the way:
    async with agent.iter('What is the capital of France?') as agent_run:
        async for node in agent_run:
            nodes.append(node)
    print(nodes)
    '''
    [
        UserPromptNode(
            user_prompt='What is the capital of France?',
            instructions_functions=[],
            system_prompts=(),
            system_prompt_functions=[],
            system_prompt_dynamic_functions={},
        ),
        ModelRequestNode(
            request=ModelRequest(
                parts=[
                    UserPromptPart(
                        content='What is the capital of France?',
                        timestamp=datetime.datetime(...),
                    )
                ],
                timestamp=datetime.datetime(...),
                run_id='...',
                conversation_id='...',
            )
        ),
        CallToolsNode(
            model_response=ModelResponse(
                parts=[TextPart(content='The capital of France is Paris.')],
                usage=RequestUsage(input_tokens=56, output_tokens=7),
                model_name='gpt-5.2',
                timestamp=datetime.datetime(...),
                run_id='...',
                conversation_id='...',
            )
        ),
        End(data=FinalResult(output='The capital of France is Paris.')),
    ]
    '''
    print(agent_run.result.output)
    #> The capital of France is Paris.
```

You can also manually drive the iteration using the [`next`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun.next) method for more granular control.

#### Attributes

##### ctx

The current context of the agent run.

**Type:** `GraphRunContext`\[`_agent_graph.GraphAgentState`, `_agent_graph.GraphAgentDeps`\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]

##### next\_node

The next node that will be run in the agent graph.

This is the next node that will be used during async iteration, or if a node is not passed to `self.next(...)`.

**Type:** `_agent_graph.AgentNode`\[`AgentDepsT`, `OutputDataT`\] | `End`\[`FinalResult`\[`OutputDataT`\]\]

##### result

The final result of the run if it has ended, otherwise `None`.

Once the run returns an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) node, `result` is populated with an [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult).

**Type:** [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[`OutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### metadata

Metadata associated with this agent run, if configured.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### run\_id

The unique identifier for the agent run.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### conversation\_id

The unique identifier for the conversation this run belongs to.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### all\_messages

```python
def all_messages() -> list[_messages.ModelMessage]
```

Return all messages for the run so far.

Messages from older runs are included.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\]

##### all\_messages\_json

```python
def all_messages_json(output_tool_return_content: str | None = None) -> bytes
```

Return all messages from [`all_messages`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun.all_messages) as JSON bytes.

###### Returns

[`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the messages.

##### new\_messages

```python
def new_messages() -> list[_messages.ModelMessage]
```

Return the messages produced during this run so far.

Messages provided via `message_history` and messages from older runs are excluded.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\]

##### new\_messages\_json

```python
def new_messages_json() -> bytes
```

Return new messages from [`new_messages`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun.new_messages) as JSON bytes.

###### Returns

[`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the new messages.

##### \_\_aiter\_\_

```python
def __aiter__(

) -> AsyncIterator[_agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]]
```

Provide async-iteration over the nodes in the agent run.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`_agent_graph.AgentNode`\[`AgentDepsT`, `OutputDataT`\] | `End`\[`FinalResult`\[`OutputDataT`\]\]\]

##### \_\_anext\_\_

`@async`

```python
def __anext__(

) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]
```

Advance to the next node automatically based on the last returned node.

Note: this uses the graph run's internal iteration which does NOT call node hooks (`before_node_run`, `wrap_node_run`, `after_node_run`, `on_node_run_error`). Use `next()` for capability-hooked iteration, or use `agent.run()` which drives via `next()` automatically.

###### Returns

`_agent_graph.AgentNode`\[`AgentDepsT`, `OutputDataT`\] | `End`\[`FinalResult`\[`OutputDataT`\]\]

##### next

`@async`

```python
def next(
    node: _agent_graph.AgentNode[AgentDepsT, OutputDataT],
) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]
```

Manually drive the agent run by passing in the node you want to run next.

This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The agent run should be stopped when you return an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) node.

Example:

```python
from pydantic_ai import Agent
from pydantic_graph import End

agent = Agent('openai:gpt-5.2')

async def main():
    async with agent.iter('What is the capital of France?') as agent_run:
        next_node = agent_run.next_node  # start with the first node
        nodes = [next_node]
        while not isinstance(next_node, End):
            next_node = await agent_run.next(next_node)
            nodes.append(next_node)
        # Once `next_node` is an End, we've finished:
        print(nodes)
        '''
        [
            UserPromptNode(
                user_prompt='What is the capital of France?',
                instructions_functions=[],
                system_prompts=(),
                system_prompt_functions=[],
                system_prompt_dynamic_functions={},
            ),
            ModelRequestNode(
                request=ModelRequest(
                    parts=[
                        UserPromptPart(
                            content='What is the capital of France?',
                            timestamp=datetime.datetime(...),
                        )
                    ],
                    timestamp=datetime.datetime(...),
                    run_id='...',
                    conversation_id='...',
                )
            ),
            CallToolsNode(
                model_response=ModelResponse(
                    parts=[TextPart(content='The capital of France is Paris.')],
                    usage=RequestUsage(input_tokens=56, output_tokens=7),
                    model_name='gpt-5.2',
                    timestamp=datetime.datetime(...),
                    run_id='...',
                    conversation_id='...',
                )
            ),
            End(data=FinalResult(output='The capital of France is Paris.')),
        ]
        '''
        print('Final result:', agent_run.result.output)
        #> Final result: The capital of France is Paris.
```

###### Returns

`_agent_graph.AgentNode`\[`AgentDepsT`, `OutputDataT`\] | `End`\[`FinalResult`\[`OutputDataT`\]\] -- The next node returned by the graph logic, or an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) node if `_agent_graph.AgentNode`\[`AgentDepsT`, `OutputDataT`\] | `End`\[`FinalResult`\[`OutputDataT`\]\] -- the run has completed.

###### Parameters

**`node`** : `_agent_graph.AgentNode`\[`AgentDepsT`, `OutputDataT`\]

The node to run next in the graph.

##### usage

```python
def usage() -> _usage.RunUsage
```

Get usage statistics for the run so far, including token usage, model requests, and so on.

###### Returns

`_usage.RunUsage`

### AgentRunResult

**Bases:** `Generic[OutputDataT]`

The final result of an agent run.

#### Attributes

##### output

The output data from the agent run.

**Type:** `OutputDataT`

##### response

Return the last response from the message history.

**Type:** `_messages.ModelResponse`

##### metadata

Metadata associated with this agent run, if configured.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### run\_id

The unique identifier for the agent run.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### conversation\_id

The unique identifier for the conversation this run belongs to.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### all\_messages

```python
def all_messages(
    output_tool_return_content: str | None = None,
) -> list[_messages.ModelMessage]
```

Return the history of \_messages.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\] -- List of messages.

###### Parameters

**`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified.

##### all\_messages\_json

```python
def all_messages_json(output_tool_return_content: str | None = None) -> bytes
```

Return all messages from [`all_messages`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.all_messages) as JSON bytes.

###### Returns

[`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the messages.

###### Parameters

**`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified.

##### new\_messages

```python
def new_messages(
    output_tool_return_content: str | None = None,
) -> list[_messages.ModelMessage]
```

Return the messages produced during this run.

Messages provided via `message_history` and messages from older runs are excluded.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\] -- List of new messages.

###### Parameters

**`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified.

##### new\_messages\_json

```python
def new_messages_json(output_tool_return_content: str | None = None) -> bytes
```

Return new messages from [`new_messages`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.new_messages) as JSON bytes.

###### Returns

[`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the new messages.

###### Parameters

**`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified.

##### usage

```python
def usage() -> _usage.RunUsage
```

Return the usage of the whole run.

###### Returns

`_usage.RunUsage`

##### timestamp

```python
def timestamp() -> datetime
```

Return the timestamp of last response.

###### Returns

[`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime)

### InstrumentationSettings

Options for instrumenting models and agents with OpenTelemetry.

Used in:

-   `Agent(instrument=...)`
-   [`Agent.instrument_all()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.instrument_all)
-   [`InstrumentedModel`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentedModel)

See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info.

#### Methods

##### \_\_init\_\_

```python
def __init__(
    tracer_provider: TracerProvider | None = None,
    meter_provider: MeterProvider | None = None,
    include_binary_content: bool = True,
    include_content: bool = True,
    version: Literal[1, 2, 3, 4, 5] = DEFAULT_INSTRUMENTATION_VERSION,
    event_mode: Literal['attributes', 'logs'] = 'attributes',
    logger_provider: LoggerProvider | None = None,
    use_aggregated_usage_attribute_names: bool = False,
)
```

Create instrumentation options.

###### Parameters

**`tracer_provider`** : `TracerProvider` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The OpenTelemetry tracer provider to use. If not provided, the global tracer provider is used. Calling `logfire.configure()` sets the global tracer provider, so most users don't need this.

**`meter_provider`** : `MeterProvider` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The OpenTelemetry meter provider to use. If not provided, the global meter provider is used. Calling `logfire.configure()` sets the global meter provider, so most users don't need this.

**`include_binary_content`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to include binary content in the instrumentation events.

**`include_content`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to include prompts, completions, and tool call arguments and responses in the instrumentation events.

**`version`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\[1, 2, 3, 4, 5\] _Default:_ `DEFAULT_INSTRUMENTATION_VERSION`

Version of the data format. This is unrelated to the Pydantic AI package version. Version 1 is based on the legacy event-based OpenTelemetry GenAI spec and will be removed in a future release. The parameters `event_mode` and `logger_provider` are only relevant for version 1. Version 2 uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes:

-   `gen_ai.system_instructions` for instructions passed to the agent.
-   `gen_ai.input.messages` and `gen_ai.output.messages` on model request spans.
-   `pydantic_ai.all_messages` on agent run spans. Version 3 is the same as version 2, with additional support for thinking tokens. Version 4 is the same as version 3, with GenAI semantic conventions for multimodal content: URL-based media uses type='uri' with uri and mime\_type fields (and modality for image/audio/video). Inline binary content uses type='blob' with mime\_type and content fields (and modality for image/audio/video). [https://opentelemetry.io/docs/specs/semconv/gen-ai/non-normative/examples-llm-calls/#multimodal-inputs-example](https://opentelemetry.io/docs/specs/semconv/gen-ai/non-normative/examples-llm-calls/#multimodal-inputs-example) Version 5 is the same as version 4, but CallDeferred and ApprovalRequired exceptions no longer record an exception event or set the span status to ERROR -- the span is left as UNSET, since deferrals are control flow, not errors.

**`event_mode`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['attributes', 'logs'\] _Default:_ `'attributes'`

The mode for emitting events in version 1. If `'attributes'`, events are attached to the span as attributes. If `'logs'`, events are emitted as OpenTelemetry log-based events.

**`logger_provider`** : `LoggerProvider` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The OpenTelemetry logger provider to use. If not provided, the global logger provider is used. Calling `logfire.configure()` sets the global logger provider, so most users don't need this. This is only used if `event_mode='logs'` and `version=1`.

**`use_aggregated_usage_attribute_names`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether to use `gen_ai.aggregated_usage.*` attribute names for token usage on agent run spans instead of the standard `gen_ai.usage.*` names. Enable this to prevent double-counting in observability backends that aggregate span attributes across parent and child spans. Defaults to False. Note: `gen_ai.aggregated_usage.*` is a custom namespace, not part of the OpenTelemetry Semantic Conventions. It may be updated if OTel introduces an official convention.

##### messages\_to\_otel\_events

```python
def messages_to_otel_events(
    messages: list[ModelMessage],
    parameters: ModelRequestParameters | None = None,
) -> list[LogRecord]
```

Convert a list of model messages to OpenTelemetry events.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`LogRecord`\] -- A list of OpenTelemetry events.

###### Parameters

**`messages`** : [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\]

The messages to convert.

**`parameters`** : `ModelRequestParameters` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model request parameters.

### capture\_run\_messages

```python
def capture_run_messages() -> Iterator[list[_messages.ModelMessage]]
```

Context manager to access the messages used in a [`run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), [`run_sync`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_sync), or [`run_stream`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream) call.

Useful when a run may raise an exception, see [model errors](/docs/ai/core-concepts/agent#model-errors) for more information.

Examples:

```python
from pydantic_ai import Agent, capture_run_messages

agent = Agent('test')

with capture_run_messages() as messages:
    try:
        result = agent.run_sync('foobar')
    except Exception:
        print(messages)
        raise
```

Note

If you call `run`, `run_sync`, or `run_stream` more than once within a single `capture_run_messages` context, `messages` will represent the messages exchanged during the first call only.

#### Returns

[`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\]\]

### EndStrategy

**Default:** `Literal['early', 'graceful', 'exhaustive']`

### RunOutputDataT

Type variable for the result data of a run where `output_type` was customized on the run call.

**Default:** `TypeVar('RunOutputDataT')`

### EventStreamHandler

A function that receives agent [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and an async iterable of events from the model's streaming response and the agent's execution of tools.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[RunContext[AgentDepsT], AsyncIterable[_messages.AgentStreamEvent]], Awaitable[None]]`

---

# [pydantic_ai.builtin_tools](https://pydantic.dev/docs/ai/api/pydantic-ai/builtin_tools/)

# pydantic\_ai.builtin\_tools

### AbstractBuiltinTool

**Bases:** `ABC`

A builtin tool that can be used by an agent.

This class is abstract and cannot be instantiated directly.

The builtin tools are passed to the model as part of the `ModelRequestParameters`.

#### Attributes

##### kind

Built-in tool identifier, this should be available on all built-in tools as a discriminator.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `'unknown_builtin_tool'`

##### unique\_id

A unique identifier for the builtin tool.

If multiple instances of the same builtin tool can be passed to the model, subclasses should override this property to allow them to be distinguished.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### label

Human-readable label for UI display.

Subclasses should override this to provide a meaningful label.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

### WebSearchTool

**Bases:** `AbstractBuiltinTool`

A builtin tool that allows your agent to search the web for information.

The parameters that PydanticAI passes depend on the model, as some parameters may not be supported by certain models.

Supported by:

-   Anthropic
-   OpenAI Responses
-   Groq
-   Google
-   xAI
-   OpenRouter

#### Attributes

##### search\_context\_size

The `search_context_size` parameter controls how much context is retrieved from the web to help the tool formulate a response.

Supported by:

-   OpenAI Responses
-   OpenRouter

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['low', 'medium', 'high'\] **Default:** `'medium'`

##### user\_location

The `user_location` parameter allows you to localize search results based on a user's location.

Supported by:

-   Anthropic
-   OpenAI Responses

**Type:** [`WebSearchUserLocation`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.WebSearchUserLocation) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### blocked\_domains

If provided, these domains will never appear in results.

With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both.

Supported by:

-   Anthropic, see [https://docs.anthropic.com/en/docs/build-with-claude/tool-use/web-search-tool#domain-filtering](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/web-search-tool#domain-filtering)
-   Groq, see [https://console.groq.com/docs/agentic-tooling#search-settings](https://console.groq.com/docs/agentic-tooling#search-settings)
-   xAI, see [https://docs.x.ai/docs/guides/tools/search-tools#web-search-parameters](https://docs.x.ai/docs/guides/tools/search-tools#web-search-parameters)

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### allowed\_domains

If provided, only these domains will be included in results.

With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both.

Supported by:

-   Anthropic, see [https://docs.anthropic.com/en/docs/build-with-claude/tool-use/web-search-tool#domain-filtering](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/web-search-tool#domain-filtering)
-   Groq, see [https://console.groq.com/docs/agentic-tooling#search-settings](https://console.groq.com/docs/agentic-tooling#search-settings)
-   OpenAI Responses, see [https://platform.openai.com/docs/guides/tools-web-search](https://platform.openai.com/docs/guides/tools-web-search)
-   xAI, see [https://docs.x.ai/docs/guides/tools/search-tools#web-search-parameters](https://docs.x.ai/docs/guides/tools/search-tools#web-search-parameters)

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### max\_uses

If provided, the tool will stop searching the web after the given number of uses.

Supported by:

-   Anthropic

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### kind

The kind of tool.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `'web_search'`

### WebSearchUserLocation

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Allows you to localize search results based on a user's location.

Supported by:

-   Anthropic
-   OpenAI Responses

#### Attributes

##### city

The city where the user is located.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### country

The country where the user is located. For OpenAI, this must be a 2-letter country code (e.g., 'US', 'GB').

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### region

The region or state where the user is located.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### timezone

The timezone of the user's location.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

### XSearchTool

**Bases:** `AbstractBuiltinTool`

A builtin tool that allows your agent to search X/Twitter for posts and content.

This tool is exclusive to xAI models. See [https://docs.x.ai/developers/tools/x-search](https://docs.x.ai/developers/tools/x-search) for more details.

Supported by:

-   xAI

#### Attributes

##### allowed\_x\_handles

If provided, only posts from these X handles will be included (max 10).

Supported by:

-   xAI, see [https://docs.x.ai/developers/tools/x-search](https://docs.x.ai/developers/tools/x-search)

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### excluded\_x\_handles

If provided, posts from these X handles will be excluded (max 10).

Supported by:

-   xAI, see [https://docs.x.ai/developers/tools/x-search](https://docs.x.ai/developers/tools/x-search)

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### from\_date

If provided, only posts created on or after this datetime will be included.

Naive datetimes are interpreted as UTC by the xAI API.

Supported by:

-   xAI, see [https://docs.x.ai/developers/tools/x-search](https://docs.x.ai/developers/tools/x-search)

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### to\_date

If provided, only posts created on or before this datetime will be included.

Naive datetimes are interpreted as UTC by the xAI API.

Supported by:

-   xAI, see [https://docs.x.ai/developers/tools/x-search](https://docs.x.ai/developers/tools/x-search)

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### enable\_image\_understanding

Enable image analysis from X posts.

Supported by:

-   xAI, see [https://docs.x.ai/developers/tools/x-search](https://docs.x.ai/developers/tools/x-search)

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### enable\_video\_understanding

Enable video analysis from X content.

Supported by:

-   xAI, see [https://docs.x.ai/developers/tools/x-search](https://docs.x.ai/developers/tools/x-search)

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### include\_output

Include raw X search results in the response as [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart).

Without this, the model uses the search results internally but only returns its text summary. Enabling it gives programmatic access to searched posts, sources, and metadata.

Can also be set via [`XaiModelSettings.xai_include_x_search_output`](/docs/ai/api/models/xai/#pydantic_ai.models.xai.XaiModelSettings.xai_include_x_search_output).

Supported by:

-   xAI, see [https://docs.x.ai/developers/tools/x-search](https://docs.x.ai/developers/tools/x-search)

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### kind

The kind of tool.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `'x_search'`

### CodeExecutionTool

**Bases:** `AbstractBuiltinTool`

A builtin tool that allows your agent to execute code.

Supported by:

-   Anthropic
-   OpenAI Responses
-   Google
-   Bedrock (Nova2.0)
-   xAI

#### Attributes

##### kind

The kind of tool.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `'code_execution'`

### WebFetchTool

**Bases:** `AbstractBuiltinTool`

Allows your agent to access contents from URLs.

The parameters that PydanticAI passes depend on the model, as some parameters may not be supported by certain models.

Supported by:

-   Anthropic
-   Google

#### Attributes

##### max\_uses

If provided, the tool will stop fetching URLs after the given number of uses.

Supported by:

-   Anthropic

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### allowed\_domains

If provided, only these domains will be fetched.

With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both.

Supported by:

-   Anthropic, see [https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/web-fetch-tool#domain-filtering](https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/web-fetch-tool#domain-filtering)

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### blocked\_domains

If provided, these domains will never be fetched.

With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both.

Supported by:

-   Anthropic, see [https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/web-fetch-tool#domain-filtering](https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/web-fetch-tool#domain-filtering)

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### enable\_citations

If True, enables citations for fetched content.

Supported by:

-   Anthropic

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### max\_content\_tokens

Maximum content length in tokens for fetched content.

Supported by:

-   Anthropic

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### kind

The kind of tool.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `'web_fetch'`

### UrlContextTool

**Bases:** [`WebFetchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.WebFetchTool)

Deprecated alias for WebFetchTool. Use WebFetchTool instead.

Overrides kind to 'url\_context' so old serialized payloads with {"kind": "url\_context", ...} can be deserialized to UrlContextTool for backward compatibility.

#### Attributes

##### kind

The kind of tool (deprecated value for backward compatibility).

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `'url_context'`

### ImageGenerationTool

**Bases:** `AbstractBuiltinTool`

A builtin tool that allows your agent to generate images.

Supported by:

-   OpenAI Responses
-   Google

#### Attributes

##### action

Whether to generate a new image or edit an existing image.

Supported by:

-   OpenAI Responses. Default: 'auto'.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['generate', 'edit', 'auto'\] **Default:** `'auto'`

##### background

Background type for the generated image.

Supported by:

-   OpenAI Responses. 'transparent' is only supported for 'png' and 'webp' output formats.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['transparent', 'opaque', 'auto'\] **Default:** `'auto'`

##### input\_fidelity

Control how much effort the model will exert to match the style and features, especially facial features, of input images.

Supported by:

-   OpenAI Responses. Default: 'low'.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['high', 'low'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### moderation

Moderation level for the generated image.

Supported by:

-   OpenAI Responses

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['auto', 'low'\] **Default:** `'auto'`

##### model

The image generation model to use.

Supported by:

-   OpenAI Responses. Defaults to the provider's image generation model selection. Known image generation models include `gpt-image-2`, `gpt-image-1.5`, `gpt-image-1`, and `gpt-image-1-mini`.

This selects the underlying image generation model used by the tool; it does not change the agent's conversational model.

**Type:** `ImageGenerationModelName` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### output\_compression

Compression level for the output image.

Supported by:

-   OpenAI Responses. Only supported for 'jpeg' and 'webp' output formats. Default: 100.
-   Google (Vertex AI only). Only supported for 'jpeg' output format. Default: 75. Setting this will default `output_format` to 'jpeg' if not specified.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### output\_format

The output format of the generated image.

Supported by:

-   OpenAI Responses. Default: 'png'.
-   Google (Vertex AI only). Default: 'png', or 'jpeg' if `output_compression` is set.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['png', 'webp', 'jpeg'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### partial\_images

Number of partial images to generate in streaming mode.

Supported by:

-   OpenAI Responses. Supports 0 to 3.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0`

##### quality

The quality of the generated image.

Supported by:

-   OpenAI Responses

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['low', 'medium', 'high', 'auto'\] **Default:** `'auto'`

##### size

The size of the generated image.

-   OpenAI Responses: 'auto' (default: model selects the size based on the prompt), '1024x1024', '1024x1536', '1536x1024'
-   Google (Gemini 3 Pro Image and later): '512' (Gemini 3.1 Flash Image only), '1K' (default), '2K', '4K'

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['auto', '1024x1024', '1024x1536', '1536x1024', '512', '1K', '2K', '4K'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### aspect\_ratio

The aspect ratio to use for generated images.

Supported by:

-   Google image-generation models (Gemini)
-   OpenAI Responses (maps '1:1', '2:3', and '3:2' to supported sizes)

**Type:** `ImageAspectRatio` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### kind

The kind of tool.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `'image_generation'`

### MemoryTool

**Bases:** `AbstractBuiltinTool`

A builtin tool that allows your agent to use memory.

Supported by:

-   Anthropic

#### Attributes

##### kind

The kind of tool.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `'memory'`

### MCPServerTool

**Bases:** `AbstractBuiltinTool`

A builtin tool that allows your agent to use MCP servers.

Supported by:

-   OpenAI Responses
-   Anthropic
-   xAI

#### Attributes

##### id

A unique identifier for the MCP server.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### url

The URL of the MCP server to use.

For OpenAI Responses, it is possible to use `connector_id` by providing it as `x-openai-connector:<connector_id>`.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### authorization\_token

Authorization header to use when making requests to the MCP server.

Supported by:

-   OpenAI Responses
-   Anthropic
-   xAI

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### description

A description of the MCP server.

Supported by:

-   OpenAI Responses
-   xAI

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### allowed\_tools

A list of tools that the MCP server can use.

Supported by:

-   OpenAI Responses
-   Anthropic
-   xAI

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### headers

Optional HTTP headers to send to the MCP server.

Use for authentication or other purposes.

Supported by:

-   OpenAI Responses
-   xAI

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

### FileSearchTool

**Bases:** `AbstractBuiltinTool`

A builtin tool that allows your agent to search through uploaded files using vector search.

This tool provides a fully managed Retrieval-Augmented Generation (RAG) system that handles file storage, chunking, embedding generation, and context injection into prompts.

Supported by:

-   OpenAI Responses
-   Google (Gemini)
-   xAI (mapped to collections search)

#### Attributes

##### file\_store\_ids

The file store IDs to search through.

For OpenAI, these are the IDs of vector stores created via the OpenAI API. For Google, these are file search store names that have been uploaded and processed via the Gemini Files API. For xAI, these are collection IDs for the xAI collections search tool.

**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### kind

The kind of tool.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `'file_search'`

### BUILTIN\_TOOL\_TYPES

Registry of all builtin tool types, keyed by their kind string.

This dict is populated automatically via `__init_subclass__` when tool classes are defined.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\] **Default:** `{}`

### ImageAspectRatio

Supported aspect ratios for image generation tools.

**Default:** `Literal['21:9', '16:9', '4:3', '3:2', '1:1', '9:16', '3:4', '2:3', '5:4', '4:5']`

### ImageGenerationModelName

Known OpenAI image generation model names, or another OpenAI image model ID.

**Default:** `Literal['gpt-image-2', 'gpt-image-1.5', 'gpt-image-1', 'gpt-image-1-mini'] | str`

### DEPRECATED\_BUILTIN\_TOOLS

Set of deprecated builtin tool IDs that should not be offered in new UIs.

**Type:** [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\] **Default:** `frozenset({UrlContextTool})`

### SUPPORTED\_BUILTIN\_TOOLS

Get the set of all builtin tool types (excluding deprecated tools).

**Default:** `frozenset(cls for cls in (BUILTIN_TOOL_TYPES.values()) if cls not in DEPRECATED_BUILTIN_TOOLS)`

---

# [pydantic_ai.capabilities](https://pydantic.dev/docs/ai/api/pydantic-ai/capabilities/)

# pydantic\_ai.capabilities

### Toolset

**Bases:** `AbstractCapability[AgentDepsT]`

A capability that provides a toolset.

### Thinking

**Bases:** `AbstractCapability[Any]`

Enables and configures model thinking/reasoning.

Uses the unified `thinking` setting in [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) to work portably across providers. Provider-specific thinking settings (e.g., `anthropic_thinking`, `openai_reasoning_effort`) take precedence when both are set.

#### Attributes

##### effort

The thinking effort level.

-   `True`: Enable thinking with the provider's default effort.
-   `False`: Disable thinking (silently ignored on always-on models).
-   `'minimal'`/`'low'`/`'medium'`/`'high'`/`'xhigh'`: Enable thinking at a specific effort level.

**Type:** `ThinkingLevel` **Default:** `True`

### PrepareTools

**Bases:** `AbstractCapability[AgentDepsT]`

Capability that filters or modifies function tool definitions using a callable.

Wraps a [`ToolsPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolsPrepareFunc) as a capability -- sugar for the agent-level `prepare_tools=` constructor argument, which injects this capability automatically. Filters/modifies **function** tools only; for output tools use [`PrepareOutputTools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.PrepareOutputTools).

```python
from pydantic_ai import Agent, RunContext
from pydantic_ai.capabilities import PrepareTools
from pydantic_ai.tools import ToolDefinition


async def hide_admin_tools(
    ctx: RunContext[None], tool_defs: list[ToolDefinition]
) -> list[ToolDefinition] | None:
    return [td for td in tool_defs if not td.name.startswith('admin_')]


agent = Agent('openai:gpt-5', capabilities=[PrepareTools(hide_admin_tools)])
```

### HandleDeferredToolCalls

**Bases:** `AbstractCapability[AgentDepsT]`

Resolves deferred tool calls inline during an agent run using a handler function.

When tools require approval or external execution, the agent normally pauses the run and returns [`DeferredToolRequests`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolRequests) as output. This capability intercepts deferred tool calls, calls the provided handler to resolve them, and continues the agent run automatically.

The handler receives the [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and the [`DeferredToolRequests`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolRequests). It may return [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) with results for some or all pending calls, or return `None` to decline handling (the next capability in the chain gets a chance, otherwise the calls bubble up as `DeferredToolRequests` output).

#### Attributes

##### handler

The handler function that resolves deferred tool requests.

Receives the run context and the deferred tool requests, and returns [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) with results for some or all pending calls, or `None` to decline handling. Can be sync or async.

**Type:** [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)\[`AgentDepsT`\], [`DeferredToolRequests`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolRequests)\], [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) | [`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[[`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None)\]\]

### WebFetch

**Bases:** `BuiltinOrLocalTool[AgentDepsT]`

URL fetching capability.

Uses the model's builtin URL fetching when available, falling back to a local function tool (markdownify-based fetch by default) when it isn't.

The local fallback requires the `web-fetch` optional group::

pip install "pydantic-ai-slim\[web-fetch\]"

#### Attributes

##### allowed\_domains

Only fetch from these domains. Enforced locally when builtin is unavailable.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `allowed_domains`

##### blocked\_domains

Never fetch from these domains. Enforced locally when builtin is unavailable.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `blocked_domains`

##### max\_uses

Maximum number of fetches per run. Requires builtin support.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `max_uses`

##### enable\_citations

Enable citations for fetched content. Builtin-only; ignored by local tools.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `enable_citations`

##### max\_content\_tokens

Maximum content length in tokens. Builtin-only; ignored by local tools.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `max_content_tokens`

### WebSearch

**Bases:** `BuiltinOrLocalTool[AgentDepsT]`

Web search capability.

Uses the model's builtin web search when available, falling back to a local function tool (DuckDuckGo by default) when it isn't.

#### Attributes

##### search\_context\_size

Controls how much context is retrieved from the web. Builtin-only; ignored by local tools.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['low', 'medium', 'high'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `search_context_size`

##### user\_location

Localize search results based on user location. Builtin-only; ignored by local tools.

**Type:** [`WebSearchUserLocation`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.WebSearchUserLocation) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `user_location`

##### blocked\_domains

Domains to exclude from results. Requires builtin support.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `blocked_domains`

##### allowed\_domains

Only include results from these domains. Requires builtin support.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `allowed_domains`

##### max\_uses

Maximum number of web searches per run. Requires builtin support.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `max_uses`

### IncludeToolReturnSchemas

**Bases:** `AbstractCapability[AgentDepsT]`

Capability that includes return schemas for selected tools.

When added to an agent's capabilities, this sets [`include_return_schema`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition.include_return_schema) to `True` on matching tool definitions, causing the model to receive return type information for those tools.

For models that natively support return schemas (e.g. Google Gemini), the schema is passed as a structured field. For other models, it is injected into the tool description as JSON text.

Per-tool overrides (`Tool(..., include_return_schema=False)`) take precedence -- this capability only sets the flag on tools that haven't explicitly opted out.

```python
from pydantic_ai import Agent
from pydantic_ai.capabilities import IncludeToolReturnSchemas

agent = Agent('openai:gpt-5', capabilities=[IncludeToolReturnSchemas()])
```

#### Attributes

##### tools

Which tools should have their return schemas included.

-   `'all'` (default): every tool gets its return schema included.
-   `Sequence[str]`: only tools whose names are listed.
-   `dict[str, Any]`: matches tools whose metadata deeply includes the specified key-value pairs.
-   Callable `(ctx, tool_def) -> bool`: custom sync or async predicate.

**Type:** `ToolSelector`\[`AgentDepsT`\] **Default:** `'all'`

### PrefixTools

**Bases:** `WrapperCapability[AgentDepsT]`

A capability that wraps another capability and prefixes its tool names.

Only the wrapped capability's tools are prefixed; other agent tools are unaffected.

```python
from pydantic_ai import Agent
from pydantic_ai.capabilities import PrefixTools, Toolset
from pydantic_ai.toolsets import FunctionToolset

toolset = FunctionToolset()

agent = Agent(
    'openai:gpt-5',
    capabilities=[
        PrefixTools(
            wrapped=Toolset(toolset),
            prefix='ns',
        ),
    ],
)
```

#### Methods

##### from\_spec

`@classmethod`

```python
def from_spec(cls, prefix: str, capability: CapabilitySpec) -> PrefixTools[Any]
```

Create from spec with a nested capability specification.

###### Returns

`PrefixTools`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

###### Parameters

**`prefix`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The prefix to add to tool names (e.g. `'mcp'` turns `'search'` into `'mcp_search'`).

**`capability`** : `CapabilitySpec`

A capability spec (same format as entries in the `capabilities` list).

### SetToolMetadata

**Bases:** `AbstractCapability[AgentDepsT]`

Capability that merges metadata key-value pairs onto selected tools.

```python
from pydantic_ai import Agent
from pydantic_ai.capabilities import SetToolMetadata

agent = Agent('openai:gpt-5', capabilities=[SetToolMetadata(code_mode=True)])
```

### ThreadExecutor

**Bases:** `AbstractCapability[Any]`

Use a custom executor for running sync functions in threads.

By default, sync tool functions and other sync callbacks are run in threads using `anyio.to_thread.run_sync`, which creates ephemeral threads. In long-running servers (e.g. FastAPI), this can lead to thread accumulation under sustained load.

This capability provides a bounded [`ThreadPoolExecutor`](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor) (or any [`Executor`](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.Executor)) to use instead, scoped to agent runs:

```python
from concurrent.futures import ThreadPoolExecutor

from pydantic_ai import Agent
from pydantic_ai.capabilities import ThreadExecutor

executor = ThreadPoolExecutor(max_workers=16, thread_name_prefix='agent-worker')
agent = Agent('openai:gpt-5.2', capabilities=[ThreadExecutor(executor)])
```

To set an executor for all agents globally, use [`Agent.using_thread_executor()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.using_thread_executor).

#### Attributes

##### executor

The executor to use for running sync functions.

**Type:** `Executor`

### BuiltinOrLocalTool

**Bases:** `AbstractCapability[AgentDepsT]`

Capability that pairs a provider builtin tool with a local fallback.

When the model supports the builtin natively, the local fallback is removed. When the model doesn't support the builtin, it is removed and the local tool stays.

Can be used directly:

```python
from pydantic_ai.capabilities import BuiltinOrLocalTool

cap = BuiltinOrLocalTool(builtin=WebSearchTool(), local=my_search_func)
```

Or subclassed to set defaults by overriding `_default_builtin`, `_default_local`, and `_requires_builtin`. The built-in [`WebSearch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebSearch), [`WebFetch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebFetch), and [`ImageGeneration`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ImageGeneration) capabilities are all subclasses.

#### Attributes

##### builtin

Configure the provider builtin tool.

-   `True` (default): use the default builtin tool configuration (subclasses only).
-   `False`: disable the builtin; always use the local tool.
-   An `AbstractBuiltinTool` instance: use this specific configuration.
-   A callable (`BuiltinToolFunc`): dynamically create the builtin per-run via `RunContext`.

**Type:** [`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\] | [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `True`

##### local

Configure the local fallback tool.

-   `None` (default): auto-detect a local fallback via `_default_local`.
-   `False`: disable the local fallback; only use the builtin.
-   A `Tool` or `AbstractToolset` instance: use this specific local tool.
-   A bare callable: automatically wrapped in a `Tool`.

**Type:** [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[`AgentDepsT`\] | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[..., [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\] | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\[[`False`](https://docs.python.org/3/library/constants.html#False)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

### BuiltinTool

**Bases:** `AbstractCapability[AgentDepsT]`

A capability that registers a builtin tool with the agent.

Wraps a single [`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool) -- either a static [`AbstractBuiltinTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.AbstractBuiltinTool) instance or a callable that dynamically produces one.

When `builtin_tools` is passed to [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__), each item is automatically wrapped in a `BuiltinTool` capability.

#### Methods

##### from\_spec

`@classmethod`

```python
def from_spec(
    cls,
    tool: AbstractBuiltinTool | None = None,
    kwargs: Any = {},
) -> BuiltinTool[Any]
```

Create from spec.

Supports two YAML forms:

-   Flat: `{BuiltinTool: {kind: web_search, search_context_size: high}}`
-   Explicit: `{BuiltinTool: {tool: {kind: web_search}}}`

###### Returns

`BuiltinTool`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

### DynamicCapability

**Bases:** `AbstractCapability[AgentDepsT]`

A capability that builds another capability dynamically using a function that takes the run context.

The factory is called once per agent run from [`for_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.for_run). The returned capability replaces this wrapper for the rest of the run, so its instructions, model settings, toolset, builtin tools, and hooks all flow through normally.

Pass a [`CapabilityFunc`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.CapabilityFunc) directly to `Agent(capabilities=[...])` or `agent.run(capabilities=[...])` and it will be wrapped in a `DynamicCapability` automatically.

#### Attributes

##### capability\_func

The function that takes the run context and returns a capability or `None`.

**Type:** [`CapabilityFunc`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.CapabilityFunc)\[`AgentDepsT`\]

### ImageGeneration

**Bases:** `BuiltinOrLocalTool[AgentDepsT]`

Image generation capability.

Uses the model's builtin image generation when available. When the model doesn't support it and `fallback_model` is provided, falls back to a local tool that delegates to a subagent running the specified image-capable model.

Image generation settings (`quality`, `size`, etc.) are forwarded to the [`ImageGenerationTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.ImageGenerationTool) used by both the builtin and the local fallback subagent. When passing a custom `builtin` instance, its settings are also used for the fallback subagent; capability-level fields override any `builtin` instance settings.

#### Attributes

##### fallback\_model

Model to use for image generation when the agent's model doesn't support it natively.

Must be a model that supports image generation via the [`ImageGenerationTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.ImageGenerationTool) builtin. This requires a conversational model with image generation support, not a dedicated image-only API. Examples:

-   `'openai-responses:gpt-5.4'` -- OpenAI model with image generation support
-   `'google-gla:gemini-3-pro-image-preview'` -- Google image generation model

Can be a model name string, `Model` instance, or a callable taking `RunContext` that returns a `Model` instance.

**Type:** `ImageGenerationFallbackModel` **Default:** `fallback_model`

##### action

Whether to generate a new image or edit an existing image.

Supported by: OpenAI Responses. Default: `'auto'`.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['generate', 'edit', 'auto'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `action`

##### background

Background type for the generated image.

Supported by: OpenAI Responses. `'transparent'` only supported for `'png'` and `'webp'`.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['transparent', 'opaque', 'auto'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `background`

##### input\_fidelity

Input fidelity for matching style/features of input images.

Supported by: OpenAI Responses. Default: `'low'`.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['high', 'low'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `input_fidelity`

##### moderation

Moderation level for the generated image.

Supported by: OpenAI Responses.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['auto', 'low'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `moderation`

##### image\_model

The image generation model to use.

Supported by: OpenAI Responses.

**Type:** `ImageGenerationModelName` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `image_model`

##### output\_compression

Compression level for the output image.

Supported by: OpenAI Responses (jpeg/webp, default: 100), Google Vertex AI (jpeg, default: 75).

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `output_compression`

##### output\_format

Output format of the generated image.

Supported by: OpenAI Responses (default: `'png'`), Google Vertex AI.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['png', 'webp', 'jpeg'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `output_format`

##### quality

Quality of the generated image.

Supported by: OpenAI Responses.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['low', 'medium', 'high', 'auto'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `quality`

##### size

Size of the generated image.

Supported by: OpenAI Responses (`'auto'`, `'1024x1024'`, `'1024x1536'`, `'1536x1024'`), Google (`'512'`, `'1K'`, `'2K'`, `'4K'`).

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['auto', '1024x1024', '1024x1536', '1536x1024', '512', '1K', '2K', '4K'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `size`

##### aspect\_ratio

Aspect ratio for generated images.

Supported by: Google (Gemini), OpenAI Responses (maps `'1:1'`, `'2:3'`, `'3:2'` to sizes).

**Type:** `ImageAspectRatio` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `aspect_ratio`

### ProcessEventStream

**Bases:** `AbstractCapability[AgentDepsT]`

A capability that forwards the agent's event stream to a user-provided async handler.

The handler receives the stream of [`AgentStreamEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.AgentStreamEvent)s emitted during model streaming and tool execution for each `ModelRequestNode` and `CallToolsNode`. Two forms are supported:

-   An [`EventStreamHandler`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.EventStreamHandler) -- an `async def` returning `None`. Events are forwarded to the handler while also being passed through unchanged to the rest of the capability chain, so multiple handlers (and the top-level `event_stream_handler` argument) can all see the same stream without changing each other's view. A handler that returns early stops receiving events but does not affect downstream consumers; a handler that raises propagates the exception to the rest of the run. Events are delivered synchronously, so a slow handler back-pressures the rest of the stream.
-   An `EventStreamProcessor` -- an async generator yielding [`AgentStreamEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.AgentStreamEvent)s. The events it yields replace the inner stream for downstream wrappers and consumers, so it can modify, drop, or add events.

When this capability is registered, `agent.run()` automatically enables streaming so the handler fires without requiring an explicit `event_stream_handler` argument.

Durable execution

Under the current durable-execution integrations ([Temporal](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.TemporalAgent), [DBOS](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.dbos.DBOSAgent), [Prefect](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.prefect.PrefectAgent)), model streaming happens inside an activity/step rather than in the outer agent loop. This capability's `wrap_run_event_stream` hook fires for tool-call events and the final post-streaming batch, but it does **not** see individual model-response events live -- the underlying durable model consumes those inside the activity before returning. The in-flight `event_stream_handler` parameter does still observe the live events; a future refactor threading the capability chain through the activity boundary is being explored in [#4977](https://github.com/pydantic/pydantic-ai/pull/4977).

### ReinjectSystemPrompt

**Bases:** `AbstractCapability[AgentDepsT]`

Capability that reinjects the agent's configured `system_prompt` when missing from history.

Ensures the agent's configured `system_prompt` is present at the head of the first `ModelRequest` on every model request.

Intended for callers that reconstruct a `message_history` from a source that doesn't round-trip system prompts -- UI frontends, database persistence layers, conversation compaction pipelines. By default, if any `SystemPromptPart` is already present anywhere in the history (for example, preserved from a prior run or handed off from another agent), this capability leaves the messages untouched so that existing system prompts remain authoritative. Set `replace_existing=True` to instead strip any existing `SystemPromptPart`s before prepending the agent's configured prompt -- useful when the history comes from an untrusted source (such as a UI frontend) and the server's prompt must win.

The UI adapters automatically add this capability in `manage_system_prompt='server'` mode with `replace_existing=True`. Add it explicitly with `Agent(..., capabilities=[ReinjectSystemPrompt()])` or per-run via the `capabilities=` argument on [`Agent.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) to get the same behavior anywhere.

#### Attributes

##### replace\_existing

If `True`, strip any existing `SystemPromptPart`s from the history before prepending the agent's configured prompt. If `False` (the default), the capability is a no-op when any `SystemPromptPart` is already present.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

### MCP

**Bases:** `BuiltinOrLocalTool[AgentDepsT]`

MCP server capability.

Uses the model's builtin MCP server support when available, connecting directly via HTTP when it isn't.

#### Attributes

##### url

The URL of the MCP server.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `url`

##### id

Unique identifier for the MCP server. Defaults to a slug derived from the URL.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `id`

##### authorization\_token

Authorization header value for MCP server requests. Passed to both builtin and local.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `authorization_token`

##### headers

HTTP headers for MCP server requests. Passed to both builtin and local.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `headers`

##### allowed\_tools

Filter to only these tools. Applied to both builtin and local.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `allowed_tools`

##### description

Description of the MCP server. Builtin-only; ignored by local tools.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `description`

### ProcessHistory

**Bases:** `AbstractCapability[AgentDepsT]`

A capability that processes message history before model requests.

### CombinedCapability

**Bases:** `AbstractCapability[AgentDepsT]`

A capability that combines multiple capabilities.

### WrapperCapability

**Bases:** `AbstractCapability[AgentDepsT]`

A capability that wraps another capability and delegates all methods.

Analogous to [`WrapperToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.WrapperToolset) for toolsets. Subclass and override specific methods to modify behavior while delegating the rest.

### PrepareOutputTools

**Bases:** `AbstractCapability[AgentDepsT]`

Capability that filters or modifies output tool definitions using a callable.

Mirrors [`PrepareTools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.PrepareTools) for [output tools](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.ToolOutput). `ctx.retry`/`ctx.max_retries` reflect the **output** retry budget (`max_output_retries`), matching the output hook lifecycle.

```python
from pydantic_ai import Agent, RunContext
from pydantic_ai.capabilities import PrepareOutputTools
from pydantic_ai.output import ToolOutput
from pydantic_ai.tools import ToolDefinition


async def only_after_first_step(
    ctx: RunContext[None], tool_defs: list[ToolDefinition]
) -> list[ToolDefinition] | None:
    return tool_defs if ctx.run_step > 0 else []


agent = Agent(
    'openai:gpt-5',
    output_type=ToolOutput(str),
    capabilities=[PrepareOutputTools(only_after_first_step)],
)
```

### HistoryProcessor

**Bases:** `ProcessHistory[AgentDepsT]`

Deprecated alias for [`ProcessHistory`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ProcessHistory).

### HookTimeoutError

**Bases:** [`TimeoutError`](https://docs.python.org/3/library/exceptions.html#TimeoutError)

Raised when a hook function exceeds its configured timeout.

### CapabilityOrdering

Ordering constraints for a capability within a combined capability chain.

Capabilities follow middleware semantics: the first capability in the list is the **outermost** layer, wrapping all others. Declare ordering constraints via [`get_ordering`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_ordering) to control a capability's position in the chain regardless of how the user lists them.

When a [`CombinedCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.CombinedCapability) is constructed, it topologically sorts its children to satisfy these constraints, preserving user-provided order as a tiebreaker.

#### Attributes

##### position

Fixed position in the chain, or `None` for user-provided order.

**Type:** `CapabilityPosition` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### wraps

This capability wraps around (is outside of) these capabilities in the middleware chain.

Each entry can be a capability **type** (matches all instances of that type via `issubclass`) or a specific capability **instance** (matches by identity via `is`).

Note: instance refs use identity (`is`) matching, so if a capability's [`for_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.for_run) returns a new instance, refs to the original will no longer match. Use type refs when the target capability uses per-run state isolation.

**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`CapabilityRef`\] **Default:** `()`

##### wrapped\_by

This capability is wrapped by (is inside of) these capabilities in the middleware chain.

Each entry can be a capability **type** (matches all instances of that type via `issubclass`) or a specific capability **instance** (matches by identity via `is`).

Note: instance refs use identity (`is`) matching, so if a capability's [`for_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.for_run) returns a new instance, refs to the original will no longer match. Use type refs when the target capability uses per-run state isolation.

**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`CapabilityRef`\] **Default:** `()`

##### requires

These types must be present in the chain (no ordering implied).

**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractCapability`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]\] **Default:** `()`

### AbstractCapability

**Bases:** `ABC`, `Generic[AgentDepsT]`

Abstract base class for agent capabilities.

A capability is a reusable, composable unit of agent behavior that can provide instructions, model settings, tools, and request/response hooks.

Lifecycle: capabilities are passed to an [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) at construction time, where most `get_*` methods are called to collect static configuration (instructions, model settings, toolsets, builtin tools). The exception is [`get_wrapper_toolset`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_wrapper_toolset), which is called per-run during toolset assembly. Then, on each model request during a run, the [`before_model_request`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.before_model_request) and [`after_model_request`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_model_request) hooks are called to allow dynamic adjustments.

See the [capabilities documentation](/docs/ai/api/pydantic-ai/capabilities) for built-in capabilities.

[`get_serialization_name`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_serialization_name) and [`from_spec`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.from_spec) support YAML/JSON specs (via [`Agent.from_spec`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.from_spec)); they have sensible defaults and typically don't need to be overridden.

#### Attributes

##### has\_wrap\_node\_run

Whether this capability (or any sub-capability) overrides wrap\_node\_run.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### has\_wrap\_run\_event\_stream

Whether this capability (or any sub-capability) overrides wrap\_run\_event\_stream.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

#### Methods

##### apply

```python
def apply(visitor: Callable[[AbstractCapability[AgentDepsT]], None]) -> None
```

Run a visitor function on all leaf capabilities in this tree.

For a single capability, calls the visitor on itself. Overridden by [`CombinedCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.CombinedCapability) to recursively visit all child capabilities, and by [`WrapperCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WrapperCapability) to delegate to the wrapped capability.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

##### get\_serialization\_name

`@classmethod`

```python
def get_serialization_name(cls) -> str | None
```

Return the name used for spec serialization (CamelCase class name by default).

Return None to opt out of spec-based construction.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### from\_spec

`@classmethod`

```python
def from_spec(cls, args: Any = (), kwargs: Any = {}) -> AbstractCapability[Any]
```

Create from spec arguments. Default: `cls(*args, **kwargs)`.

Override when `__init__` takes non-serializable types.

###### Returns

`AbstractCapability`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

##### get\_ordering

```python
def get_ordering() -> CapabilityOrdering | None
```

Return ordering constraints for this capability, or `None` for default behavior.

Override to declare a fixed position (`'outermost'` / `'innermost'`), relative ordering (`wraps` / `wrapped_by` other capability types or instances), or dependency requirements (`requires`).

[`CombinedCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.CombinedCapability) uses these to topologically sort its children at construction time.

###### Returns

`CapabilityOrdering` | [`None`](https://docs.python.org/3/library/constants.html#None)

##### for\_run

`@async`

```python
def for_run(ctx: RunContext[AgentDepsT]) -> AbstractCapability[AgentDepsT]
```

Return the capability instance to use for this agent run.

Called once per run, before `get_*()` re-extraction and before any hooks fire. Override to return a fresh instance for per-run state isolation. Default: return `self` (shared across runs).

###### Returns

`AbstractCapability`\[`AgentDepsT`\]

##### get\_instructions

```python
def get_instructions() -> AgentInstructions[AgentDepsT] | None
```

Return instructions to include in the system prompt, or None.

This method is called once at agent construction time. To get dynamic per-request behavior, return a callable that receives [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) or a `TemplateStr` -- not a dynamic string.

###### Returns

`AgentInstructions`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### get\_model\_settings

```python
def get_model_settings() -> AgentModelSettings[AgentDepsT] | None
```

Return model settings to merge into the agent's defaults, or None.

This method is called once at agent construction time. Return a static `ModelSettings` dict when the settings don't change between requests. Return a callable that receives [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) when settings need to vary per step (e.g. based on `ctx.run_step` or `ctx.deps`).

When the callable is invoked, `ctx.model_settings` contains the merged result of all layers resolved before this capability (model defaults and agent-level settings). The returned dict is merged on top of that.

###### Returns

`AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### get\_toolset

```python
def get_toolset() -> AgentToolset[AgentDepsT] | None
```

Return a toolset to register with the agent, or None.

###### Returns

`AgentToolset`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### get\_builtin\_tools

```python
def get_builtin_tools() -> Sequence[AgentBuiltinTool[AgentDepsT]]
```

Return builtin tools to register with the agent.

###### Returns

[`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\]

##### get\_wrapper\_toolset

```python
def get_wrapper_toolset(
    toolset: AbstractToolset[AgentDepsT],
) -> AbstractToolset[AgentDepsT] | None
```

Wrap the agent's assembled toolset, or return None to leave it unchanged.

Called per-run with the combined non-output toolset (after the [`prepare_tools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.prepare_tools) hook has already wrapped it). Output tools are added separately and are not included.

Unlike the other `get_*` methods which are called once at agent construction, this is called each run (after [`for_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.for_run)). When multiple capabilities provide wrappers, they follow middleware semantics: the first capability in the list wraps outermost (matching `wrap_*` hooks).

Use this to apply cross-cutting toolset wrappers like [`PreparedToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.PreparedToolset), [`FilteredToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FilteredToolset), or custom [`WrapperToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.WrapperToolset) subclasses.

###### Returns

[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### prepare\_tools

`@async`

```python
def prepare_tools(
    ctx: RunContext[AgentDepsT],
    tool_defs: list[ToolDefinition],
) -> list[ToolDefinition]
```

Filter or modify function tool definitions for this step.

Receives **function** tools only. For [output tools](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.ToolOutput), override [`prepare_output_tools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.prepare_output_tools) -- it runs separately, with `ctx.retry`/`ctx.max_retries` reflecting the **output** retry budget instead of the function-tool budget.

Return a filtered or modified list. The result flows into both the model's request parameters and `ToolManager.tools`, so filtering also blocks tool execution.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition)\]

##### prepare\_output\_tools

`@async`

```python
def prepare_output_tools(
    ctx: RunContext[AgentDepsT],
    tool_defs: list[ToolDefinition],
) -> list[ToolDefinition]
```

Filter or modify output tool definitions for this step.

Receives only [output tools](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.ToolOutput). `ctx.retry` and `ctx.max_retries` reflect the **output** retry budget (agent-level `max_output_retries`), matching the output hook lifecycle.

Return a filtered or modified list. The result flows into both the model's request parameters and `ToolManager.tools`, so filtering also blocks tool execution.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition)\]

##### before\_run

`@async`

```python
def before_run(ctx: RunContext[AgentDepsT]) -> None
```

Called before the agent run starts. Observe-only; use wrap\_run for modification.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

##### after\_run

`@async`

```python
def after_run(
    ctx: RunContext[AgentDepsT],
    result: AgentRunResult[Any],
) -> AgentRunResult[Any]
```

Called after the agent run completes. Can modify the result.

###### Returns

[`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

##### wrap\_run

`@async`

```python
def wrap_run(
    ctx: RunContext[AgentDepsT],
    handler: WrapRunHandler,
) -> AgentRunResult[Any]
```

Wraps the entire agent run. `handler()` executes the run.

If `handler()` raises and this method catches the exception and returns a result instead, the error is suppressed and the recovery result is used.

If this method does not call `handler()` (short-circuit), the run is skipped and the returned result is used directly.

Note: if the caller cancels the run (e.g. by breaking out of an `iter()` loop), this method receives an `asyncio.CancelledError`. Implementations that hold resources should handle cleanup accordingly.

###### Returns

[`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

##### on\_run\_error

`@async`

```python
def on_run_error(
    ctx: RunContext[AgentDepsT],
    error: BaseException,
) -> AgentRunResult[Any]
```

Called when the agent run fails with an exception.

This is the error counterpart to [`after_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_run): while `after_run` is called on success, `on_run_error` is called on failure (after [`wrap_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_run) has had its chance to recover).

**Raise** the original `error` (or a different exception) to propagate it. **Return** an [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) to suppress the error and recover the run.

Not called for `GeneratorExit` or `KeyboardInterrupt`.

###### Returns

[`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

##### before\_node\_run

`@async`

```python
def before_node_run(
    ctx: RunContext[AgentDepsT],
    node: AgentNode[AgentDepsT],
) -> AgentNode[AgentDepsT]
```

Called before each graph node executes. Can observe or replace the node.

###### Returns

`AgentNode`\[`AgentDepsT`\]

##### after\_node\_run

`@async`

```python
def after_node_run(
    ctx: RunContext[AgentDepsT],
    node: AgentNode[AgentDepsT],
    result: NodeResult[AgentDepsT],
) -> NodeResult[AgentDepsT]
```

Called after each graph node succeeds. Can modify the result (next node or `End`).

###### Returns

`NodeResult`\[`AgentDepsT`\]

##### wrap\_node\_run

`@async`

```python
def wrap_node_run(
    ctx: RunContext[AgentDepsT],
    node: AgentNode[AgentDepsT],
    handler: WrapNodeRunHandler[AgentDepsT],
) -> NodeResult[AgentDepsT]
```

Wraps execution of each agent graph node (run step).

Called for every node in the agent graph (`UserPromptNode`, `ModelRequestNode`, `CallToolsNode`). `handler(node)` executes the node and returns the next node (or `End`).

Override to inspect or modify nodes before execution, inspect or modify the returned next node, call `handler` multiple times (retry), or return a different node to redirect graph progression.

Note: this hook fires when using `agent.run()`, `agent.run_stream()`, and when manually driving an [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter) run with `next()`, but it does **not** fire when iterating over the run with bare `async for` (which yields stream events, not node results).

When using `agent.run()` with `event_stream_handler`, the handler wraps both streaming and graph advancement (i.e. the model call happens inside the wrapper). When using `agent.run_stream()`, the handler wraps only graph advancement -- streaming happens before the wrapper because `run_stream()` must yield the stream to the caller while the stream context is still open, which cannot happen from inside a callback.

###### Returns

`NodeResult`\[`AgentDepsT`\]

##### on\_node\_run\_error

`@async`

```python
def on_node_run_error(
    ctx: RunContext[AgentDepsT],
    node: AgentNode[AgentDepsT],
    error: Exception,
) -> NodeResult[AgentDepsT]
```

Called when a graph node fails with an exception.

This is the error counterpart to [`after_node_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_node_run).

**Raise** the original `error` (or a different exception) to propagate it. **Return** a next node or `End` to recover and continue the graph.

Useful for recovering from [`UnexpectedModelBehavior`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UnexpectedModelBehavior) by redirecting to a different node (e.g. retry with different model settings).

###### Returns

`NodeResult`\[`AgentDepsT`\]

##### wrap\_run\_event\_stream

`@async`

```python
def wrap_run_event_stream(
    ctx: RunContext[AgentDepsT],
    stream: AsyncIterable[AgentStreamEvent],
) -> AsyncIterable[AgentStreamEvent]
```

Wraps the event stream for a streamed node. Can observe or transform events.

Note: when this method is overridden (or [`Hooks.on.event`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Hooks.on) / [`Hooks.on.run_event_stream`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Hooks.on) are registered), `agent.run()` automatically enables streaming mode so this hook fires even without an explicit `event_stream_handler`.

###### Returns

[`AsyncIterable`](https://docs.python.org/3/library/typing.html#typing.AsyncIterable)\[[`AgentStreamEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.AgentStreamEvent)\]

##### before\_model\_request

`@async`

```python
def before_model_request(
    ctx: RunContext[AgentDepsT],
    request_context: ModelRequestContext,
) -> ModelRequestContext
```

Called before each model request. Can modify messages, settings, and parameters.

###### Returns

`ModelRequestContext`

##### after\_model\_request

`@async`

```python
def after_model_request(
    ctx: RunContext[AgentDepsT],
    request_context: ModelRequestContext,
    response: ModelResponse,
) -> ModelResponse
```

Called after each model response. Can modify the response before further processing.

Raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to reject the response and ask the model to try again. The original response is still appended to message history so the model can see what it said. Retries count against `output_retries`.

###### Returns

[`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse)

##### wrap\_model\_request

`@async`

```python
def wrap_model_request(
    ctx: RunContext[AgentDepsT],
    request_context: ModelRequestContext,
    handler: WrapModelRequestHandler,
) -> ModelResponse
```

Wraps the model request. handler() calls the model.

Raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to skip `on_model_request_error` and directly retry the model request with a retry prompt. If the handler was called, the model response is preserved in history for context (same as `after_model_request`).

###### Returns

[`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse)

##### on\_model\_request\_error

`@async`

```python
def on_model_request_error(
    ctx: RunContext[AgentDepsT],
    request_context: ModelRequestContext,
    error: Exception,
) -> ModelResponse
```

Called when a model request fails with an exception.

This is the error counterpart to [`after_model_request`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_model_request).

**Raise** the original `error` (or a different exception) to propagate it. **Return** a [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) to suppress the error and use the response as if the model call succeeded. **Raise** [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to retry the model request with a retry prompt instead of recovering or propagating.

Not called for [`SkipModelRequest`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.SkipModelRequest) or [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry).

###### Returns

[`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse)

##### before\_tool\_validate

`@async`

```python
def before_tool_validate(
    ctx: RunContext[AgentDepsT],
    call: ToolCallPart,
    tool_def: ToolDefinition,
    args: RawToolArgs,
) -> RawToolArgs
```

Modify raw args before validation.

Raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to skip validation and ask the model to redo the tool call.

###### Returns

`RawToolArgs`

##### after\_tool\_validate

`@async`

```python
def after_tool_validate(
    ctx: RunContext[AgentDepsT],
    call: ToolCallPart,
    tool_def: ToolDefinition,
    args: ValidatedToolArgs,
) -> ValidatedToolArgs
```

Modify validated args. Called only on successful validation.

Raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to reject the validated args and ask the model to redo the tool call.

###### Returns

`ValidatedToolArgs`

##### wrap\_tool\_validate

`@async`

```python
def wrap_tool_validate(
    ctx: RunContext[AgentDepsT],
    call: ToolCallPart,
    tool_def: ToolDefinition,
    args: RawToolArgs,
    handler: WrapToolValidateHandler,
) -> ValidatedToolArgs
```

Wraps tool argument validation. handler() runs the validation.

###### Returns

`ValidatedToolArgs`

##### on\_tool\_validate\_error

`@async`

```python
def on_tool_validate_error(
    ctx: RunContext[AgentDepsT],
    call: ToolCallPart,
    tool_def: ToolDefinition,
    args: RawToolArgs,
    error: ValidationError | ModelRetry,
) -> ValidatedToolArgs
```

Called when tool argument validation fails.

This is the error counterpart to [`after_tool_validate`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_tool_validate). Fires for `ValidationError` (schema mismatch) and [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) (custom validator rejection).

**Raise** the original `error` (or a different exception) to propagate it. **Return** validated args to suppress the error and continue as if validation passed.

Not called for [`SkipToolValidation`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.SkipToolValidation).

###### Returns

`ValidatedToolArgs`

##### before\_tool\_execute

`@async`

```python
def before_tool_execute(
    ctx: RunContext[AgentDepsT],
    call: ToolCallPart,
    tool_def: ToolDefinition,
    args: ValidatedToolArgs,
) -> ValidatedToolArgs
```

Modify validated args before execution.

Raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to skip execution and ask the model to redo the tool call.

###### Returns

`ValidatedToolArgs`

##### after\_tool\_execute

`@async`

```python
def after_tool_execute(
    ctx: RunContext[AgentDepsT],
    call: ToolCallPart,
    tool_def: ToolDefinition,
    args: ValidatedToolArgs,
    result: Any,
) -> Any
```

Modify result after execution.

Raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to reject the tool result and ask the model to redo the tool call.

###### Returns

[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

##### wrap\_tool\_execute

`@async`

```python
def wrap_tool_execute(
    ctx: RunContext[AgentDepsT],
    call: ToolCallPart,
    tool_def: ToolDefinition,
    args: ValidatedToolArgs,
    handler: WrapToolExecuteHandler,
) -> Any
```

Wraps tool execution. handler() runs the tool.

###### Returns

[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

##### on\_tool\_execute\_error

`@async`

```python
def on_tool_execute_error(
    ctx: RunContext[AgentDepsT],
    call: ToolCallPart,
    tool_def: ToolDefinition,
    args: ValidatedToolArgs,
    error: Exception,
) -> Any
```

Called when tool execution fails with an exception.

This is the error counterpart to [`after_tool_execute`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_tool_execute).

**Raise** the original `error` (or a different exception) to propagate it. **Return** any value to suppress the error and use it as the tool result. **Raise** [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to ask the model to redo the tool call instead of recovering or propagating.

Not called for control flow exceptions ([`SkipToolExecution`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.SkipToolExecution), [`CallDeferred`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.CallDeferred), [`ApprovalRequired`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ApprovalRequired)) or retry signals ([`ToolRetryError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ToolRetryError) from [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry)). Use [`wrap_tool_execute`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_tool_execute) to intercept retries.

###### Returns

[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

##### before\_output\_validate

`@async`

```python
def before_output_validate(
    ctx: RunContext[AgentDepsT],
    output_context: OutputContext,
    output: RawOutput,
) -> RawOutput
```

Modify raw model output before validation/parsing.

The primary hook for pre-parse repair and normalization of model output. Fires only for structured output that requires parsing: prompted, native, tool, and union output. Does **not** fire for plain text or image output.

For structured text output, `output` is the raw text string from the model. For tool output, `output` is the raw tool arguments (string or dict).

Raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to skip validation and ask the model to try again with a custom message.

During streaming, this hook fires on every partial validation attempt as well as the final result. Check `ctx.partial_output` to distinguish and avoid expensive work on partial results.

###### Returns

`RawOutput`

##### after\_output\_validate

`@async`

```python
def after_output_validate(
    ctx: RunContext[AgentDepsT],
    output_context: OutputContext,
    output: Any,
) -> Any
```

Modify validated output after successful parsing. Called only on success.

`output` is the **semantic value** the model was asked to produce -- e.g., a `MyModel` instance for `output_type=MyModel`, or `42` for `output_type=int`, or the input to a single-arg output function. For multi-arg output functions, this is the `dict` of arguments (the genuine multi-value input).

Note: this differs from _tool_ hooks (`after_tool_validate`), which always see `dict[str, Any]` -- tool args follow the schema contract. Output hooks see the semantic output value, regardless of how it's internally represented during validation.

Raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to reject the validated output and ask the model to try again.

###### Returns

[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

##### wrap\_output\_validate

`@async`

```python
def wrap_output_validate(
    ctx: RunContext[AgentDepsT],
    output_context: OutputContext,
    output: RawOutput,
    handler: WrapOutputValidateHandler,
) -> Any
```

Wraps output validation. handler(output) performs the validation.

[`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) from within the handler goes to [`on_output_validate_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_output_validate_error). `ModelRetry` raised directly (not from the handler) bypasses the error hook.

###### Returns

[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

##### on\_output\_validate\_error

`@async`

```python
def on_output_validate_error(
    ctx: RunContext[AgentDepsT],
    output_context: OutputContext,
    output: RawOutput,
    error: ValidationError | ModelRetry,
) -> Any
```

Called when output validation fails.

This is the error counterpart to [`after_output_validate`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_output_validate).

**Raise** the original `error` (or a different exception) to propagate it. **Return** validated output to suppress the error and continue.

###### Returns

[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

##### before\_output\_process

`@async`

```python
def before_output_process(
    ctx: RunContext[AgentDepsT],
    output_context: OutputContext,
    output: Any,
) -> Any
```

Modify validated output before processing (extraction, output function call).

`output` is the **semantic value** -- e.g., a `MyModel` instance or `42`, matching `after_output_validate`. For multi-arg output functions, it's the `dict` of args. See [`after_output_validate`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_output_validate) for a full explanation of the semantic-value contract.

Raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to skip processing and ask the model to try again.

###### Returns

[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

##### after\_output\_process

`@async`

```python
def after_output_process(
    ctx: RunContext[AgentDepsT],
    output_context: OutputContext,
    output: Any,
) -> Any
```

Modify result after output processing.

Raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to reject the result and ask the model to try again.

###### Returns

[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

##### wrap\_output\_process

`@async`

```python
def wrap_output_process(
    ctx: RunContext[AgentDepsT],
    output_context: OutputContext,
    output: Any,
    handler: WrapOutputProcessHandler,
) -> Any
```

Wraps output processing. handler(output) runs extraction + output function call.

[`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) bypasses [`on_output_process_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_output_process_error) (treated as control flow, not an error).

During streaming, this fires only when partial validation succeeds, and on the final result. Check `ctx.partial_output` to skip expensive work on partial results.

###### Returns

[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

##### on\_output\_process\_error

`@async`

```python
def on_output_process_error(
    ctx: RunContext[AgentDepsT],
    output_context: OutputContext,
    output: Any,
    error: Exception,
) -> Any
```

Called when output processing fails with an exception.

This is the error counterpart to [`after_output_process`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_output_process).

**Raise** the original `error` (or a different exception) to propagate it. **Return** any value to suppress the error and use it as the output.

Not called for retry signals ([`ToolRetryError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ToolRetryError) from [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry)).

###### Returns

[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

##### handle\_deferred\_tool\_calls

`@async`

```python
def handle_deferred_tool_calls(
    ctx: RunContext[AgentDepsT],
    requests: DeferredToolRequests,
) -> DeferredToolResults | None
```

Handle deferred tool calls (approval-required or externally-executed) inline during an agent run.

Called by `ToolManager` when:

-   a tool raises [`ApprovalRequired`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ApprovalRequired) or [`CallDeferred`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.CallDeferred) during execution, or
-   the model calls a tool registered with `requires_approval=True` (see [Human-in-the-Loop Tool Approval](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval)) or a tool backed by [external execution](/docs/ai/tools-toolsets/deferred-tools#external-tool-execution).

Uses accumulation dispatch: each capability in the chain receives remaining unresolved requests and can resolve some or all of them. Results are merged and unresolved calls are passed to the next capability.

**Return** a [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) to resolve some or all calls. **Return** `None` to leave all calls unresolved.

###### Returns

[`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### prefix\_tools

```python
def prefix_tools(prefix: str) -> PrefixTools[AgentDepsT]
```

Returns a new capability that wraps this one and prefixes its tool names.

Only this capability's tools are prefixed; other agent tools are unaffected.

###### Returns

`PrefixTools`\[`AgentDepsT`\]

### OutputContext

Context about the output being processed, passed to output hooks.

#### Attributes

##### mode

The schema's output mode ('text', 'native', 'prompted', 'tool', 'image', 'auto').

This reflects the configured schema, not the format of this particular response. For example, a `ToolOutputSchema` with a `text_processor` (hybrid mode) reports `'tool'` even if the model returned text -- check [`tool_call`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.OutputContext.tool_call) to distinguish.

**Type:** `OutputMode`

##### output\_type

The resolved output type (e.g. MyModel, str). For output functions, the function's input type (what the model produces).

**Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### object\_def

The output object definition (schema, name, description), if structured output.

**Type:** `OutputObjectDefinition` | [`None`](https://docs.python.org/3/library/constants.html#None)

##### has\_function

Whether there's an output function to call in the execute step.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### tool\_call

The tool call part, for tool-based output. `None` when the current output did not arrive via a tool call (text or image).

**Type:** [`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### tool\_def

The tool definition, for tool-based output. `None` when the current output did not arrive via a tool call.

**Type:** [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### allows\_text

Whether the schema accepts text output (including via a `text_processor` on a `ToolOutputSchema`).

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### allows\_image

Whether the schema accepts image output.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### allows\_deferred\_tools

Whether the schema accepts deferred tool requests as output.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

### Hooks

**Bases:** `AbstractCapability[AgentDepsT]`

Register hook functions via decorators or constructor kwargs.

For extension developers building reusable capabilities, subclass [`AbstractCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability) directly. For application code that needs a few hooks without the ceremony of a subclass, use `Hooks`.

Example using decorators::

hooks = Hooks()

@hooks.on.before\_model\_request async def log\_request(ctx, request\_context): print(f'Request: {request\_context}') return request\_context

agent = Agent('openai:gpt-5', capabilities=\[hooks\])

Example using constructor kwargs::

agent = Agent('openai:gpt-5', capabilities=\[ Hooks(before\_model\_request=log\_request) \])

#### Attributes

##### on

Decorator namespace for registering hook functions.

**Type:** `_HookRegistration`\[`AgentDepsT`\]

### CapabilityFunc

A sync/async function which takes a run context and returns a capability.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[RunContext[AgentDepsT]], AbstractCapability[AgentDepsT] | None | Awaitable[AbstractCapability[AgentDepsT] | None]]`

### AgentNode

Type alias for an agent graph node (`UserPromptNode`, `ModelRequestNode`, `CallToolsNode`).

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `'_agent_graph.AgentNode[AgentDepsT, Any]'`

### NodeResult

Type alias for the result of executing an agent graph node: either the next node or `End`.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `'_agent_graph.AgentNode[AgentDepsT, Any] | End[FinalResult[Any]]'`

### WrapRunHandler

Handler type for [`wrap_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_run).

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `'Callable[[], Awaitable[AgentRunResult[Any]]]'`

### WrapNodeRunHandler

Handler type for [`wrap_node_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_node_run).

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `'Callable[[_agent_graph.AgentNode[AgentDepsT, Any]], Awaitable[_agent_graph.AgentNode[AgentDepsT, Any] | End[FinalResult[Any]]]]'`

### WrapModelRequestHandler

Handler type for [`wrap_model_request`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_model_request).

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `'Callable[[ModelRequestContext], Awaitable[ModelResponse]]'`

### AgentCapability

A capability or a [`CapabilityFunc`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.CapabilityFunc) that takes a run context and returns one.

Use as the item type for `Agent(capabilities=[...])` and `agent.run(capabilities=[...])`. Functions are wrapped in a [`DynamicCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.DynamicCapability) automatically.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `AbstractCapability[AgentDepsT] | CapabilityFunc[AgentDepsT]`

### RawToolArgs

Type alias for raw (pre-validation) tool arguments.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `str | dict[str, Any]`

### ValidatedToolArgs

Type alias for validated tool arguments.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `dict[str, Any]`

### WrapToolValidateHandler

Handler type for [`wrap_tool_validate`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_tool_validate).

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[RawToolArgs], Awaitable[ValidatedToolArgs]]`

### CAPABILITY\_TYPES

Registry of all capability types that have a serialization name, mapping name to class.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractCapability`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]\] **Default:** `{name: cls for cls in (BuiltinTool, ImageGeneration, IncludeToolReturnSchemas, MCP, PrefixTools, PrepareTools, ProcessHistory, ReinjectSystemPrompt, SetToolMetadata, Thinking, Toolset, WebFetch, WebSearch) if (name := (cls.get_serialization_name())) is not None}`

### WrapToolExecuteHandler

Handler type for [`wrap_tool_execute`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_tool_execute).

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[ValidatedToolArgs], Awaitable[Any]]`

### RawOutput

Type alias for raw output data (text or tool args).

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `str | dict[str, Any]`

### WrapOutputValidateHandler

Handler type for wrap\_output\_validate.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[RawOutput], Awaitable[Any]]`

### WrapOutputProcessHandler

Handler type for wrap\_output\_process.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[Any], Awaitable[Any]]`

### CapabilityPosition

Position tier for a capability in the middleware chain.

-   `'outermost'`: in the outermost tier, before all non-outermost capabilities. Multiple capabilities can declare `'outermost'`; original list order breaks ties within the tier, and `wraps`/`wrapped_by` edges refine order further.
-   `'innermost'`: in the innermost tier, after all non-innermost capabilities. Same tie-breaking rules apply.

**Default:** `Literal['outermost', 'innermost']`

### CapabilityRef

Reference to a capability -- either a type (matches all instances of that type) or a specific instance (matches by identity).

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `'type[AbstractCapability[Any]] | AbstractCapability[Any]'`

---

# [pydantic_ai.common_tools](https://pydantic.dev/docs/ai/api/pydantic-ai/common_tools/)

# pydantic\_ai.common\_tools

### DuckDuckGoResult

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

A DuckDuckGo search result.

#### Attributes

##### title

The title of the search result.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### href

The URL of the search result.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### body

The body of the search result.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

### DuckDuckGoSearchTool

The DuckDuckGo search tool.

#### Attributes

##### client

The DuckDuckGo search client.

**Type:** `DDGS`

##### max\_results

The maximum number of results. If None, returns results only from the first response.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None)

#### Methods

##### \_\_call\_\_

`@async`

```python
def __call__(query: str) -> list[DuckDuckGoResult]
```

Searches DuckDuckGo for the given query and returns the results.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`DuckDuckGoResult`\] -- The search results.

###### Parameters

**`query`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The query to search for.

### duckduckgo\_search\_tool

```python
def duckduckgo_search_tool(
    duckduckgo_client: DDGS | None = None,
    max_results: int | None = None,
)
```

Creates a DuckDuckGo search tool.

#### Parameters

**`duckduckgo_client`** : `DDGS` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The DuckDuckGo search client.

**`max_results`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The maximum number of results. If None, returns results only from the first response.

Exa tools for Pydantic AI agents.

Provides web search, content retrieval, and AI-powered answer capabilities using the Exa API, a neural search engine that finds high-quality, relevant results across billions of web pages.

### ExaSearchResult

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

An Exa search result with content.

See [Exa Search API documentation](https://docs.exa.ai/reference/search) for more information.

#### Attributes

##### title

The title of the search result.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### url

The URL of the search result.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### published\_date

The published date of the content, if available.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### author

The author of the content, if available.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### text

The text content of the search result.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

### ExaAnswerResult

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

An Exa answer result with citations.

See [Exa Answer API documentation](https://docs.exa.ai/reference/answer) for more information.

#### Attributes

##### answer

The AI-generated answer to the query.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### citations

Citations supporting the answer.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]

### ExaContentResult

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Content retrieved from a URL.

See [Exa Contents API documentation](https://docs.exa.ai/reference/get-contents) for more information.

#### Attributes

##### url

The URL of the content.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### title

The title of the page.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### text

The text content of the page.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### author

The author of the content, if available.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### published\_date

The published date of the content, if available.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

### ExaSearchTool

The Exa search tool.

#### Attributes

##### client

The Exa async client.

**Type:** `AsyncExa`

##### num\_results

The number of results to return.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

##### max\_characters

Maximum characters of text content per result, or None for no limit.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None)

#### Methods

##### \_\_call\_\_

`@async`

```python
def __call__(
    query: str,
    search_type: Literal['auto', 'keyword', 'neural', 'fast', 'deep'] = 'auto',
) -> list[ExaSearchResult]
```

Searches Exa for the given query and returns the results with content.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`ExaSearchResult`\] -- The search results with text content.

###### Parameters

**`query`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The search query to execute with Exa.

**`search_type`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['auto', 'keyword', 'neural', 'fast', 'deep'\] _Default:_ `'auto'`

The type of search to perform. 'auto' automatically chooses the best search type, 'keyword' for exact matches, 'neural' for semantic search, 'fast' for speed-optimized search, 'deep' for comprehensive multi-query search.

### ExaFindSimilarTool

The Exa find similar tool.

#### Attributes

##### client

The Exa async client.

**Type:** `AsyncExa`

##### num\_results

The number of results to return.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

#### Methods

##### \_\_call\_\_

`@async`

```python
def __call__(url: str, exclude_source_domain: bool = True) -> list[ExaSearchResult]
```

Finds pages similar to the given URL and returns them with content.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`ExaSearchResult`\] -- Similar pages with text content.

###### Parameters

**`url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The URL to find similar pages for.

**`exclude_source_domain`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to exclude results from the same domain as the input URL. Defaults to True.

### ExaGetContentsTool

The Exa get contents tool.

#### Attributes

##### client

The Exa async client.

**Type:** `AsyncExa`

#### Methods

##### \_\_call\_\_

`@async`

```python
def __call__(urls: list[str]) -> list[ExaContentResult]
```

Gets the content of the specified URLs.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`ExaContentResult`\] -- The content of each URL.

###### Parameters

**`urls`** : [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

A list of URLs to get content for.

### ExaAnswerTool

The Exa answer tool.

#### Attributes

##### client

The Exa async client.

**Type:** `AsyncExa`

#### Methods

##### \_\_call\_\_

`@async`

```python
def __call__(query: str) -> ExaAnswerResult
```

Generates an AI-powered answer to the query with citations.

###### Returns

`ExaAnswerResult` -- An answer with supporting citations from web sources.

###### Parameters

**`query`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The question to answer.

### ExaToolset

**Bases:** [`FunctionToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset)

A toolset that provides Exa search tools with a shared client.

This is more efficient than creating individual tools when using multiple Exa tools, as it shares a single API client across all tools.

Example:

```python
from pydantic_ai import Agent
from pydantic_ai.common_tools.exa import ExaToolset

toolset = ExaToolset(api_key='your-api-key')
agent = Agent('openai:gpt-5.2', toolsets=[toolset])
```

#### Methods

##### \_\_init\_\_

```python
def __init__(
    api_key: str,
    num_results: int = 5,
    max_characters: int | None = None,
    include_search: bool = True,
    include_find_similar: bool = True,
    include_get_contents: bool = True,
    include_answer: bool = True,
    id: str | None = None,
)
```

Creates an Exa toolset with a shared client.

###### Parameters

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The Exa API key.

You can get one by signing up at [https://dashboard.exa.ai](https://dashboard.exa.ai).

**`num_results`** : [`int`](https://docs.python.org/3/library/functions.html#int) _Default:_ `5`

The number of results to return for search and find\_similar. Defaults to 5.

**`max_characters`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Maximum characters of text content per result. Use this to limit token usage. Defaults to None (no limit).

**`include_search`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to include the search tool. Defaults to True.

**`include_find_similar`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to include the find\_similar tool. Defaults to True.

**`include_get_contents`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to include the get\_contents tool. Defaults to True.

**`include_answer`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to include the answer tool. Defaults to True.

**`id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional ID for the toolset, used for durable execution environments.

### exa\_search\_tool

```python
def exa_search_tool(
    api_key: str,
    num_results: int = 5,
    max_characters: int | None = None,
) -> Tool[Any]
def exa_search_tool(
    client: AsyncExa,
    num_results: int = 5,
    max_characters: int | None = None,
) -> Tool[Any]
```

Creates an Exa search tool.

#### Returns

[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

#### Parameters

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The Exa API key. Required if `client` is not provided.

You can get one by signing up at [https://dashboard.exa.ai](https://dashboard.exa.ai).

**`client`** : `AsyncExa` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing AsyncExa client. If provided, `api_key` is ignored. This is useful for sharing a client across multiple tools.

**`num_results`** : [`int`](https://docs.python.org/3/library/functions.html#int) _Default:_ `5`

The number of results to return. Defaults to 5.

**`max_characters`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Maximum characters of text content per result. Use this to limit token usage. Defaults to None (no limit).

### exa\_find\_similar\_tool

```python
def exa_find_similar_tool(api_key: str, num_results: int = 5) -> Tool[Any]
def exa_find_similar_tool(client: AsyncExa, num_results: int = 5) -> Tool[Any]
```

Creates an Exa find similar tool.

#### Returns

[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

#### Parameters

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The Exa API key. Required if `client` is not provided.

You can get one by signing up at [https://dashboard.exa.ai](https://dashboard.exa.ai).

**`client`** : `AsyncExa` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing AsyncExa client. If provided, `api_key` is ignored. This is useful for sharing a client across multiple tools.

**`num_results`** : [`int`](https://docs.python.org/3/library/functions.html#int) _Default:_ `5`

The number of similar results to return. Defaults to 5.

### exa\_get\_contents\_tool

```python
def exa_get_contents_tool(api_key: str) -> Tool[Any]
def exa_get_contents_tool(client: AsyncExa) -> Tool[Any]
```

Creates an Exa get contents tool.

#### Returns

[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

#### Parameters

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The Exa API key. Required if `client` is not provided.

You can get one by signing up at [https://dashboard.exa.ai](https://dashboard.exa.ai).

**`client`** : `AsyncExa` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing AsyncExa client. If provided, `api_key` is ignored. This is useful for sharing a client across multiple tools.

### exa\_answer\_tool

```python
def exa_answer_tool(api_key: str) -> Tool[Any]
def exa_answer_tool(client: AsyncExa) -> Tool[Any]
```

Creates an Exa answer tool.

#### Returns

[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

#### Parameters

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The Exa API key. Required if `client` is not provided.

You can get one by signing up at [https://dashboard.exa.ai](https://dashboard.exa.ai).

**`client`** : `AsyncExa` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing AsyncExa client. If provided, `api_key` is ignored. This is useful for sharing a client across multiple tools.

### TavilySearchResult

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

A Tavily search result.

See [Tavily Search Endpoint documentation](https://docs.tavily.com/api-reference/endpoint/search) for more information.

#### Attributes

##### title

The title of the search result.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### url

The URL of the search result..

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### content

A short description of the search result.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### score

The relevance score of the search result.

**Type:** [`float`](https://docs.python.org/3/library/functions.html#float)

### TavilySearchTool

The Tavily search tool.

#### Attributes

##### client

The Tavily search client.

**Type:** `AsyncTavilyClient`

##### max\_results

The maximum number of results. If None, the Tavily default is used.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

#### Methods

##### \_\_call\_\_

`@async`

```python
def __call__(
    query: str,
    search_depth: Literal['basic', 'advanced', 'fast', 'ultra-fast'] = 'basic',
    topic: Literal['general', 'news', 'finance'] = 'general',
    time_range: Literal['day', 'week', 'month', 'year'] | None = None,
    include_domains: list[str] | None = None,
    exclude_domains: list[str] | None = None,
) -> list[TavilySearchResult]
```

Searches Tavily for the given query and returns the results.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`TavilySearchResult`\] -- A list of search results from Tavily.

###### Parameters

**`query`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The search query to execute with Tavily.

**`search_depth`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['basic', 'advanced', 'fast', 'ultra-fast'\] _Default:_ `'basic'`

The depth of the search.

**`topic`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['general', 'news', 'finance'\] _Default:_ `'general'`

The category of the search.

**`time_range`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['day', 'week', 'month', 'year'\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The time range back from the current date to filter results.

**`include_domains`** : [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

List of domains to specifically include in the search results.

**`exclude_domains`** : [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

List of domains to specifically exclude from the search results.

### tavily\_search\_tool

```python
def tavily_search_tool(
    api_key: str,
    max_results: int | None = None,
    search_depth: Literal['basic', 'advanced', 'fast', 'ultra-fast'] = _UNSET,
    topic: Literal['general', 'news', 'finance'] = _UNSET,
    time_range: Literal['day', 'week', 'month', 'year'] | None = _UNSET,
    include_domains: list[str] | None = _UNSET,
    exclude_domains: list[str] | None = _UNSET,
) -> Tool[Any]
def tavily_search_tool(
    client: AsyncTavilyClient,
    max_results: int | None = None,
    search_depth: Literal['basic', 'advanced', 'fast', 'ultra-fast'] = _UNSET,
    topic: Literal['general', 'news', 'finance'] = _UNSET,
    time_range: Literal['day', 'week', 'month', 'year'] | None = _UNSET,
    include_domains: list[str] | None = _UNSET,
    exclude_domains: list[str] | None = _UNSET,
) -> Tool[Any]
```

Creates a Tavily search tool.

`max_results` is always developer-controlled and does not appear in the LLM tool schema. Other parameters, when provided, are fixed for all searches and hidden from the LLM's tool schema. Parameters left unset remain available for the LLM to set per-call.

#### Returns

[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

#### Parameters

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The Tavily API key. Required if `client` is not provided.

You can get one by signing up at [https://app.tavily.com/home](https://app.tavily.com/home).

**`client`** : `AsyncTavilyClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing AsyncTavilyClient. If provided, `api_key` is ignored. This is useful for sharing a client across multiple tool instances.

**`max_results`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The maximum number of results. If None, the Tavily default is used.

**`search_depth`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['basic', 'advanced', 'fast', 'ultra-fast'\] _Default:_ `_UNSET`

The depth of the search.

**`topic`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['general', 'news', 'finance'\] _Default:_ `_UNSET`

The category of the search.

**`time_range`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['day', 'week', 'month', 'year'\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `_UNSET`

The time range back from the current date to filter results.

**`include_domains`** : [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `_UNSET`

List of domains to specifically include in the search results.

**`exclude_domains`** : [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `_UNSET`

List of domains to specifically exclude from the search results.

---

# [pydantic_ai — Concurrency](https://pydantic.dev/docs/ai/api/pydantic-ai/concurrency/)

# pydantic\_ai -- Concurrency

### ConcurrencyLimitedModel

**Bases:** `WrapperModel`

A model wrapper that limits concurrent requests to the underlying model.

This wrapper applies concurrency limiting at the model level, ensuring that the number of concurrent requests to the model does not exceed the configured limit. This is useful for:

-   Respecting API rate limits
-   Managing resource usage
-   Sharing a concurrency pool across multiple models

Example usage:

```python
from pydantic_ai import Agent
from pydantic_ai.models.concurrency import ConcurrencyLimitedModel

# Limit to 5 concurrent requests
model = ConcurrencyLimitedModel('openai:gpt-4o', limiter=5)
agent = Agent(model)

# Or share a limiter across multiple models
from pydantic_ai import ConcurrencyLimiter  # noqa E402

shared_limiter = ConcurrencyLimiter(max_running=10, name='openai-pool')
model1 = ConcurrencyLimitedModel('openai:gpt-4o', limiter=shared_limiter)
model2 = ConcurrencyLimitedModel('openai:gpt-4o-mini', limiter=shared_limiter)
```

#### Methods

##### \_\_init\_\_

```python
def __init__(
    wrapped: Model | KnownModelName,
    limiter: int | ConcurrencyLimit | AbstractConcurrencyLimiter,
)
```

Initialize the ConcurrencyLimitedModel.

###### Parameters

**`wrapped`** : `Model` | `KnownModelName`

The model to wrap, either a Model instance or a known model name.

**`limiter`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`ConcurrencyLimit`](/docs/ai/api/pydantic-ai/concurrency/#pydantic_ai.ConcurrencyLimit) | [`AbstractConcurrencyLimiter`](/docs/ai/api/pydantic-ai/concurrency/#pydantic_ai.AbstractConcurrencyLimiter)

The concurrency limit configuration. Can be:

-   An `int`: Simple limit on concurrent operations (unlimited queue).
-   A `ConcurrencyLimit`: Full configuration with optional backpressure.
-   An `AbstractConcurrencyLimiter`: A pre-created limiter for sharing across models.

##### request

`@async`

```python
def request(
    messages: list[ModelMessage],
    model_settings: ModelSettings | None,
    model_request_parameters: ModelRequestParameters,
) -> ModelResponse
```

Make a request to the model with concurrency limiting.

###### Returns

[`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse)

##### count\_tokens

`@async`

```python
def count_tokens(
    messages: list[ModelMessage],
    model_settings: ModelSettings | None,
    model_request_parameters: ModelRequestParameters,
) -> RequestUsage
```

Count tokens with concurrency limiting.

###### Returns

[`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage)

##### request\_stream

`@async`

```python
def request_stream(
    messages: list[ModelMessage],
    model_settings: ModelSettings | None,
    model_request_parameters: ModelRequestParameters,
    run_context: RunContext[Any] | None = None,
) -> AsyncIterator[StreamedResponse]
```

Make a streaming request to the model with concurrency limiting.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`StreamedResponse`\]

### limit\_model\_concurrency

```python
def limit_model_concurrency(
    model: Model | KnownModelName,
    limiter: AnyConcurrencyLimit,
) -> Model
```

Wrap a model with concurrency limiting.

This is a convenience function to wrap a model with concurrency limiting. If the limiter is None, the model is returned unchanged.

Example:

```python
from pydantic_ai.models.concurrency import limit_model_concurrency

model = limit_model_concurrency('openai:gpt-4o', limiter=5)
```

#### Returns

`Model` -- The wrapped model with concurrency limiting, or the original model if limiter is None.

#### Parameters

**`model`** : `Model` | `KnownModelName`

The model to wrap.

**`limiter`** : [`AnyConcurrencyLimit`](/docs/ai/api/pydantic-ai/concurrency/#pydantic_ai.AnyConcurrencyLimit)

The concurrency limit configuration.

### AbstractConcurrencyLimiter

**Bases:** `ABC`

Abstract base class for concurrency limiters.

Subclass this to create custom concurrency limiters (e.g., Redis-backed distributed limiters).

Example:

```python
from pydantic_ai.concurrency import AbstractConcurrencyLimiter


class RedisConcurrencyLimiter(AbstractConcurrencyLimiter):
    def __init__(self, redis_client, key: str, max_running: int):
        self._redis = redis_client
        self._key = key
        self._max_running = max_running

    async def acquire(self, source: str) -> None:
        # Implement Redis-based distributed locking
        ...

    def release(self) -> None:
        # Release the Redis lock
        ...
```

#### Methods

##### acquire

`@abstractmethod`

`@async`

```python
def acquire(source: str) -> None
```

Acquire a slot, waiting if necessary.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`source`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

Identifier for observability (e.g., 'model:gpt-4o').

##### release

`@abstractmethod`

```python
def release() -> None
```

Release a slot.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

### ConcurrencyLimiter

**Bases:** [`AbstractConcurrencyLimiter`](/docs/ai/api/pydantic-ai/concurrency/#pydantic_ai.AbstractConcurrencyLimiter)

A concurrency limiter that tracks waiting operations for observability.

This class wraps an anyio.CapacityLimiter and tracks the number of waiting operations. When an operation has to wait to acquire a slot, a span is created for observability purposes.

#### Attributes

##### name

Name of the limiter for observability.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### waiting\_count

Number of operations currently waiting to acquire a slot.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

##### running\_count

Number of operations currently running.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

##### available\_count

Number of slots available.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

##### max\_running

Maximum concurrent operations allowed.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    max_running: int,
    max_queued: int | None = None,
    name: str | None = None,
    tracer: Tracer | None = None,
)
```

Initialize the ConcurrencyLimiter.

###### Parameters

**`max_running`** : [`int`](https://docs.python.org/3/library/functions.html#int)

Maximum number of concurrent operations.

**`max_queued`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Maximum queue depth before raising ConcurrencyLimitExceeded.

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional name for this limiter, used for observability when sharing a limiter across multiple models or agents.

**`tracer`** : `Tracer` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

OpenTelemetry tracer for span creation.

##### from\_limit

`@classmethod`

```python
def from_limit(
    cls,
    limit: int | ConcurrencyLimit,
    name: str | None = None,
    tracer: Tracer | None = None,
) -> Self
```

Create a ConcurrencyLimiter from a ConcurrencyLimit configuration.

###### Returns

[`Self`](https://docs.python.org/3/library/typing.html#typing.Self) -- A configured ConcurrencyLimiter.

###### Parameters

**`limit`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`ConcurrencyLimit`](/docs/ai/api/pydantic-ai/concurrency/#pydantic_ai.ConcurrencyLimit)

Either an int for simple limiting or a ConcurrencyLimit for full config.

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional name for this limiter, used for observability.

**`tracer`** : `Tracer` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

OpenTelemetry tracer for span creation.

##### acquire

`@async`

```python
def acquire(source: str) -> None
```

Acquire a slot, creating a span if waiting is required.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`source`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

Identifier for the source of this acquisition (e.g., 'agent:my-agent' or 'model:gpt-4').

##### release

```python
def release() -> None
```

Release a slot.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

### ConcurrencyLimit

Configuration for concurrency limiting with optional backpressure.

#### Constructor Parameters

**`max_running`** : [`int`](https://docs.python.org/3/library/functions.html#int)

Maximum number of concurrent operations allowed.

**`max_queued`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Maximum number of operations waiting in the queue. If None, the queue is unlimited. If exceeded, raises `ConcurrencyLimitExceeded`.

### AnyConcurrencyLimit

Type alias for concurrency limit configuration.

Can be:

-   An `int`: Simple limit on concurrent operations (unlimited queue).
-   A `ConcurrencyLimit`: Full configuration with optional backpressure.
-   An `AbstractConcurrencyLimiter`: A pre-created limiter instance for sharing across multiple models/agents.
-   `None`: No concurrency limiting (default).

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `'int | ConcurrencyLimit | AbstractConcurrencyLimiter | None'`

### ConcurrencyLimitExceeded

**Bases:** [`AgentRunError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.AgentRunError)

Error raised when the concurrency queue depth exceeds max\_queued.

---

# [pydantic_ai.direct](https://pydantic.dev/docs/ai/api/pydantic-ai/direct/)

# pydantic\_ai.direct

Methods for making imperative requests to language models with minimal abstraction.

These methods allow you to make requests to LLMs where the only abstraction is input and output schema translation so you can use all models with the same API.

These methods are thin wrappers around [`Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) implementations.

### StreamedResponseSync

Synchronous wrapper to async streaming responses by running the async producer in a background thread and providing a synchronous iterator.

This class must be used as a context manager with the `with` statement.

#### Attributes

##### response

Get the current state of the response.

**Type:** [`messages.ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse)

##### model\_name

Get the model name of the response.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### timestamp

Get the timestamp of the response.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime)

#### Methods

##### \_\_iter\_\_

```python
def __iter__() -> Iterator[messages.ModelResponseStreamEvent]
```

Stream the response as an iterable of [`ModelResponseStreamEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponseStreamEvent)s.

###### Returns

[`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`messages.ModelResponseStreamEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponseStreamEvent)\]

##### get

```python
def get() -> messages.ModelResponse
```

Build a ModelResponse from the data received from the stream so far.

###### Returns

[`messages.ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse)

##### usage

```python
def usage() -> RequestUsage
```

Get the usage of the response so far.

###### Returns

[`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage)

### model\_request

`@async`

```python
def model_request(
    model: models.Model | models.KnownModelName | str,
    messages: Sequence[messages.ModelMessage],
    model_settings: settings.ModelSettings | None = None,
    model_request_parameters: models.ModelRequestParameters | None = None,
    instrument: instrumented_models.InstrumentationSettings | bool | None = None,
) -> messages.ModelResponse
```

Make a non-streamed request to a model.

model\_request\_example.py

```py
from pydantic_ai import ModelRequest
from pydantic_ai.direct import model_request


async def main():
  model_response = await model_request(
      'anthropic:claude-haiku-4-5',
      [ModelRequest.user_text_prompt('What is the capital of France?')]  # (1)
  )
  print(model_response)
  '''
  ModelResponse(
      parts=[TextPart(content='The capital of France is Paris.')],
      usage=RequestUsage(input_tokens=56, output_tokens=7),
      model_name='claude-haiku-4-5',
      timestamp=datetime.datetime(...),
  )
  '''
```

See [`ModelRequest.user_text_prompt`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelRequest.user_text_prompt) for details.

#### Returns

[`messages.ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) -- The model response and token usage associated with the request.

#### Parameters

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The model to make a request to. We allow `str` here since the actual list of allowed models changes frequently.

**`messages`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`messages.ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\]

Messages to send to the model

**`model_settings`** : [`settings.ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

optional model settings

**`model_request_parameters`** : [`models.ModelRequestParameters`](/docs/ai/api/models/base/#pydantic_ai.models.ModelRequestParameters) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

optional model request parameters

**`instrument`** : `instrumented_models.InstrumentationSettings` | [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether to instrument the request with OpenTelemetry/Logfire, if `None` the value from [`logfire.instrument_pydantic_ai`](https://logfire.pydantic.dev/docs/api/logfire/#logfire.Logfire.instrument_pydantic_ai) is used.

### model\_request\_sync

```python
def model_request_sync(
    model: models.Model | models.KnownModelName | str,
    messages: Sequence[messages.ModelMessage],
    model_settings: settings.ModelSettings | None = None,
    model_request_parameters: models.ModelRequestParameters | None = None,
    instrument: instrumented_models.InstrumentationSettings | bool | None = None,
) -> messages.ModelResponse
```

Make a Synchronous, non-streamed request to a model.

This is a convenience method that wraps [`model_request`](/docs/ai/api/pydantic-ai/direct/#pydantic_ai.direct.model_request) with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop.

model\_request\_sync\_example.py

```py
from pydantic_ai import ModelRequest
from pydantic_ai.direct import model_request_sync

model_response = model_request_sync(
  'anthropic:claude-haiku-4-5',
  [ModelRequest.user_text_prompt('What is the capital of France?')]  # (1)
)
print(model_response)
'''
ModelResponse(
  parts=[TextPart(content='The capital of France is Paris.')],
  usage=RequestUsage(input_tokens=56, output_tokens=7),
  model_name='claude-haiku-4-5',
  timestamp=datetime.datetime(...),
)
'''
```

See [`ModelRequest.user_text_prompt`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelRequest.user_text_prompt) for details.

#### Returns

[`messages.ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) -- The model response and token usage associated with the request.

#### Parameters

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The model to make a request to. We allow `str` here since the actual list of allowed models changes frequently.

**`messages`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`messages.ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\]

Messages to send to the model

**`model_settings`** : [`settings.ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

optional model settings

**`model_request_parameters`** : [`models.ModelRequestParameters`](/docs/ai/api/models/base/#pydantic_ai.models.ModelRequestParameters) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

optional model request parameters

**`instrument`** : `instrumented_models.InstrumentationSettings` | [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether to instrument the request with OpenTelemetry/Logfire, if `None` the value from [`logfire.instrument_pydantic_ai`](https://logfire.pydantic.dev/docs/api/logfire/#logfire.Logfire.instrument_pydantic_ai) is used.

### model\_request\_stream

```python
def model_request_stream(
    model: models.Model | models.KnownModelName | str,
    messages: Sequence[messages.ModelMessage],
    model_settings: settings.ModelSettings | None = None,
    model_request_parameters: models.ModelRequestParameters | None = None,
    instrument: instrumented_models.InstrumentationSettings | bool | None = None,
) -> AbstractAsyncContextManager[models.StreamedResponse]
```

Make a streamed async request to a model.

model\_request\_stream\_example.py

```py
from pydantic_ai import ModelRequest
from pydantic_ai.direct import model_request_stream


async def main():
  messages = [ModelRequest.user_text_prompt('Who was Albert Einstein?')]  # (1)
  async with model_request_stream('openai:gpt-5-mini', messages) as stream:
      chunks = []
      async for chunk in stream:
          chunks.append(chunk)
      print(chunks)
      '''
      [
          PartStartEvent(index=0, part=TextPart(content='Albert Einstein was ')),
          FinalResultEvent(tool_name=None, tool_call_id=None),
          PartDeltaEvent(
              index=0, delta=TextPartDelta(content_delta='a German-born theoretical ')
          ),
          PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='physicist.')),
          PartEndEvent(
              index=0,
              part=TextPart(
                  content='Albert Einstein was a German-born theoretical physicist.'
              ),
          ),
      ]
      '''
```

See [`ModelRequest.user_text_prompt`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelRequest.user_text_prompt) for details.

#### Returns

`AbstractAsyncContextManager`\[[`models.StreamedResponse`](/docs/ai/api/models/base/#pydantic_ai.models.StreamedResponse)\] -- A [stream response](/docs/ai/api/models/base/#pydantic_ai.models.StreamedResponse) async context manager.

#### Parameters

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The model to make a request to. We allow `str` here since the actual list of allowed models changes frequently.

**`messages`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`messages.ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\]

Messages to send to the model

**`model_settings`** : [`settings.ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

optional model settings

**`model_request_parameters`** : [`models.ModelRequestParameters`](/docs/ai/api/models/base/#pydantic_ai.models.ModelRequestParameters) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

optional model request parameters

**`instrument`** : `instrumented_models.InstrumentationSettings` | [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether to instrument the request with OpenTelemetry/Logfire, if `None` the value from [`logfire.instrument_pydantic_ai`](https://logfire.pydantic.dev/docs/api/logfire/#logfire.Logfire.instrument_pydantic_ai) is used.

### model\_request\_stream\_sync

```python
def model_request_stream_sync(
    model: models.Model | models.KnownModelName | str,
    messages: Sequence[messages.ModelMessage],
    model_settings: settings.ModelSettings | None = None,
    model_request_parameters: models.ModelRequestParameters | None = None,
    instrument: instrumented_models.InstrumentationSettings | bool | None = None,
) -> StreamedResponseSync
```

Make a streamed synchronous request to a model.

This is the synchronous version of [`model_request_stream`](/docs/ai/api/pydantic-ai/direct/#pydantic_ai.direct.model_request_stream). It uses threading to run the asynchronous stream in the background while providing a synchronous iterator interface.

model\_request\_stream\_sync\_example.py

```py

from pydantic_ai import ModelRequest
from pydantic_ai.direct import model_request_stream_sync

messages = [ModelRequest.user_text_prompt('Who was Albert Einstein?')]
with model_request_stream_sync('openai:gpt-5-mini', messages) as stream:
    chunks = []
    for chunk in stream:
        chunks.append(chunk)
    print(chunks)
    '''
    [
        PartStartEvent(index=0, part=TextPart(content='Albert Einstein was ')),
        FinalResultEvent(tool_name=None, tool_call_id=None),
        PartDeltaEvent(
            index=0, delta=TextPartDelta(content_delta='a German-born theoretical ')
        ),
        PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='physicist.')),
        PartEndEvent(
            index=0,
            part=TextPart(
                content='Albert Einstein was a German-born theoretical physicist.'
            ),
        ),
    ]
    '''
```

#### Returns

`StreamedResponseSync` -- A [sync stream response](/docs/ai/api/pydantic-ai/direct/#pydantic_ai.direct.StreamedResponseSync) context manager.

#### Parameters

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The model to make a request to. We allow `str` here since the actual list of allowed models changes frequently.

**`messages`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`messages.ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\]

Messages to send to the model

**`model_settings`** : [`settings.ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

optional model settings

**`model_request_parameters`** : [`models.ModelRequestParameters`](/docs/ai/api/models/base/#pydantic_ai.models.ModelRequestParameters) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

optional model request parameters

**`instrument`** : `instrumented_models.InstrumentationSettings` | [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether to instrument the request with OpenTelemetry/Logfire, if `None` the value from [`logfire.instrument_pydantic_ai`](https://logfire.pydantic.dev/docs/api/logfire/#logfire.Logfire.instrument_pydantic_ai) is used.

---

# [pydantic_ai.durable_exec](https://pydantic.dev/docs/ai/api/pydantic-ai/durable_exec/)

# pydantic\_ai.durable\_exec

### PydanticAIWorkflow

Temporal Workflow base class that provides `__pydantic_ai_agents__` for direct agent registration.

### TemporalRunContext

**Bases:** `RunContext[AgentDepsT]`

The [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) subclass to use to serialize and deserialize the run context for use inside a Temporal activity.

By default, only the `deps`, `run_id`, `metadata`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `tool_call_metadata`, `retry`, `max_retries`, `run_step`, `usage`, and `partial_output` attributes will be available. To make another attribute available, create a `TemporalRunContext` subclass with a custom `serialize_run_context` class method that returns a dictionary that includes the attribute and pass it to [`TemporalAgent`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.TemporalAgent).

#### Methods

##### serialize\_run\_context

`@classmethod`

```python
def serialize_run_context(cls, ctx: RunContext[Any]) -> dict[str, Any]
```

Serialize the run context to a `dict[str, Any]`.

###### Returns

[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

##### deserialize\_run\_context

`@classmethod`

```python
def deserialize_run_context(
    cls,
    ctx: dict[str, Any],
    deps: Any,
) -> TemporalRunContext[Any]
```

Deserialize the run context from a `dict[str, Any]`.

###### Returns

`TemporalRunContext`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

### LogfirePlugin

**Bases:** `SimplePlugin`

Temporal client plugin for Logfire.

### TemporalAgent

**Bases:** `WrapperAgent[AgentDepsT, OutputDataT]`

#### Methods

##### \_\_init\_\_

```python
def __init__(
    wrapped: AbstractAgent[AgentDepsT, OutputDataT],
    name: str | None = None,
    models: Mapping[str, Model] | None = None,
    provider_factory: TemporalProviderFactory | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    activity_config: ActivityConfig | None = None,
    model_activity_config: ActivityConfig | None = None,
    toolset_activity_config: dict[str, ActivityConfig] | None = None,
    tool_activity_config: dict[str, dict[str, ActivityConfig | Literal[False]]] | None = None,
    run_context_type: type[TemporalRunContext[AgentDepsT]] = TemporalRunContext[AgentDepsT],
    temporalize_toolset_func: Callable[[AbstractToolset[AgentDepsT], str, ActivityConfig, dict[str, ActivityConfig | Literal[False]], type[AgentDepsT], type[TemporalRunContext[AgentDepsT]], AbstractAgent[AgentDepsT, Any] | None], AbstractToolset[AgentDepsT]] = temporalize_toolset,
)
```

Wrap an agent to enable it to be used inside a Temporal workflow, by automatically offloading model requests, tool calls, and MCP server communication to Temporal activities.

After wrapping, the original agent can still be used as normal outside of the Temporal workflow, but any changes to its model or toolsets after wrapping will not be reflected in the durable agent.

###### Parameters

**`wrapped`** : `AbstractAgent`\[`AgentDepsT`, `OutputDataT`\]

The agent to wrap.

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional unique agent name to use in the Temporal activities' names. If not provided, the agent's `name` will be used.

**`models`** : [`Mapping`](https://docs.python.org/3/library/typing.html#typing.Mapping)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `Model`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional mapping of model instances to register with the agent. Keys define the names that can be referenced at runtime and the values are `Model` instances. Registered model instances can be passed directly to `run(model=...)`. If the wrapped agent doesn't have a model set and none is provided to `run()`, the first model in this mapping will be used as the default.

**`provider_factory`** : `TemporalProviderFactory` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional callable used when instantiating models from provider strings (those supplied at runtime). The callable receives the provider name and the current run context, allowing custom configuration such as injecting API keys stored on `deps`. Note: This factory is only used inside Temporal workflows. Outside workflows, model strings are resolved using the default provider behavior.

**`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional event stream handler to use instead of the one set on the wrapped agent.

**`activity_config`** : `ActivityConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The base Temporal activity config to use for all activities. If no config is provided, a `start_to_close_timeout` of 60 seconds is used.

**`model_activity_config`** : `ActivityConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The Temporal activity config to use for model request activities. This is merged with the base activity config.

**`toolset_activity_config`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `ActivityConfig`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The Temporal activity config to use for get-tools and call-tool activities for specific toolsets identified by ID. This is merged with the base activity config.

**`tool_activity_config`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `ActivityConfig` | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\[[`False`](https://docs.python.org/3/library/constants.html#False)\]\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The Temporal activity config to use for specific tool call activities identified by toolset ID and tool name. This is merged with the base and toolset-specific activity configs. If a tool does not use IO, you can specify `False` to disable using an activity. Note that the tool is required to be defined as an `async` function as non-async tools are run in threads which are non-deterministic and thus not supported outside of activities.

**`run_context_type`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`TemporalRunContext`\[`AgentDepsT`\]\] _Default:_ `TemporalRunContext[AgentDepsT]`

The `TemporalRunContext` subclass to use to serialize and deserialize the run context for use inside a Temporal activity. By default, only the `deps`, `run_id`, `metadata`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `retry`, `max_retries`, `run_step`, `usage`, and `partial_output` attributes will be available. To make another attribute available, create a `TemporalRunContext` subclass with a custom `serialize_run_context` class method that returns a dictionary that includes the attribute.

**`temporalize_toolset_func`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\], [`str`](https://docs.python.org/3/library/stdtypes.html#str), `ActivityConfig`, [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `ActivityConfig` | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\[[`False`](https://docs.python.org/3/library/constants.html#False)\]\], [`type`](https://docs.python.org/3/glossary.html#term-type)\[`AgentDepsT`\], [`type`](https://docs.python.org/3/glossary.html#term-type)\[`TemporalRunContext`\[`AgentDepsT`\]\], `AbstractAgent`\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None)\], [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] _Default:_ `temporalize_toolset`

Optional function to use to prepare "leaf" toolsets (i.e. those that implement their own tool listing and calling) for Temporal by wrapping them in a `TemporalWrapperToolset` that moves methods that require IO to Temporal activities. If not provided, only `FunctionToolset` and `MCPServer` will be prepared for Temporal. The function takes the toolset, the activity name prefix, the toolset-specific activity config, the tool-specific activity configs and the run context type.

##### run

`@async`

```python
def run(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentRunResult[OutputDataT]
def run(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentRunResult[RunOutputDataT]
```

Run the agent with a user prompt in async mode.

This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

async def main():
    agent_run = await agent.run('What is the capital of France?')
    print(agent_run.output)
    #> The capital of France is Paris.
```

###### Returns

[`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of the run.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent. Inside workflows, only registered model instances, registered names, or provider strings are valid.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional event stream handler to use for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run.

##### run\_sync

```python
def run_sync(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentRunResult[OutputDataT]
def run_sync(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentRunResult[RunOutputDataT]
```

Synchronously run the agent with a user prompt.

This is a convenience method that wraps [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

result_sync = agent.run_sync('What is the capital of Italy?')
print(result_sync.output)
#> The capital of Italy is Rome.
```

###### Returns

[`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of the run.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional event stream handler to use for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run.

##### run\_stream

`@async`

```python
def run_stream(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, OutputDataT]]
def run_stream(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, RunOutputDataT]]
```

Run the agent with a user prompt in async mode, returning a streamed response.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

async def main():
    async with agent.run_stream('What is the capital of the UK?') as response:
        print(await response.get_output())
        #> The capital of the UK is London.
```

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`StreamedRunResult`\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- The result of the run.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run.

##### run\_stream\_events

```python
def run_stream_events(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentEventStream[OutputDataT]
def run_stream_events(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentEventStream[RunOutputDataT]
```

Run the agent with a user prompt in async mode and stream events from the run.

This is a convenience method that wraps [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) and uses the `event_stream_handler` kwarg to get a stream of events from the run.

Example:

```python
from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent

agent = Agent('openai:gpt-5.2')

async def main():
    events: list[AgentStreamEvent | AgentRunResultEvent] = []
    async with agent.run_stream_events('What is the capital of France?') as stream:
        async for event in stream:
            events.append(event)
    print(events)
    '''
    [
        PartStartEvent(index=0, part=TextPart(content='The capital of ')),
        FinalResultEvent(tool_name=None, tool_call_id=None),
        PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')),
        PartEndEvent(
            index=0, part=TextPart(content='The capital of France is Paris. ')
        ),
        AgentRunResultEvent(
            result=AgentRunResult(output='The capital of France is Paris. ')
        ),
    ]
    '''
```

Arguments are the same as for [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), except that `event_stream_handler` is now allowed.

###### Returns

`AgentEventStream`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- An async iterable of stream events `AgentStreamEvent` and finally a `AgentRunResultEvent` with the final `AgentEventStream`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- run result.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run.

##### iter

`@async`

```python
def iter(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
    _deprecated_kwargs: Never = {},
) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]
def iter(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
    _deprecated_kwargs: Never = {},
) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]
```

A contextmanager which can be used to iterate over the agent graph's nodes as they are executed.

This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools.

The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed.

For more details, see the documentation of `AgentRun`.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

async def main():
    nodes = []
    async with agent.iter('What is the capital of France?') as agent_run:
        async for node in agent_run:
            nodes.append(node)
    print(nodes)
    '''
    [
        UserPromptNode(
            user_prompt='What is the capital of France?',
            instructions_functions=[],
            system_prompts=(),
            system_prompt_functions=[],
            system_prompt_dynamic_functions={},
        ),
        ModelRequestNode(
            request=ModelRequest(
                parts=[
                    UserPromptPart(
                        content='What is the capital of France?',
                        timestamp=datetime.datetime(...),
                    )
                ],
                timestamp=datetime.datetime(...),
                run_id='...',
                conversation_id='...',
            )
        ),
        CallToolsNode(
            model_response=ModelResponse(
                parts=[TextPart(content='The capital of France is Paris.')],
                usage=RequestUsage(input_tokens=56, output_tokens=7),
                model_name='gpt-5.2',
                timestamp=datetime.datetime(...),
                run_id='...',
                conversation_id='...',
            )
        ),
        End(data=FinalResult(output='The capital of France is Paris.')),
    ]
    '''
    print(agent_run.result.output)
    #> The capital of France is Paris.
```

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`AgentRun`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun)\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- The result of the run.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run.

##### override

```python
def override(
    name: str | _utils.Unset = _utils.UNSET,
    deps: AgentDepsT | _utils.Unset = _utils.UNSET,
    model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET,
    tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | _utils.Unset = _utils.UNSET,
    instructions: _instructions.AgentInstructions[AgentDepsT] | _utils.Unset = _utils.UNSET,
    model_settings: AgentModelSettings[AgentDepsT] | _utils.Unset = _utils.UNSET,
    output_retries: int | _utils.Unset = _utils.UNSET,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> Iterator[None]
```

Context manager to temporarily override agent configuration.

This is particularly useful when testing. You can find an example of this [here](/docs/ai/guides/testing#overriding-model-via-pytest-fixtures).

###### Returns

[`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]

###### Parameters

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET`

The name to use instead of the name passed to the agent constructor and agent run.

**`deps`** : `AgentDepsT` | `_utils.Unset` _Default:_ `_utils.UNSET`

The dependencies to use instead of the dependencies passed to the agent run.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET`

The model to use instead of the model passed to the agent run.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The toolsets to use instead of the toolsets passed to the agent constructor and agent run.

**`tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[`AgentDepsT`\] | `ToolFuncEither`\[`AgentDepsT`, ...\]\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The tools to use instead of the tools registered with the agent.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The builtin tools to use instead of the agent's configured builtin tools. Per-run `builtin_tools` are still added to these.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The instructions to use instead of the instructions registered with the agent.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The model settings to use instead of the model settings passed to the agent constructor. When set, any per-run `model_settings` argument is ignored.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | `_utils.Unset` _Default:_ `_utils.UNSET`

The output-retry budget to use instead of the agent-level `output_retries`. When set, any per-run `output_retries` argument is ignored.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply as overrides.

### TemporalWrapperToolset

**Bases:** `WrapperToolset[AgentDepsT]`, `ABC`

### PydanticAIPlugin

**Bases:** `SimplePlugin`

Temporal client and worker plugin for Pydantic AI.

### AgentPlugin

**Bases:** `SimplePlugin`

Temporal worker plugin for a specific Pydantic AI agent.

### StepConfig

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Configuration for a step in the DBOS workflow.

### DBOSMCPServer

**Bases:** `DBOSMCPToolset[AgentDepsT]`

A wrapper for MCPServer that integrates with DBOS, turning call\_tool and get\_tools into DBOS steps.

Tool definitions are cached across steps to avoid redundant MCP server round-trips, respecting the wrapped server's `cache_tools` setting.

### DBOSModel

**Bases:** `WrapperModel`

A wrapper for Model that integrates with DBOS, turning request and request\_stream to DBOS steps.

### DBOSAgent

**Bases:** `WrapperAgent[AgentDepsT, OutputDataT]`, `DBOSConfiguredInstance`

#### Methods

##### \_\_init\_\_

```python
def __init__(
    wrapped: AbstractAgent[AgentDepsT, OutputDataT],
    name: str | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    mcp_step_config: StepConfig | None = None,
    model_step_config: StepConfig | None = None,
    parallel_execution_mode: DBOSParallelExecutionMode = 'parallel_ordered_events',
)
```

Wrap an agent to enable it with DBOS durable workflows, by automatically offloading model requests, tool calls, and MCP server communication to DBOS steps.

After wrapping, the original agent can still be used as normal outside of the DBOS workflow.

###### Parameters

**`wrapped`** : `AbstractAgent`\[`AgentDepsT`, `OutputDataT`\]

The agent to wrap.

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional unique agent name to use as the DBOS configured instance name. If not provided, the agent's `name` will be used.

**`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional event stream handler to use instead of the one set on the wrapped agent.

**`mcp_step_config`** : `StepConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The base DBOS step config to use for MCP server steps. If no config is provided, use the default settings of DBOS.

**`model_step_config`** : `StepConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The DBOS step config to use for model request steps. If no config is provided, use the default settings of DBOS.

**`parallel_execution_mode`** : `DBOSParallelExecutionMode` _Default:_ `'parallel_ordered_events'`

The mode for executing tool calls:

-   'parallel\_ordered\_events' (default): Run tool calls in parallel, but events are emitted in order, after all calls complete.
-   'sequential': Run tool calls one at a time in order.

##### run

`@async`

```python
def run(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentRunResult[OutputDataT]
def run(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentRunResult[RunOutputDataT]
```

Run the agent with a user prompt in async mode.

This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

async def main():
    agent_run = await agent.run('What is the capital of France?')
    print(agent_run.output)
    #> The capital of France is Paris.
```

###### Returns

[`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of the run.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional event stream handler to use for this run.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run.

##### run\_sync

```python
def run_sync(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentRunResult[OutputDataT]
def run_sync(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentRunResult[RunOutputDataT]
```

Synchronously run the agent with a user prompt.

This is a convenience method that wraps [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

result_sync = agent.run_sync('What is the capital of Italy?')
print(result_sync.output)
#> The capital of Italy is Rome.
```

###### Returns

[`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of the run.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional event stream handler to use for this run.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run.

##### run\_stream

`@async`

```python
def run_stream(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, OutputDataT]]
def run_stream(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    deps: AgentDepsT = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, RunOutputDataT]]
```

Run the agent with a user prompt in async mode, returning a streamed response.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

async def main():
    async with agent.run_stream('What is the capital of the UK?') as response:
        print(await response.get_output())
        #> The capital of the UK is London.
```

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`StreamedRunResult`\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- The result of the run.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run.

##### run\_stream\_events

```python
def run_stream_events(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentEventStream[OutputDataT]
def run_stream_events(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentEventStream[RunOutputDataT]
```

Run the agent with a user prompt in async mode and stream events from the run.

This is a convenience method that wraps [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) and uses the `event_stream_handler` kwarg to get a stream of events from the run.

Example:

```python
from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent

agent = Agent('openai:gpt-5.2')

async def main():
    events: list[AgentStreamEvent | AgentRunResultEvent] = []
    async with agent.run_stream_events('What is the capital of France?') as stream:
        async for event in stream:
            events.append(event)
    print(events)
    '''
    [
        PartStartEvent(index=0, part=TextPart(content='The capital of ')),
        FinalResultEvent(tool_name=None, tool_call_id=None),
        PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')),
        PartEndEvent(
            index=0, part=TextPart(content='The capital of France is Paris. ')
        ),
        AgentRunResultEvent(
            result=AgentRunResult(output='The capital of France is Paris. ')
        ),
    ]
    '''
```

Arguments are the same as for [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), except that `event_stream_handler` is now allowed.

###### Returns

`AgentEventStream`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- An async iterable of stream events `AgentStreamEvent` and finally a `AgentRunResultEvent` with the final `AgentEventStream`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- run result.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run.

##### iter

`@async`

```python
def iter(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
    _deprecated_kwargs: Never = {},
) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]
def iter(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
    _deprecated_kwargs: Never = {},
) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]
```

A contextmanager which can be used to iterate over the agent graph's nodes as they are executed.

This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools.

The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed.

For more details, see the documentation of `AgentRun`.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

async def main():
    nodes = []
    async with agent.iter('What is the capital of France?') as agent_run:
        async for node in agent_run:
            nodes.append(node)
    print(nodes)
    '''
    [
        UserPromptNode(
            user_prompt='What is the capital of France?',
            instructions_functions=[],
            system_prompts=(),
            system_prompt_functions=[],
            system_prompt_dynamic_functions={},
        ),
        ModelRequestNode(
            request=ModelRequest(
                parts=[
                    UserPromptPart(
                        content='What is the capital of France?',
                        timestamp=datetime.datetime(...),
                    )
                ],
                timestamp=datetime.datetime(...),
                run_id='...',
                conversation_id='...',
            )
        ),
        CallToolsNode(
            model_response=ModelResponse(
                parts=[TextPart(content='The capital of France is Paris.')],
                usage=RequestUsage(input_tokens=56, output_tokens=7),
                model_name='gpt-5.2',
                timestamp=datetime.datetime(...),
                run_id='...',
                conversation_id='...',
            )
        ),
        End(data=FinalResult(output='The capital of France is Paris.')),
    ]
    '''
    print(agent_run.result.output)
    #> The capital of France is Paris.
```

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`AgentRun`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun)\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- The result of the run.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run.

##### override

```python
def override(
    name: str | _utils.Unset = _utils.UNSET,
    deps: AgentDepsT | _utils.Unset = _utils.UNSET,
    model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET,
    tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | _utils.Unset = _utils.UNSET,
    instructions: _instructions.AgentInstructions[AgentDepsT] | _utils.Unset = _utils.UNSET,
    model_settings: AgentModelSettings[AgentDepsT] | _utils.Unset = _utils.UNSET,
    output_retries: int | _utils.Unset = _utils.UNSET,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> Iterator[None]
```

Context manager to temporarily override agent configuration.

This is particularly useful when testing. You can find an example of this [here](/docs/ai/guides/testing#overriding-model-via-pytest-fixtures).

###### Returns

[`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]

###### Parameters

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET`

The name to use instead of the name passed to the agent constructor and agent run.

**`deps`** : `AgentDepsT` | `_utils.Unset` _Default:_ `_utils.UNSET`

The dependencies to use instead of the dependencies passed to the agent run.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET`

The model to use instead of the model passed to the agent run.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The toolsets to use instead of the toolsets passed to the agent constructor and agent run.

**`tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[`AgentDepsT`\] | `ToolFuncEither`\[`AgentDepsT`, ...\]\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The tools to use instead of the tools registered with the agent.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The builtin tools to use instead of the agent's configured builtin tools. Per-run `builtin_tools` are still added to these.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The instructions to use instead of the instructions registered with the agent.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The model settings to use instead of the model settings passed to the agent constructor. When set, any per-run `model_settings` argument is ignored.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | `_utils.Unset` _Default:_ `_utils.UNSET`

The output-retry budget to use instead of the agent-level `output_retries`. When set, any per-run `output_retries` argument is ignored.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply as overrides.

### DBOSParallelExecutionMode

The mode for executing tool calls in DBOS durable workflows. This is a subset of the ParallelExecutionMode because 'parallel' cannot guarantee deterministic ordering.

**Default:** `Literal['sequential', 'parallel_ordered_events']`

### TaskConfig

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Configuration for a task in Prefect.

These options are passed to the `@task` decorator.

#### Attributes

##### retries

Maximum number of retries for the task.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

##### retry\_delay\_seconds

Delay between retries in seconds. Can be a single value or a list for custom backoff.

**Type:** [`float`](https://docs.python.org/3/library/functions.html#float) | [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`float`](https://docs.python.org/3/library/functions.html#float)\]

##### timeout\_seconds

Maximum time in seconds for the task to complete.

**Type:** [`float`](https://docs.python.org/3/library/functions.html#float)

##### cache\_policy

Prefect cache policy for the task.

**Type:** `CachePolicy`

##### persist\_result

Whether to persist the task result.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### result\_storage

Prefect result storage for the task. Should be a storage block or a block slug like `s3-bucket/my-storage`.

**Type:** `ResultStorage`

##### log\_prints

Whether to log print statements from the task.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

### PrefectFunctionToolset

**Bases:** `PrefectWrapperToolset[AgentDepsT]`

A wrapper for FunctionToolset that integrates with Prefect, turning tool calls into Prefect tasks.

#### Methods

##### call\_tool

`@async`

```python
def call_tool(
    name: str,
    tool_args: dict[str, Any],
    ctx: RunContext[AgentDepsT],
    tool: ToolsetTool[AgentDepsT],
) -> Any
```

Call a tool, wrapped as a Prefect task with a descriptive name.

###### Returns

[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

### PrefectMCPServer

**Bases:** `PrefectWrapperToolset[AgentDepsT]`, `ABC`

A wrapper for MCPServer that integrates with Prefect, turning call\_tool and get\_tools into Prefect tasks.

#### Methods

##### call\_tool

`@async`

```python
def call_tool(
    name: str,
    tool_args: dict[str, Any],
    ctx: RunContext[AgentDepsT],
    tool: ToolsetTool[AgentDepsT],
) -> ToolResult
```

Call an MCP tool, wrapped as a Prefect task with a descriptive name.

###### Returns

`ToolResult`

### PrefectModel

**Bases:** `WrapperModel`

A wrapper for Model that integrates with Prefect, turning request and request\_stream into Prefect tasks.

#### Methods

##### request

`@async`

```python
def request(
    messages: list[ModelMessage],
    model_settings: ModelSettings | None,
    model_request_parameters: ModelRequestParameters,
) -> ModelResponse
```

Make a model request, wrapped as a Prefect task when in a flow.

###### Returns

[`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse)

##### request\_stream

`@async`

```python
def request_stream(
    messages: list[ModelMessage],
    model_settings: ModelSettings | None,
    model_request_parameters: ModelRequestParameters,
    run_context: RunContext[Any] | None = None,
) -> AsyncIterator[StreamedResponse]
```

Make a streaming model request.

When inside a Prefect flow, the stream is consumed within a task and a non-streaming response is returned. When not in a flow, behaves normally.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`StreamedResponse`\]

### PrefectAgent

**Bases:** `WrapperAgent[AgentDepsT, OutputDataT]`

#### Methods

##### \_\_init\_\_

```python
def __init__(
    wrapped: AbstractAgent[AgentDepsT, OutputDataT],
    name: str | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    mcp_task_config: TaskConfig | None = None,
    model_task_config: TaskConfig | None = None,
    tool_task_config: TaskConfig | None = None,
    tool_task_config_by_name: dict[str, TaskConfig | None] | None = None,
    event_stream_handler_task_config: TaskConfig | None = None,
    prefectify_toolset_func: Callable[[AbstractToolset[AgentDepsT], TaskConfig, TaskConfig, dict[str, TaskConfig | None]], AbstractToolset[AgentDepsT]] = prefectify_toolset,
)
```

Wrap an agent to enable it with Prefect durable flows, by automatically offloading model requests, tool calls, and MCP server communication to Prefect tasks.

After wrapping, the original agent can still be used as normal outside of the Prefect flow.

###### Parameters

**`wrapped`** : `AbstractAgent`\[`AgentDepsT`, `OutputDataT`\]

The agent to wrap.

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional unique agent name to use as the Prefect flow name prefix. If not provided, the agent's `name` will be used.

**`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional event stream handler to use instead of the one set on the wrapped agent.

**`mcp_task_config`** : `TaskConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The base Prefect task config to use for MCP server tasks. If no config is provided, use the default settings of Prefect.

**`model_task_config`** : `TaskConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The Prefect task config to use for model request tasks. If no config is provided, use the default settings of Prefect.

**`tool_task_config`** : `TaskConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The default Prefect task config to use for tool calls. If no config is provided, use the default settings of Prefect.

**`tool_task_config_by_name`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `TaskConfig` | [`None`](https://docs.python.org/3/library/constants.html#None)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Per-tool task configuration. Keys are tool names, values are TaskConfig or None (None disables task wrapping for that tool).

**`event_stream_handler_task_config`** : `TaskConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The Prefect task config to use for the event stream handler task. If no config is provided, use the default settings of Prefect.

**`prefectify_toolset_func`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\], `TaskConfig`, `TaskConfig`, [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `TaskConfig` | [`None`](https://docs.python.org/3/library/constants.html#None)\]\], [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] _Default:_ `prefectify_toolset`

Optional function to use to prepare toolsets for Prefect by wrapping them in a `PrefectWrapperToolset` that moves methods that require IO to Prefect tasks. If not provided, only `FunctionToolset` and `MCPServer` will be prepared for Prefect. The function takes the toolset, the task config, the tool-specific task config, and the tool-specific task config by name.

##### run

`@async`

```python
def run(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentRunResult[OutputDataT]
def run(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentRunResult[RunOutputDataT]
```

Run the agent with a user prompt in async mode.

This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

async def main():
    agent_run = await agent.run('What is the capital of France?')
    print(agent_run.output)
    #> The capital of France is Paris.
```

###### Returns

[`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of the run.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional event stream handler to use for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run.

##### run\_sync

```python
def run_sync(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentRunResult[OutputDataT]
def run_sync(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentRunResult[RunOutputDataT]
```

Synchronously run the agent with a user prompt.

This is a convenience method that wraps [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

result_sync = agent.run_sync('What is the capital of Italy?')
print(result_sync.output)
#> The capital of Italy is Rome.
```

###### Returns

[`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of the run.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional event stream handler to use for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run.

##### run\_stream

`@async`

```python
def run_stream(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, OutputDataT]]
def run_stream(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    event_stream_handler: EventStreamHandler[AgentDepsT] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, RunOutputDataT]]
```

Run the agent with a user prompt in async mode, returning a streamed response.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

async def main():
    async with agent.run_stream('What is the capital of the UK?') as response:
        print(await response.get_output())
        #> The capital of the UK is London.
```

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`StreamedRunResult`\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- The result of the run.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`event_stream_handler`** : `EventStreamHandler`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run.

##### run\_stream\_events

```python
def run_stream_events(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentEventStream[OutputDataT]
def run_stream_events(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AgentEventStream[RunOutputDataT]
```

Run the agent with a user prompt in async mode and stream events from the run.

This is a convenience method that wraps [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) and uses the `event_stream_handler` kwarg to get a stream of events from the run.

Example:

```python
from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent

agent = Agent('openai:gpt-5.2')

async def main():
    events: list[AgentStreamEvent | AgentRunResultEvent] = []
    async with agent.run_stream_events('What is the capital of France?') as stream:
        async for event in stream:
            events.append(event)
    print(events)
    '''
    [
        PartStartEvent(index=0, part=TextPart(content='The capital of ')),
        FinalResultEvent(tool_name=None, tool_call_id=None),
        PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')),
        PartEndEvent(
            index=0, part=TextPart(content='The capital of France is Paris. ')
        ),
        AgentRunResultEvent(
            result=AgentRunResult(output='The capital of France is Paris. ')
        ),
    ]
    '''
```

Arguments are the same as for [`self.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), except that `event_stream_handler` is now allowed.

###### Returns

`AgentEventStream`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- An async iterable of stream events `AgentStreamEvent` and finally a `AgentRunResultEvent` with the final `AgentEventStream`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- run result.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run.

##### iter

`@async`

```python
def iter(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: None = None,
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]
def iter(
    user_prompt: str | Sequence[_messages.UserContent] | None = None,
    output_type: OutputSpec[RunOutputDataT],
    message_history: Sequence[_messages.ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: models.Model | models.KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: AgentModelSettings[AgentDepsT] | None = None,
    usage_limits: _usage.UsageLimits | None = None,
    usage: _usage.RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    output_retries: int | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | None = None,
    capabilities: Sequence[AgentCapability[AgentDepsT]] | None = None,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]
```

A contextmanager which can be used to iterate over the agent graph's nodes as they are executed.

This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools.

The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed.

For more details, see the documentation of `AgentRun`.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

async def main():
    nodes = []
    async with agent.iter('What is the capital of France?') as agent_run:
        async for node in agent_run:
            nodes.append(node)
    print(nodes)
    '''
    [
        UserPromptNode(
            user_prompt='What is the capital of France?',
            instructions_functions=[],
            system_prompts=(),
            system_prompt_functions=[],
            system_prompt_dynamic_functions={},
        ),
        ModelRequestNode(
            request=ModelRequest(
                parts=[
                    UserPromptPart(
                        content='What is the capital of France?',
                        timestamp=datetime.datetime(...),
                    )
                ],
                timestamp=datetime.datetime(...),
                run_id='...',
                conversation_id='...',
            )
        ),
        CallToolsNode(
            model_response=ModelResponse(
                parts=[TextPart(content='The capital of France is Paris.')],
                usage=RequestUsage(input_tokens=56, output_tokens=7),
                model_name='gpt-5.2',
                timestamp=datetime.datetime(...),
                run_id='...',
                conversation_id='...',
            )
        ),
        End(data=FinalResult(output='The capital of France is Paris.')),
    ]
    '''
    print(agent_run.result.output)
    #> The capital of France is Paris.
```

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`AgentRun`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun)\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] -- The result of the run.

###### Parameters

**`user_prompt`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

User input to start/continue the conversation.

**`output_type`** : `OutputSpec`\[`RunOutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.ModelMessage`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : `_usage.UsageLimits` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : `_usage.RunUsage` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override the agent-level `output_retries` for this run. See [`Agent.__init__`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) for semantics of the two enforcement paths.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`capabilities`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AgentCapability)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional [capabilities](https://ai.pydantic.dev/capabilities/) for this run, merged with the agent's configured capabilities.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply for this run.

##### override

```python
def override(
    name: str | _utils.Unset = _utils.UNSET,
    deps: AgentDepsT | _utils.Unset = _utils.UNSET,
    model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET,
    tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET,
    builtin_tools: Sequence[AgentBuiltinTool[AgentDepsT]] | _utils.Unset = _utils.UNSET,
    instructions: _instructions.AgentInstructions[AgentDepsT] | _utils.Unset = _utils.UNSET,
    model_settings: AgentModelSettings[AgentDepsT] | _utils.Unset = _utils.UNSET,
    output_retries: int | _utils.Unset = _utils.UNSET,
    spec: dict[str, Any] | AgentSpec | None = None,
) -> Iterator[None]
```

Context manager to temporarily override agent configuration.

This is particularly useful when testing. You can find an example of this [here](/docs/ai/guides/testing#overriding-model-via-pytest-fixtures).

###### Returns

[`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]

###### Parameters

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET`

The name to use instead of the name passed to the agent constructor and agent run.

**`deps`** : `AgentDepsT` | `_utils.Unset` _Default:_ `_utils.UNSET`

The dependencies to use instead of the dependencies passed to the agent run.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET`

The model to use instead of the model passed to the agent run.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The toolsets to use instead of the toolsets passed to the agent constructor and agent run.

**`tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[`AgentDepsT`\] | `ToolFuncEither`\[`AgentDepsT`, ...\]\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The tools to use instead of the tools registered with the agent.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)\[`AgentDepsT`\]\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The builtin tools to use instead of the agent's configured builtin tools. Per-run `builtin_tools` are still added to these.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The instructions to use instead of the instructions registered with the agent.

**`model_settings`** : `AgentModelSettings`\[`AgentDepsT`\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The model settings to use instead of the model settings passed to the agent constructor. When set, any per-run `model_settings` argument is ignored.

**`output_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | `_utils.Unset` _Default:_ `_utils.UNSET`

The output-retry budget to use instead of the agent-level `output_retries`. When set, any per-run `output_retries` argument is ignored.

**`spec`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `AgentSpec` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional agent spec to apply as overrides.

---

# [pydantic_ai.embeddings](https://pydantic.dev/docs/ai/api/pydantic-ai/embeddings/)

# pydantic\_ai.embeddings

### EmbeddingSettings

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Common settings for configuring embedding models.

These settings apply across multiple embedding model providers. Not all settings are supported by all models - check the specific model's documentation for details.

Provider-specific settings classes (e.g., [`OpenAIEmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.openai.OpenAIEmbeddingSettings), [`CohereEmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.cohere.CohereEmbeddingSettings)) extend this with additional provider-prefixed options.

#### Attributes

##### dimensions

The number of dimensions for the output embeddings.

Supported by:

-   OpenAI
-   Cohere
-   Google
-   Sentence Transformers
-   Bedrock
-   VoyageAI

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

##### truncate

Whether to truncate inputs that exceed the model's context length.

Defaults to `False`. If `True`, inputs that are too long will be truncated. If `False`, an error will be raised for inputs that exceed the context length.

For more control over truncation, you can use [`max_input_tokens()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.max_input_tokens) and [`count_tokens()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.count_tokens) to implement your own truncation logic.

Provider-specific truncation settings (e.g., `cohere_truncate`, `bedrock_cohere_truncate`) take precedence if specified.

Supported by:

-   Cohere
-   Bedrock (Cohere and Nova models)
-   VoyageAI

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### extra\_headers

Extra headers to send to the model.

Supported by:

-   OpenAI
-   Cohere

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### extra\_body

Extra body to send to the model.

Supported by:

-   OpenAI
-   Cohere

**Type:** [`object`](https://docs.python.org/3/glossary.html#term-object)

### EmbeddingModel

**Bases:** `ABC`

Abstract base class for embedding models.

Implement this class to create a custom embedding model. For most use cases, use one of the built-in implementations:

-   [`OpenAIEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.openai.OpenAIEmbeddingModel)
-   [`CohereEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.cohere.CohereEmbeddingModel)
-   [`GoogleEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.google.GoogleEmbeddingModel)
-   [`BedrockEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.bedrock.BedrockEmbeddingModel)
-   [`SentenceTransformerEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.sentence_transformers.SentenceTransformerEmbeddingModel)

#### Attributes

##### settings

Get the default settings for this model.

**Type:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### base\_url

The base URL for the provider API, if available.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### model\_name

The name of the embedding model.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### system

The embedding model provider/system identifier (e.g., 'openai', 'cohere').

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(settings: EmbeddingSettings | None = None) -> None
```

Initialize the model with optional settings.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Model-specific settings that will be used as defaults for this model.

##### embed

`@abstractmethod`

`@async`

```python
def embed(
    inputs: str | Sequence[str],
    input_type: EmbedInputType,
    settings: EmbeddingSettings | None = None,
) -> EmbeddingResult
```

Generate embeddings for the given inputs.

###### Returns

[`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) -- An [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) containing [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) -- the embeddings and metadata.

###### Parameters

**`inputs`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

A single string or sequence of strings to embed.

**`input_type`** : `EmbedInputType`

Whether the inputs are queries or documents.

**`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to override the model's defaults.

##### prepare\_embed

```python
def prepare_embed(
    inputs: str | Sequence[str],
    settings: EmbeddingSettings | None = None,
) -> tuple[list[str], EmbeddingSettings]
```

Prepare the inputs and settings for embedding.

This method normalizes inputs to a list and merges settings. Subclasses should call this at the start of their `embed()` implementation.

###### Returns

[`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\], [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings)\] -- A tuple of (normalized inputs list, merged settings).

###### Parameters

**`inputs`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

A single string or sequence of strings.

**`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to merge with defaults.

##### max\_input\_tokens

`@async`

```python
def max_input_tokens() -> int | None
```

Get the maximum number of tokens that can be input to the model.

###### Returns

[`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) -- The maximum token count, or `None` if unknown.

##### count\_tokens

`@async`

```python
def count_tokens(text: str) -> int
```

Count the number of tokens in the given text.

###### Returns

[`int`](https://docs.python.org/3/library/functions.html#int) -- The number of tokens.

###### Parameters

**`text`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The text to tokenize and count.

###### Raises

-   `NotImplementedError` -- If the model doesn't support token counting.
-   `UserError` -- If the model or tokenizer is not supported.

### WrapperEmbeddingModel

**Bases:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel)

Base class for embedding models that wrap another model.

Use this as a base class to create custom embedding model wrappers that modify behavior (e.g., caching, logging, rate limiting) while delegating to an underlying model.

By default, all methods are passed through to the wrapped model. Override specific methods to customize behavior.

#### Attributes

##### wrapped

The underlying embedding model being wrapped.

**Type:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) **Default:** `infer_embedding_model(wrapped) if isinstance(wrapped, str) else wrapped`

##### settings

Get the settings from the wrapped embedding model.

**Type:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None)

#### Methods

##### \_\_init\_\_

```python
def __init__(wrapped: EmbeddingModel | str)
```

Initialize the wrapper with an embedding model.

###### Parameters

**`wrapped`** : [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) | [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The model to wrap. Can be an [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) instance or a model name string (e.g., `'openai:text-embedding-3-small'`).

### EmbeddingResult

The result of an embedding operation.

This class contains the generated embeddings along with metadata about the operation, including the original inputs, model information, usage statistics, and timing.

Example:

```python
from pydantic_ai import Embedder

embedder = Embedder('openai:text-embedding-3-small')


async def main():
    result = await embedder.embed_query('What is AI?')

    # Access embeddings by index
    print(len(result.embeddings[0]))
    #> 1536

    # Access embeddings by original input text
    print(result['What is AI?'] == result.embeddings[0])
    #> True

    # Check usage
    print(f'Tokens used: {result.usage.input_tokens}')
    #> Tokens used: 3
```

#### Attributes

##### embeddings

The computed embedding vectors, one per input text.

Each embedding is a sequence of floats representing the text in vector space.

**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`float`](https://docs.python.org/3/library/functions.html#float)\]\]

##### inputs

The original input texts that were embedded.

**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### input\_type

Whether the inputs were embedded as queries or documents.

**Type:** `EmbedInputType`

##### model\_name

The name of the model that generated these embeddings.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_name

The name of the provider (e.g., 'openai', 'cohere').

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### timestamp

When the embedding request was made.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) **Default:** `field(default_factory=_now_utc)`

##### usage

Token usage statistics for this request.

**Type:** [`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage) **Default:** `field(default_factory=RequestUsage)`

##### provider\_details

Provider-specific details from the response.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### provider\_response\_id

Unique identifier for this response from the provider, if available.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

#### Methods

##### \_\_getitem\_\_

```python
def __getitem__(item: int | str) -> Sequence[float]
```

Get the embedding for an input by index or by the original input text.

###### Returns

[`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`float`](https://docs.python.org/3/library/functions.html#float)\] -- The embedding vector for the specified input.

###### Parameters

**`item`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`str`](https://docs.python.org/3/library/stdtypes.html#str)

Either an integer index or the original input string.

###### Raises

-   `IndexError` -- If the index is out of range.
-   `ValueError` -- If the string is not found in the inputs.

##### cost

```python
def cost() -> genai_types.PriceCalculation
```

Calculate the cost of the embedding request.

Uses [`genai-prices`](https://github.com/pydantic/genai-prices) for pricing data.

###### Returns

`genai_types.PriceCalculation` -- A price calculation object with `total_price`, `input_price`, and other cost details.

###### Raises

-   `LookupError` -- If pricing data is not available for this model/provider.

### TestEmbeddingModel

**Bases:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel)

A mock embedding model for testing.

This model returns deterministic embeddings (all 1.0 values) and tracks the settings used in the last call via the `last_settings` attribute.

Example:

```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings import TestEmbeddingModel

test_model = TestEmbeddingModel()
embedder = Embedder('openai:text-embedding-3-small')


async def main():
    with embedder.override(model=test_model):
        await embedder.embed_query('test')
        assert test_model.last_settings is not None
```

#### Attributes

##### last\_settings

The settings used in the most recent embed call.

**Type:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### model\_name

The embedding model name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### system

The embedding model provider.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model_name: str = 'test',
    provider_name: str = 'test',
    dimensions: int = 8,
    settings: EmbeddingSettings | None = None,
)
```

Initialize the test embedding model.

###### Parameters

**`model_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `'test'`

The model name to report in results.

**`provider_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `'test'`

The provider name to report in results.

**`dimensions`** : [`int`](https://docs.python.org/3/library/functions.html#int) _Default:_ `8`

The number of dimensions for the generated embeddings.

**`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional default settings for the model.

### InstrumentedEmbeddingModel

**Bases:** `WrapperEmbeddingModel`

Embedding model which wraps another model so that requests are instrumented with OpenTelemetry.

See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info.

#### Attributes

##### instrumentation\_settings

Instrumentation settings for this model.

**Type:** [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) **Default:** `options or InstrumentationSettings()`

### Embedder

High-level interface for generating text embeddings.

The `Embedder` class provides a convenient way to generate vector embeddings from text using various embedding model providers. It handles model inference, settings management, and optional OpenTelemetry instrumentation.

Example:

```python
from pydantic_ai import Embedder

embedder = Embedder('openai:text-embedding-3-small')


async def main():
    result = await embedder.embed_query('What is machine learning?')
    print(result.embeddings[0][:5])  # First 5 dimensions
    #> [1.0, 1.0, 1.0, 1.0, 1.0]
```

#### Attributes

##### instrument

Options to automatically instrument with OpenTelemetry.

Set to `True` to use default instrumentation settings, which will use Logfire if it's configured. Set to an instance of [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) to customize. If this isn't set, then the last value set by [`Embedder.instrument_all()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.instrument_all) will be used, which defaults to False. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info.

**Type:** [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) | [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `instrument`

##### model

The embedding model used by this embedder.

**Type:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) | `KnownEmbeddingModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model: EmbeddingModel | KnownEmbeddingModelName | str,
    settings: EmbeddingSettings | None = None,
    defer_model_check: bool = True,
    instrument: InstrumentationSettings | bool | None = None,
) -> None
```

Initialize an Embedder.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`model`** : [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) | `KnownEmbeddingModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The embedding model to use. Can be specified as:

-   A model name string in the format `'provider:model-name'` (e.g., `'openai:text-embedding-3-small'`)
-   An [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) instance

**`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) to use as defaults for all embed calls.

**`defer_model_check`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to defer model validation until first use. Set to `False` to validate the model immediately on construction.

**`instrument`** : [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) | [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

OpenTelemetry instrumentation settings. Set to `True` to enable with defaults, or pass an [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) instance to customize. If `None`, uses the value from [`Embedder.instrument_all()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.instrument_all).

##### instrument\_all

`@staticmethod`

```python
def instrument_all(instrument: InstrumentationSettings | bool = True) -> None
```

Set the default instrumentation options for all embedders where `instrument` is not explicitly set.

This is useful for enabling instrumentation globally without modifying each embedder individually.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`instrument`** : [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) | [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Instrumentation settings to use as the default. Set to `True` for default settings, `False` to disable, or pass an [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) instance to customize.

##### override

```python
def override(
    model: EmbeddingModel | KnownEmbeddingModelName | str | _utils.Unset = _utils.UNSET,
) -> Iterator[None]
```

Context manager to temporarily override the embedding model.

Useful for testing or dynamically switching models.

Example:

```python
from pydantic_ai import Embedder

embedder = Embedder('openai:text-embedding-3-small')


async def main():
    # Temporarily use a different model
    with embedder.override(model='openai:text-embedding-3-large'):
        result = await embedder.embed_query('test')
        print(len(result.embeddings[0]))  # 3072 dimensions for large model
        #> 3072
```

###### Returns

[`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]

###### Parameters

**`model`** : [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) | `KnownEmbeddingModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `_utils.Unset` _Default:_ `_utils.UNSET`

The embedding model to use within this context.

##### embed\_query

`@async`

```python
def embed_query(
    query: str | Sequence[str],
    settings: EmbeddingSettings | None = None,
) -> EmbeddingResult
```

Embed one or more query texts.

Use this method when embedding search queries that will be compared against document embeddings. Some models optimize embeddings differently based on whether the input is a query or document.

###### Returns

[`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) -- An [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) containing the embeddings [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) -- and metadata about the operation.

###### Parameters

**`query`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

A single query string or sequence of query strings to embed.

**`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to override the embedder's default settings for this call.

##### embed\_documents

`@async`

```python
def embed_documents(
    documents: str | Sequence[str],
    settings: EmbeddingSettings | None = None,
) -> EmbeddingResult
```

Embed one or more document texts.

Use this method when embedding documents that will be stored and later searched against. Some models optimize embeddings differently based on whether the input is a query or document.

###### Returns

[`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) -- An [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) containing the embeddings [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) -- and metadata about the operation.

###### Parameters

**`documents`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

A single document string or sequence of document strings to embed.

**`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to override the embedder's default settings for this call.

##### embed

`@async`

```python
def embed(
    inputs: str | Sequence[str],
    input_type: EmbedInputType,
    settings: EmbeddingSettings | None = None,
) -> EmbeddingResult
```

Embed text inputs with explicit input type specification.

This is the low-level embedding method. For most use cases, prefer [`embed_query()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.embed_query) or [`embed_documents()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.embed_documents).

###### Returns

[`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) -- An [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) containing the embeddings [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) -- and metadata about the operation.

###### Parameters

**`inputs`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

A single string or sequence of strings to embed.

**`input_type`** : `EmbedInputType`

The type of input, either `'query'` or `'document'`.

**`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to override the embedder's default settings for this call.

##### max\_input\_tokens

`@async`

```python
def max_input_tokens() -> int | None
```

Get the maximum number of tokens the model can accept as input.

###### Returns

[`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) -- The maximum token count, or `None` if the limit is unknown for this model.

##### count\_tokens

`@async`

```python
def count_tokens(text: str) -> int
```

Count the number of tokens in the given text.

###### Returns

[`int`](https://docs.python.org/3/library/functions.html#int) -- The number of tokens in the text.

###### Parameters

**`text`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The text to tokenize and count.

###### Raises

-   `NotImplementedError` -- If the model doesn't support token counting.
-   `UserError` -- If the model or tokenizer is not supported.

##### embed\_query\_sync

```python
def embed_query_sync(
    query: str | Sequence[str],
    settings: EmbeddingSettings | None = None,
) -> EmbeddingResult
```

Synchronous version of [`embed_query()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.embed_query).

###### Returns

[`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult)

##### embed\_documents\_sync

```python
def embed_documents_sync(
    documents: str | Sequence[str],
    settings: EmbeddingSettings | None = None,
) -> EmbeddingResult
```

Synchronous version of [`embed_documents()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.embed_documents).

###### Returns

[`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult)

##### embed\_sync

```python
def embed_sync(
    inputs: str | Sequence[str],
    input_type: EmbedInputType,
    settings: EmbeddingSettings | None = None,
) -> EmbeddingResult
```

Synchronous version of [`embed()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.embed).

###### Returns

[`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult)

##### max\_input\_tokens\_sync

```python
def max_input_tokens_sync() -> int | None
```

Synchronous version of [`max_input_tokens()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.max_input_tokens).

###### Returns

[`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### count\_tokens\_sync

```python
def count_tokens_sync(text: str) -> int
```

Synchronous version of [`count_tokens()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.count_tokens).

###### Returns

[`int`](https://docs.python.org/3/library/functions.html#int)

### instrument\_embedding\_model

```python
def instrument_embedding_model(
    model: EmbeddingModel,
    instrument: InstrumentationSettings | bool,
) -> EmbeddingModel
```

Instrument an embedding model with OpenTelemetry/logfire.

#### Returns

[`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel)

### merge\_embedding\_settings

```python
def merge_embedding_settings(
    base: EmbeddingSettings | None,
    overrides: EmbeddingSettings | None,
) -> EmbeddingSettings | None
```

Merge two sets of embedding settings, with overrides taking precedence.

#### Returns

[`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) -- Merged settings, or `None` if both inputs are `None`.

#### Parameters

**`base`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None)

Base settings (typically from the embedder or model).

**`overrides`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None)

Settings that should override the base (typically per-call settings).

### infer\_embedding\_model

```python
def infer_embedding_model(
    model: EmbeddingModel | KnownEmbeddingModelName | str,
    provider_factory: Callable[[str], Provider[Any]] = infer_provider,
) -> EmbeddingModel
```

Infer the model from the name.

#### Returns

[`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel)

### KnownEmbeddingModelName

Known model names that can be used with the `model` parameter of [`Embedder`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder).

`KnownEmbeddingModelName` is provided as a concise way to specify an embedding model.

**Default:** `TypeAliasType('KnownEmbeddingModelName', Literal['google-gla:gemini-embedding-001', 'google-gla:gemini-embedding-2-preview', 'google-vertex:gemini-embedding-001', 'google-vertex:gemini-embedding-2-preview', 'google-vertex:text-embedding-005', 'google-vertex:text-multilingual-embedding-002', 'openai:text-embedding-ada-002', 'openai:text-embedding-3-small', 'openai:text-embedding-3-large', 'cohere:embed-v4.0', 'cohere:embed-english-v3.0', 'cohere:embed-english-light-v3.0', 'cohere:embed-multilingual-v3.0', 'cohere:embed-multilingual-light-v3.0', 'voyageai:voyage-4-large', 'voyageai:voyage-4', 'voyageai:voyage-4-lite', 'voyageai:voyage-3-large', 'voyageai:voyage-3.5', 'voyageai:voyage-3.5-lite', 'voyageai:voyage-code-3', 'voyageai:voyage-finance-2', 'voyageai:voyage-law-2', 'voyageai:voyage-code-2', 'bedrock:amazon.titan-embed-text-v1', 'bedrock:amazon.titan-embed-text-v2:0', 'bedrock:cohere.embed-english-v3', 'bedrock:cohere.embed-multilingual-v3', 'bedrock:cohere.embed-v4:0', 'bedrock:amazon.nova-2-multimodal-embeddings-v1:0'])`

### EmbeddingModel

**Bases:** `ABC`

Abstract base class for embedding models.

Implement this class to create a custom embedding model. For most use cases, use one of the built-in implementations:

-   [`OpenAIEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.openai.OpenAIEmbeddingModel)
-   [`CohereEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.cohere.CohereEmbeddingModel)
-   [`GoogleEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.google.GoogleEmbeddingModel)
-   [`BedrockEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.bedrock.BedrockEmbeddingModel)
-   [`SentenceTransformerEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.sentence_transformers.SentenceTransformerEmbeddingModel)

#### Attributes

##### settings

Get the default settings for this model.

**Type:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### base\_url

The base URL for the provider API, if available.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### model\_name

The name of the embedding model.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### system

The embedding model provider/system identifier (e.g., 'openai', 'cohere').

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(settings: EmbeddingSettings | None = None) -> None
```

Initialize the model with optional settings.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Model-specific settings that will be used as defaults for this model.

##### embed

`@abstractmethod`

`@async`

```python
def embed(
    inputs: str | Sequence[str],
    input_type: EmbedInputType,
    settings: EmbeddingSettings | None = None,
) -> EmbeddingResult
```

Generate embeddings for the given inputs.

###### Returns

[`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) -- An [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) containing [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) -- the embeddings and metadata.

###### Parameters

**`inputs`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

A single string or sequence of strings to embed.

**`input_type`** : `EmbedInputType`

Whether the inputs are queries or documents.

**`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to override the model's defaults.

##### prepare\_embed

```python
def prepare_embed(
    inputs: str | Sequence[str],
    settings: EmbeddingSettings | None = None,
) -> tuple[list[str], EmbeddingSettings]
```

Prepare the inputs and settings for embedding.

This method normalizes inputs to a list and merges settings. Subclasses should call this at the start of their `embed()` implementation.

###### Returns

[`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\], [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings)\] -- A tuple of (normalized inputs list, merged settings).

###### Parameters

**`inputs`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

A single string or sequence of strings.

**`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to merge with defaults.

##### max\_input\_tokens

`@async`

```python
def max_input_tokens() -> int | None
```

Get the maximum number of tokens that can be input to the model.

###### Returns

[`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) -- The maximum token count, or `None` if unknown.

##### count\_tokens

`@async`

```python
def count_tokens(text: str) -> int
```

Count the number of tokens in the given text.

###### Returns

[`int`](https://docs.python.org/3/library/functions.html#int) -- The number of tokens.

###### Parameters

**`text`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The text to tokenize and count.

###### Raises

-   `NotImplementedError` -- If the model doesn't support token counting.
-   `UserError` -- If the model or tokenizer is not supported.

### EmbeddingResult

The result of an embedding operation.

This class contains the generated embeddings along with metadata about the operation, including the original inputs, model information, usage statistics, and timing.

Example:

```python
from pydantic_ai import Embedder

embedder = Embedder('openai:text-embedding-3-small')


async def main():
    result = await embedder.embed_query('What is AI?')

    # Access embeddings by index
    print(len(result.embeddings[0]))
    #> 1536

    # Access embeddings by original input text
    print(result['What is AI?'] == result.embeddings[0])
    #> True

    # Check usage
    print(f'Tokens used: {result.usage.input_tokens}')
    #> Tokens used: 3
```

#### Attributes

##### embeddings

The computed embedding vectors, one per input text.

Each embedding is a sequence of floats representing the text in vector space.

**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`float`](https://docs.python.org/3/library/functions.html#float)\]\]

##### inputs

The original input texts that were embedded.

**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### input\_type

Whether the inputs were embedded as queries or documents.

**Type:** `EmbedInputType`

##### model\_name

The name of the model that generated these embeddings.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_name

The name of the provider (e.g., 'openai', 'cohere').

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### timestamp

When the embedding request was made.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) **Default:** `field(default_factory=_now_utc)`

##### usage

Token usage statistics for this request.

**Type:** [`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage) **Default:** `field(default_factory=RequestUsage)`

##### provider\_details

Provider-specific details from the response.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### provider\_response\_id

Unique identifier for this response from the provider, if available.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

#### Methods

##### \_\_getitem\_\_

```python
def __getitem__(item: int | str) -> Sequence[float]
```

Get the embedding for an input by index or by the original input text.

###### Returns

[`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`float`](https://docs.python.org/3/library/functions.html#float)\] -- The embedding vector for the specified input.

###### Parameters

**`item`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`str`](https://docs.python.org/3/library/stdtypes.html#str)

Either an integer index or the original input string.

###### Raises

-   `IndexError` -- If the index is out of range.
-   `ValueError` -- If the string is not found in the inputs.

##### cost

```python
def cost() -> genai_types.PriceCalculation
```

Calculate the cost of the embedding request.

Uses [`genai-prices`](https://github.com/pydantic/genai-prices) for pricing data.

###### Returns

`genai_types.PriceCalculation` -- A price calculation object with `total_price`, `input_price`, and other cost details.

###### Raises

-   `LookupError` -- If pricing data is not available for this model/provider.

### EmbedInputType

The type of input to the embedding model.

-   `'query'`: Text that will be used as a search query
-   `'document'`: Text that will be stored and searched against

Some embedding models optimize differently for queries vs documents.

**Default:** `Literal['query', 'document']`

### EmbeddingSettings

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Common settings for configuring embedding models.

These settings apply across multiple embedding model providers. Not all settings are supported by all models - check the specific model's documentation for details.

Provider-specific settings classes (e.g., [`OpenAIEmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.openai.OpenAIEmbeddingSettings), [`CohereEmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.cohere.CohereEmbeddingSettings)) extend this with additional provider-prefixed options.

#### Attributes

##### dimensions

The number of dimensions for the output embeddings.

Supported by:

-   OpenAI
-   Cohere
-   Google
-   Sentence Transformers
-   Bedrock
-   VoyageAI

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

##### truncate

Whether to truncate inputs that exceed the model's context length.

Defaults to `False`. If `True`, inputs that are too long will be truncated. If `False`, an error will be raised for inputs that exceed the context length.

For more control over truncation, you can use [`max_input_tokens()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.max_input_tokens) and [`count_tokens()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.count_tokens) to implement your own truncation logic.

Provider-specific truncation settings (e.g., `cohere_truncate`, `bedrock_cohere_truncate`) take precedence if specified.

Supported by:

-   Cohere
-   Bedrock (Cohere and Nova models)
-   VoyageAI

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### extra\_headers

Extra headers to send to the model.

Supported by:

-   OpenAI
-   Cohere

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### extra\_body

Extra body to send to the model.

Supported by:

-   OpenAI
-   Cohere

**Type:** [`object`](https://docs.python.org/3/glossary.html#term-object)

### merge\_embedding\_settings

```python
def merge_embedding_settings(
    base: EmbeddingSettings | None,
    overrides: EmbeddingSettings | None,
) -> EmbeddingSettings | None
```

Merge two sets of embedding settings, with overrides taking precedence.

#### Returns

[`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) -- Merged settings, or `None` if both inputs are `None`.

#### Parameters

**`base`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None)

Base settings (typically from the embedder or model).

**`overrides`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None)

Settings that should override the base (typically per-call settings).

### OpenAIEmbeddingSettings

**Bases:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings)

Settings used for an OpenAI embedding model request.

All fields from [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) are supported.

### OpenAIEmbeddingModel

**Bases:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel)

OpenAI embedding model implementation.

This model works with OpenAI's embeddings API and any [OpenAI-compatible providers](/docs/ai/models/openai#openai-compatible-models).

Example:

```python
from pydantic_ai.embeddings.openai import OpenAIEmbeddingModel
from pydantic_ai.providers.openai import OpenAIProvider

# Using OpenAI directly
model = OpenAIEmbeddingModel('text-embedding-3-small')

# Using an OpenAI-compatible provider
model = OpenAIEmbeddingModel(
    'text-embedding-3-small',
    provider=OpenAIProvider(base_url='https://my-provider.com/v1'),
)
```

#### Attributes

##### model\_name

The embedding model name.

**Type:** `OpenAIEmbeddingModelName`

##### system

The embedding model provider.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model_name: OpenAIEmbeddingModelName,
    provider: OpenAIEmbeddingsCompatibleProvider | Literal['openai'] | Provider[AsyncOpenAI] = 'openai',
    settings: EmbeddingSettings | None = None,
)
```

Initialize an OpenAI embedding model.

###### Parameters

**`model_name`** : `OpenAIEmbeddingModelName`

The name of the OpenAI model to use. See [OpenAI's embedding models](https://platform.openai.com/docs/guides/embeddings) for available options.

**`provider`** : `OpenAIEmbeddingsCompatibleProvider` | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['openai'\] | `Provider`\[`AsyncOpenAI`\] _Default:_ `'openai'`

The provider to use for authentication and API access. Can be:

-   `'openai'` (default): Uses the standard OpenAI API
-   A provider name string (e.g., `'azure'`, `'deepseek'`)
-   A [`Provider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.Provider) instance for custom configuration

See [OpenAI-compatible providers](/docs/ai/models/openai#openai-compatible-models) for a list of supported providers.

**`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Model-specific [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) to use as defaults for this model.

### OpenAIEmbeddingModelName

Possible OpenAI embeddings model names.

See the [OpenAI embeddings documentation](https://platform.openai.com/docs/guides/embeddings) for available models.

**Default:** `str | LatestOpenAIEmbeddingModelNames`

### CohereEmbeddingSettings

**Bases:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings)

Settings used for a Cohere embedding model request.

All fields from [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) are supported, plus Cohere-specific settings prefixed with `cohere_`.

#### Attributes

##### cohere\_max\_tokens

The maximum number of tokens to embed.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

##### cohere\_input\_type

The Cohere-specific input type for the embedding.

Overrides the standard `input_type` argument. Options include: `'search_query'`, `'search_document'`, `'classification'`, `'clustering'`, and `'image'`.

**Type:** `CohereEmbedInputType`

##### cohere\_truncate

The truncation strategy to use:

-   `'NONE'` (default): Raise an error if input exceeds max tokens.
-   `'END'`: Truncate the end of the input text.
-   `'START'`: Truncate the start of the input text.

Note: This setting overrides the standard `truncate` boolean setting when specified.

**Type:** `V2EmbedRequestTruncate`

### CohereEmbeddingModel

**Bases:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel)

Cohere embedding model implementation.

This model works with Cohere's embeddings API, which offers multilingual support and various model sizes.

Example:

```python
from pydantic_ai.embeddings.cohere import CohereEmbeddingModel

model = CohereEmbeddingModel('embed-v4.0')
```

#### Attributes

##### base\_url

The base URL for the provider API, if available.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### model\_name

The embedding model name.

**Type:** `CohereEmbeddingModelName`

##### system

The embedding model provider.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model_name: CohereEmbeddingModelName,
    provider: Literal['cohere'] | Provider[AsyncClientV2] = 'cohere',
    settings: EmbeddingSettings | None = None,
)
```

Initialize a Cohere embedding model.

###### Parameters

**`model_name`** : `CohereEmbeddingModelName`

The name of the Cohere model to use. See [Cohere Embed documentation](https://docs.cohere.com/docs/cohere-embed) for available models.

**`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['cohere'\] | `Provider`\[`AsyncClientV2`\] _Default:_ `'cohere'`

The provider to use for authentication and API access. Can be:

-   `'cohere'` (default): Uses the standard Cohere API
-   A [`CohereProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.cohere.CohereProvider) instance for custom configuration

**`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Model-specific [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) to use as defaults for this model.

### LatestCohereEmbeddingModelNames

Latest Cohere embeddings models.

See the [Cohere Embed documentation](https://docs.cohere.com/docs/cohere-embed) for available models and their capabilities.

**Default:** `Literal['embed-v4.0', 'embed-english-v3.0', 'embed-english-light-v3.0', 'embed-multilingual-v3.0', 'embed-multilingual-light-v3.0']`

### CohereEmbeddingModelName

Possible Cohere embeddings model names.

**Default:** `str | LatestCohereEmbeddingModelNames`

### GoogleEmbeddingSettings

**Bases:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings)

Settings used for a Google embedding model request.

All fields from [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) are supported, plus Google-specific settings prefixed with `google_`.

#### Attributes

##### google\_task\_type

The task type for the embedding.

Overrides the automatic task type selection based on `input_type`. See [Google's task type documentation](https://ai.google.dev/gemini-api/docs/embeddings#task-types) for available options.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### google\_title

Optional title for the content being embedded.

Only applicable when task\_type is `RETRIEVAL_DOCUMENT`.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

### GoogleEmbeddingModel

**Bases:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel)

Google embedding model implementation.

This model works with Google's embeddings API via the `google-genai` SDK, supporting both the Gemini API (Google AI Studio) and Vertex AI.

Example:

```python
from pydantic_ai.embeddings.google import GoogleEmbeddingModel
from pydantic_ai.providers.google import GoogleProvider

# Using Gemini API (requires GOOGLE_API_KEY env var)
model = GoogleEmbeddingModel('gemini-embedding-001')

# Using Vertex AI
model = GoogleEmbeddingModel(
    'gemini-embedding-001',
    provider=GoogleProvider(vertexai=True, project='my-project', location='us-central1'),
)
```

#### Attributes

##### model\_name

The embedding model name.

**Type:** `GoogleEmbeddingModelName`

##### system

The embedding model provider.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model_name: GoogleEmbeddingModelName,
    provider: Literal['google-gla', 'google-vertex'] | Provider[Client] = 'google-gla',
    settings: EmbeddingSettings | None = None,
)
```

Initialize a Google embedding model.

###### Parameters

**`model_name`** : `GoogleEmbeddingModelName`

The name of the Google model to use. See [Google Embeddings documentation](https://ai.google.dev/gemini-api/docs/embeddings) for available models.

**`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['google-gla', 'google-vertex'\] | `Provider`\[`Client`\] _Default:_ `'google-gla'`

The provider to use for authentication and API access. Can be:

-   `'google-gla'` (default): Uses the Gemini API (Google AI Studio)
-   `'google-vertex'`: Uses Vertex AI
-   A [`GoogleProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.google.GoogleProvider) instance for custom configuration

**`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Model-specific [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) to use as defaults for this model.

### LatestGoogleGLAEmbeddingModelNames

Latest Google Gemini API (GLA) embedding models.

See the [Google Embeddings documentation](https://ai.google.dev/gemini-api/docs/embeddings) for available models and their capabilities.

**Default:** `Literal['gemini-embedding-001', 'gemini-embedding-2-preview']`

### LatestGoogleVertexEmbeddingModelNames

Latest Google Vertex AI embedding models.

See the [Vertex AI Embeddings documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings) for available models and their capabilities.

**Default:** `Literal['gemini-embedding-001', 'gemini-embedding-2-preview', 'text-embedding-005', 'text-multilingual-embedding-002']`

### LatestGoogleEmbeddingModelNames

All latest Google embedding models (union of GLA and Vertex AI models).

**Default:** `LatestGoogleGLAEmbeddingModelNames | LatestGoogleVertexEmbeddingModelNames`

### GoogleEmbeddingModelName

Possible Google embeddings model names.

**Default:** `str | LatestGoogleEmbeddingModelNames`

### BedrockEmbeddingSettings

**Bases:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings)

Settings used for a Bedrock embedding model request.

All fields from [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) are supported, plus Bedrock-specific settings prefixed with `bedrock_`.

All settings are optional - if not specified, model defaults are used.

**Note on `dimensions` parameter support:**

-   **Titan v1** (`amazon.titan-embed-text-v1`): Not supported (fixed: 1536)
-   **Titan v2** (`amazon.titan-embed-text-v2:0`): Supported (default: 1024, accepts 256/384/1024)
-   **Cohere v3** (`cohere.embed-english-v3`, `cohere.embed-multilingual-v3`): Not supported (fixed: 1024)
-   **Cohere v4** (`cohere.embed-v4:0`): Supported (default: 1536, accepts 256/512/1024/1536)
-   **Nova** (`amazon.nova-2-multimodal-embeddings-v1:0`): Supported (default: 3072, accepts 256/384/1024/3072)

Unsupported settings are silently ignored.

**Note on `truncate` parameter support:**

-   **Titan models** (`amazon.titan-embed-text-v1`, `amazon.titan-embed-text-v2:0`): Not supported
-   **Cohere models** (all versions): Supported (default: `False`, maps to `'END'` when `True`)
-   **Nova** (`amazon.nova-2-multimodal-embeddings-v1:0`): Supported (default: `False`, maps to `'END'` when `True`)

For fine-grained truncation control, use model-specific settings: `bedrock_cohere_truncate` or `bedrock_nova_truncate`.

#### Attributes

##### bedrock\_titan\_normalize

Whether to normalize embedding vectors for Titan models.

**Supported by:** `amazon.titan-embed-text-v2:0` (default: `True`)

**Not supported by:** `amazon.titan-embed-text-v1` (silently ignored)

When enabled, vectors are normalized for direct cosine similarity calculations.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### bedrock\_cohere\_max\_tokens

The maximum number of tokens to embed for Cohere models.

**Supported by:** `cohere.embed-v4:0` (default: 128000)

**Not supported by:** `cohere.embed-english-v3`, `cohere.embed-multilingual-v3` (silently ignored)

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

##### bedrock\_cohere\_input\_type

The input type for Cohere models.

**Supported by:** All Cohere models (`cohere.embed-english-v3`, `cohere.embed-multilingual-v3`, `cohere.embed-v4:0`)

By default, `embed_query()` uses `'search_query'` and `embed_documents()` uses `'search_document'`. Also accepts `'classification'` or `'clustering'`.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['search\_document', 'search\_query', 'classification', 'clustering'\]

##### bedrock\_cohere\_truncate

The truncation strategy for Cohere models. Overrides base `truncate` setting.

**Supported by:** All Cohere models (`cohere.embed-english-v3`, `cohere.embed-multilingual-v3`, `cohere.embed-v4:0`)

Default: `'NONE'`

-   `'NONE'`: Raise an error if input exceeds max tokens.
-   `'START'`: Truncate the start of the input.
-   `'END'`: Truncate the end of the input.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['NONE', 'START', 'END'\]

##### bedrock\_nova\_truncate

The truncation strategy for Nova models. Overrides base `truncate` setting.

**Supported by:** `amazon.nova-2-multimodal-embeddings-v1:0`

Default: `'NONE'`

-   `'NONE'`: Raise an error if input exceeds max tokens.
-   `'START'`: Truncate the start of the input.
-   `'END'`: Truncate the end of the input.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['NONE', 'START', 'END'\]

##### bedrock\_nova\_embedding\_purpose

The embedding purpose for Nova models.

**Supported by:** `amazon.nova-2-multimodal-embeddings-v1:0`

By default, `embed_query()` uses `'GENERIC_RETRIEVAL'` and `embed_documents()` uses `'GENERIC_INDEX'`. Also accepts `'TEXT_RETRIEVAL'`, `'CLASSIFICATION'`, or `'CLUSTERING'`.

Note: Multimodal-specific purposes (`'IMAGE_RETRIEVAL'`, `'VIDEO_RETRIEVAL'`, `'DOCUMENT_RETRIEVAL'`, `'AUDIO_RETRIEVAL'`) are not supported as this embedding client only accepts text input.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['GENERIC\_INDEX', 'GENERIC\_RETRIEVAL', 'TEXT\_RETRIEVAL', 'CLASSIFICATION', 'CLUSTERING'\]

##### bedrock\_inference\_profile

An [inference profile](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles.html) ARN to use as the `modelId` in API requests.

When set, this value is used as the `modelId` in `invoke_model` API calls instead of the base `model_name`. This allows you to pass the base model name (e.g. `'amazon.titan-embed-text-v2:0'`) as `model_name` for detecting model capabilities, while routing requests through an inference profile for cost tracking or cross-region inference.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### bedrock\_max\_concurrency

Maximum number of concurrent requests for models that don't support batch embedding.

**Applies to:** `amazon.titan-embed-text-v1`, `amazon.titan-embed-text-v2:0`, `amazon.nova-2-multimodal-embeddings-v1:0`

When embedding multiple texts with models that only support single-text requests, this controls how many requests run in parallel. Defaults to 5.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

### BedrockEmbeddingModel

**Bases:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel)

Bedrock embedding model implementation.

This model works with AWS Bedrock's embedding models including Amazon Titan Embeddings and Cohere Embed models.

Example:

```python
from pydantic_ai.embeddings.bedrock import BedrockEmbeddingModel
from pydantic_ai.providers.bedrock import BedrockProvider

# Using default AWS credentials
model = BedrockEmbeddingModel('amazon.titan-embed-text-v2:0')

# Using explicit credentials
model = BedrockEmbeddingModel(
    'cohere.embed-english-v3',
    provider=BedrockProvider(
        region_name='us-east-1',
        aws_access_key_id='...',
        aws_secret_access_key='...',
    ),
)
```

#### Attributes

##### base\_url

The base URL for the provider API.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### model\_name

The embedding model name.

**Type:** `BedrockEmbeddingModelName`

##### system

The embedding model provider.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model_name: BedrockEmbeddingModelName,
    provider: Literal['bedrock'] | Provider[BaseClient] = 'bedrock',
    settings: EmbeddingSettings | None = None,
)
```

Initialize a Bedrock embedding model.

###### Parameters

**`model_name`** : `BedrockEmbeddingModelName`

The name of the Bedrock embedding model to use. See [Bedrock embedding models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html) for available options.

**`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['bedrock'\] | `Provider`\[`BaseClient`\] _Default:_ `'bedrock'`

The provider to use for authentication and API access. Can be:

-   `'bedrock'` (default): Uses default AWS credentials
-   A [`BedrockProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.bedrock.BedrockProvider) instance for custom configuration

**`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Model-specific [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) to use as defaults for this model.

##### max\_input\_tokens

`@async`

```python
def max_input_tokens() -> int | None
```

Get the maximum number of tokens that can be input to the model.

###### Returns

[`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None)

### LatestBedrockEmbeddingModelNames

Latest Bedrock embedding model names.

See [the Bedrock docs](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html) for available embedding models.

**Default:** `Literal['amazon.titan-embed-text-v1', 'amazon.titan-embed-text-v2:0', 'cohere.embed-english-v3', 'cohere.embed-multilingual-v3', 'cohere.embed-v4:0', 'amazon.nova-2-multimodal-embeddings-v1:0']`

### BedrockEmbeddingModelName

Possible Bedrock embedding model names.

**Default:** `str | LatestBedrockEmbeddingModelNames`

### VoyageAIEmbeddingSettings

**Bases:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings)

Settings used for a VoyageAI embedding model request.

All fields from [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) are supported, plus VoyageAI-specific settings prefixed with `voyageai_`.

#### Attributes

##### voyageai\_input\_type

The VoyageAI-specific input type for the embedding.

Overrides the standard `input_type` argument. Options include: `'query'`, `'document'`, or `'none'` for direct embedding without prefix.

**Type:** `VoyageAIEmbedInputType`

### VoyageAIEmbeddingModel

**Bases:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel)

VoyageAI embedding model implementation.

VoyageAI provides state-of-the-art embedding models optimized for retrieval, with specialized models for code, finance, and legal domains.

Example:

```python
from pydantic_ai.embeddings.voyageai import VoyageAIEmbeddingModel

model = VoyageAIEmbeddingModel('voyage-3.5')
```

#### Attributes

##### base\_url

The base URL for the provider API.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### model\_name

The embedding model name.

**Type:** `VoyageAIEmbeddingModelName`

##### system

The embedding model provider.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model_name: VoyageAIEmbeddingModelName,
    provider: Literal['voyageai'] | Provider[AsyncClient] = 'voyageai',
    settings: EmbeddingSettings | None = None,
)
```

Initialize a VoyageAI embedding model.

###### Parameters

**`model_name`** : `VoyageAIEmbeddingModelName`

The name of the VoyageAI model to use. See [VoyageAI models](https://docs.voyageai.com/docs/embeddings) for available options.

**`provider`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['voyageai'\] | `Provider`\[`AsyncClient`\] _Default:_ `'voyageai'`

The provider to use for authentication and API access. Can be:

-   `'voyageai'` (default): Uses the standard VoyageAI API
-   A [`VoyageAIProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.voyageai.VoyageAIProvider) instance for custom configuration

**`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Model-specific [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) to use as defaults for this model.

### LatestVoyageAIEmbeddingModelNames

Latest VoyageAI embedding models.

See [VoyageAI Embeddings](https://docs.voyageai.com/docs/embeddings) for available models and their capabilities.

**Default:** `Literal['voyage-4-large', 'voyage-4', 'voyage-4-lite', 'voyage-3-large', 'voyage-3.5', 'voyage-3.5-lite', 'voyage-code-3', 'voyage-finance-2', 'voyage-law-2', 'voyage-code-2']`

### VoyageAIEmbeddingModelName

Possible VoyageAI embedding model names.

**Default:** `str | LatestVoyageAIEmbeddingModelNames`

### VoyageAIEmbedInputType

VoyageAI embedding input types.

-   `'query'`: For search queries; prepends retrieval-optimized prefix.
-   `'document'`: For documents; prepends document retrieval prefix.
-   `'none'`: Direct embedding without any prefix.

**Default:** `Literal['query', 'document', 'none']`

### SentenceTransformersEmbeddingSettings

**Bases:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings)

Settings used for a Sentence-Transformers embedding model request.

All fields from [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) are supported, plus Sentence-Transformers-specific settings prefixed with `sentence_transformers_`.

#### Attributes

##### sentence\_transformers\_device

Device to run inference on.

Examples: `'cpu'`, `'cuda'`, `'cuda:0'`, `'mps'` (Apple Silicon).

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### sentence\_transformers\_normalize\_embeddings

Whether to L2-normalize embeddings.

When `True`, all embeddings will have unit length, which is useful for cosine similarity calculations.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### sentence\_transformers\_batch\_size

Batch size to use during encoding.

Larger batches may be faster but require more memory.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

### SentenceTransformerEmbeddingModel

**Bases:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel)

Local embedding model using the `sentence-transformers` library.

This model runs embeddings locally on your machine, which is useful for:

-   Privacy-sensitive applications where data shouldn't leave your infrastructure
-   Reducing API costs for high-volume embedding workloads
-   Offline or air-gapped environments

Models are downloaded from Hugging Face on first use. See the [Sentence-Transformers documentation](https://www.sbert.net/docs/sentence_transformer/pretrained_models.html) for available models.

Example:

```python
from sentence_transformers import SentenceTransformer

from pydantic_ai.embeddings.sentence_transformers import (
    SentenceTransformerEmbeddingModel,
)

# Using a model name (downloads from Hugging Face)
model = SentenceTransformerEmbeddingModel('sentence-transformers/all-MiniLM-L6-v2')

# Using an existing SentenceTransformer instance
st_model = SentenceTransformer('Qwen/Qwen3-Embedding-0.6B')
model = SentenceTransformerEmbeddingModel(st_model)
```

#### Attributes

##### base\_url

No base URL -- runs locally.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### model\_name

The embedding model name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### system

The embedding model provider/system identifier.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model: SentenceTransformer | str,
    settings: EmbeddingSettings | None = None,
) -> None
```

Initialize a Sentence-Transformers embedding model.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`model`** : `SentenceTransformer` | [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The model to use. Can be:

-   A model name from Hugging Face (e.g., `'sentence-transformers/all-MiniLM-L6-v2'`)
-   A local path to a saved model
-   An existing `SentenceTransformer` instance

**`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Model-specific [`SentenceTransformersEmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.sentence_transformers.SentenceTransformersEmbeddingSettings) to use as defaults for this model.

### TestEmbeddingModel

**Bases:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel)

A mock embedding model for testing.

This model returns deterministic embeddings (all 1.0 values) and tracks the settings used in the last call via the `last_settings` attribute.

Example:

```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings import TestEmbeddingModel

test_model = TestEmbeddingModel()
embedder = Embedder('openai:text-embedding-3-small')


async def main():
    with embedder.override(model=test_model):
        await embedder.embed_query('test')
        assert test_model.last_settings is not None
```

#### Attributes

##### last\_settings

The settings used in the most recent embed call.

**Type:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### model\_name

The embedding model name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### system

The embedding model provider.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    model_name: str = 'test',
    provider_name: str = 'test',
    dimensions: int = 8,
    settings: EmbeddingSettings | None = None,
)
```

Initialize the test embedding model.

###### Parameters

**`model_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `'test'`

The model name to report in results.

**`provider_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `'test'`

The provider name to report in results.

**`dimensions`** : [`int`](https://docs.python.org/3/library/functions.html#int) _Default:_ `8`

The number of dimensions for the generated embeddings.

**`settings`** : [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional default settings for the model.

### WrapperEmbeddingModel

**Bases:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel)

Base class for embedding models that wrap another model.

Use this as a base class to create custom embedding model wrappers that modify behavior (e.g., caching, logging, rate limiting) while delegating to an underlying model.

By default, all methods are passed through to the wrapped model. Override specific methods to customize behavior.

#### Attributes

##### wrapped

The underlying embedding model being wrapped.

**Type:** [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) **Default:** `infer_embedding_model(wrapped) if isinstance(wrapped, str) else wrapped`

##### settings

Get the settings from the wrapped embedding model.

**Type:** [`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) | [`None`](https://docs.python.org/3/library/constants.html#None)

#### Methods

##### \_\_init\_\_

```python
def __init__(wrapped: EmbeddingModel | str)
```

Initialize the wrapper with an embedding model.

###### Parameters

**`wrapped`** : [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) | [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The model to wrap. Can be an [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel) instance or a model name string (e.g., `'openai:text-embedding-3-small'`).

### InstrumentedEmbeddingModel

**Bases:** `WrapperEmbeddingModel`

Embedding model which wraps another model so that requests are instrumented with OpenTelemetry.

See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info.

#### Attributes

##### instrumentation\_settings

Instrumentation settings for this model.

**Type:** [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings) **Default:** `options or InstrumentationSettings()`

### instrument\_embedding\_model

```python
def instrument_embedding_model(
    model: EmbeddingModel,
    instrument: InstrumentationSettings | bool,
) -> EmbeddingModel
```

Instrument an embedding model with OpenTelemetry/logfire.

#### Returns

[`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel)

---

# [pydantic_ai.exceptions](https://pydantic.dev/docs/ai/api/pydantic-ai/exceptions/)

# pydantic\_ai.exceptions

### ModelRetry

**Bases:** [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception)

Exception to raise to request a model retry.

Can be raised from tool functions, output validators, and capability hooks (such as `after_model_request`, `after_tool_execute`, etc.) to send a retry prompt back to the model asking it to try again.

#### Attributes

##### message

The message to return to the model.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `message`

#### Methods

##### \_\_get\_pydantic\_core\_schema\_\_

`@classmethod`

```python
def __get_pydantic_core_schema__(cls, _: Any, __: Any) -> core_schema.CoreSchema
```

Pydantic core schema to allow `ModelRetry` to be (de)serialized.

###### Returns

`core_schema.CoreSchema`

### CallDeferred

**Bases:** [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception)

Exception to raise when a tool call should be deferred.

See [tools docs](/docs/ai/tools-toolsets/deferred-tools#deferred-tools) for more information.

#### Constructor Parameters

**`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional dictionary of metadata to attach to the deferred tool call. This metadata will be available in `DeferredToolRequests.metadata` keyed by `tool_call_id`.

### ApprovalRequired

**Bases:** [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception)

Exception to raise when a tool call requires human-in-the-loop approval.

See [tools docs](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) for more information.

#### Constructor Parameters

**`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional dictionary of metadata to attach to the deferred tool call. This metadata will be available in `DeferredToolRequests.metadata` keyed by `tool_call_id`.

### SkipModelRequest

**Bases:** [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception)

Exception to raise in before/wrap model request hooks to skip the model call.

The provided response will be used instead of calling the model.

Note: when raised in `before_model_request`, any message history modifications made by earlier capabilities in that hook will not be persisted to the agent's message history, since the request preparation is aborted.

### SkipToolValidation

**Bases:** [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception)

Exception to raise in before/wrap tool validate hooks to skip validation.

The provided args will be used as the validated arguments.

### SkipToolExecution

**Bases:** [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception)

Exception to raise in before/wrap tool execute hooks to skip execution.

The provided result will be used as the tool result.

### UserError

**Bases:** [`RuntimeError`](https://docs.python.org/3/library/exceptions.html#RuntimeError)

Error caused by a usage mistake by the application developer -- You!

#### Attributes

##### message

Description of the mistake.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `message`

### AgentRunError

**Bases:** [`RuntimeError`](https://docs.python.org/3/library/exceptions.html#RuntimeError)

Base class for errors occurring during an agent run.

#### Attributes

##### message

The error message.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `message`

### UsageLimitExceeded

**Bases:** [`AgentRunError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.AgentRunError)

Error raised when a Model's usage exceeds the specified limits.

### ConcurrencyLimitExceeded

**Bases:** [`AgentRunError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.AgentRunError)

Error raised when the concurrency queue depth exceeds max\_queued.

### UnexpectedModelBehavior

**Bases:** [`AgentRunError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.AgentRunError)

Error caused by unexpected Model behavior, e.g. an unexpected response code.

#### Attributes

##### message

Description of the unexpected behavior.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `message`

##### body

The body of the response, if available.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `json.dumps(json.loads(body), indent=2)`

### ContentFilterError

**Bases:** [`UnexpectedModelBehavior`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UnexpectedModelBehavior)

Raised when content filtering is triggered by the model provider resulting in an empty response.

### ModelAPIError

**Bases:** [`AgentRunError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.AgentRunError)

Raised when a model provider API request fails.

#### Attributes

##### model\_name

The name of the model associated with the error.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `model_name`

### ModelHTTPError

**Bases:** [`ModelAPIError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelAPIError)

Raised when a model provider response has a status code of 4xx or 5xx.

#### Attributes

##### status\_code

The HTTP status code returned by the API.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `status_code`

##### body

The body of the response, if available.

**Type:** [`object`](https://docs.python.org/3/glossary.html#term-object) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `body`

### FallbackExceptionGroup

**Bases:** `ExceptionGroup[Any]`

A group of exceptions that can be raised when all fallback models fail.

### ToolRetryError

**Bases:** [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception)

Exception used to signal a `ToolRetry` message should be returned to the LLM.

### IncompleteToolCall

**Bases:** [`UnexpectedModelBehavior`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UnexpectedModelBehavior)

Error raised when a model stops due to token limit while emitting a tool call.

---

# [pydantic_ai.ext](https://pydantic.dev/docs/ai/api/pydantic-ai/ext/)

# pydantic\_ai.ext

### LangChainToolset

**Bases:** [`FunctionToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset)

A toolset that wraps LangChain tools.

### tool\_from\_langchain

```python
def tool_from_langchain(langchain_tool: LangChainTool) -> Tool
```

Creates a Pydantic AI tool proxy from a LangChain tool.

#### Returns

[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool) -- A Pydantic AI tool that corresponds to the LangChain tool.

#### Parameters

**`langchain_tool`** : `LangChainTool`

The LangChain tool to wrap.

### ACIToolset

**Bases:** [`FunctionToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset)

A toolset that wraps ACI.dev tools.

### tool\_from\_aci

```python
def tool_from_aci(aci_function: str, linked_account_owner_id: str) -> Tool
```

Creates a Pydantic AI tool proxy from an ACI.dev function.

#### Returns

[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool) -- A Pydantic AI tool that corresponds to the ACI.dev tool.

#### Parameters

**`aci_function`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The ACI.dev function to wrap.

**`linked_account_owner_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The ACI user ID to execute the function on behalf of.

---

# [pydantic_ai.format_prompt](https://pydantic.dev/docs/ai/api/pydantic-ai/format_prompt/)

# pydantic\_ai.format\_prompt

### format\_as\_xml

```python
def format_as_xml(
    obj: Any,
    root_tag: str | None = None,
    item_tag: str = 'item',
    none_str: str = 'null',
    indent: str | None = '  ',
    include_field_info: Literal['once'] | bool = False,
) -> str
```

Format a Python object as XML.

This is useful since LLMs often find it easier to read semi-structured data (e.g. examples) as XML, rather than JSON etc.

Supports: `str`, `bytes`, `bytearray`, `bool`, `int`, `float`, `Decimal`, `date`, `datetime`, `time`, `timedelta`, `UUID`, `Enum`, `Mapping`, `Iterable`, `dataclass`, and `BaseModel`.

Example:

format\_as\_xml\_example.py

```python
from pydantic_ai import format_as_xml

print(format_as_xml({'name': 'John', 'height': 6, 'weight': 200}, root_tag='user'))
'''
<user>
  <name>John</name>
  <height>6</height>
  <weight>200</weight>
</user>
'''
```

#### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str) -- XML representation of the object.

#### Parameters

**`obj`** : [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

Python Object to serialize to XML.

**`root_tag`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Outer tag to wrap the XML in, use `None` to omit the outer tag.

**`item_tag`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `'item'`

Tag to use for each item in an iterable (e.g. list), this is overridden by the class name for dataclasses and Pydantic models.

**`none_str`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `'null'`

String to use for `None` values.

**`indent`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `' '`

Indentation string to use for pretty printing.

**`include_field_info`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['once'\] | [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether to include attributes like Pydantic `Field` attributes and dataclasses `field()` `metadata` as XML attributes. In both cases the allowed `Field` attributes and `field()` metadata keys are `title` and `description`. If a field is repeated in the data (e.g. in a list) by setting `once` the attributes are included only in the first occurrence of an XML element relative to the same field.

---

# [pydantic_ai.function_signature](https://pydantic.dev/docs/ai/api/pydantic-ai/function_signature/)

# pydantic\_ai.function\_signature

Generate function signatures from functions and JSON schemas.

This module provides utilities to represent tool definitions as human-readable function signatures, which LLMs can understand more easily than raw JSON schemas. Used by code mode to present tools as callable functions.

### SimpleTypeExpr

A simple named type like `str`, `int`, `Any`, `None`.

### LiteralTypeExpr

A Literal type expression like `Literal['a', 'b']` or `Literal[42]`.

### GenericTypeExpr

A generic type expression like `list[User]`, `dict[str, User]`, `tuple[int, str]`.

### UnionTypeExpr

A union type expression like `User | None`, `str | int`.

### TypeFieldSignature

A single field in a TypedDict-style type definition.

#### Methods

##### \_\_str\_\_

```python
def __str__() -> str
```

Render this field as a line in a TypedDict class body.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str)

### TypeSignature

A TypedDict-style class definition with named fields.

#### Attributes

##### display\_name

The type name, with tool-name prefix applied if rendering context is set.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### \_\_str\_\_

```python
def __str__() -> str
```

Return the type name (for use in type expressions like `def foo(x: User)`).

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### render\_definition

```python
def render_definition(
    owner_name: str | None = None,
    conflicting_type_names: frozenset[str] = frozenset(),
) -> str
```

Render the full TypedDict class definition.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str)

###### Parameters

**`owner_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The owning tool name, used to build prefixed type names for conflicting types (e.g. `get_user_Address`).

**`conflicting_type_names`** : [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] _Default:_ `frozenset()`

Set of type names that need tool-name prefixes (from `get_conflicting_type_names`). Only effective when `owner_name` is also provided.

##### structurally\_equal

```python
def structurally_equal(other: TypeSignature) -> bool
```

Compare two TypeSignatures structurally, ignoring descriptions.

###### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool)

### FunctionParam

A single parameter in a function signature.

#### Methods

##### \_\_str\_\_

```python
def __str__() -> str
```

Render this parameter as a function parameter string.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str)

### FunctionSignature

Function signature shape with referenced type definitions.

This class holds the structural data (params, return type, referenced types) needed to render a function signature. Name and description can be overridden at render time (e.g. from a `ToolDefinition`).

#### Attributes

##### params

Function parameters, all rendered as keyword-only (JSON schema doesn't distinguish positional/keyword).

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `FunctionParam`\] **Default:** `field(default_factory=(dict[str, FunctionParam]))`

##### return\_type

The return type expression.

**Type:** `TypeExpr`

##### referenced\_types

TypedDict class definitions needed by the signature.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`TypeSignature`\] **Default:** `field(default_factory=(list[TypeSignature]))`

##### is\_async

Whether the underlying function is async.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

#### Methods

##### render

```python
def render(
    body: str,
    name: str | None = None,
    description: str | None = None,
    is_async: bool | None = None,
    conflicting_type_names: frozenset[str] = frozenset(),
) -> str
```

Render the signature with a specific body.

Sets `_type_name_overrides` so that dedup-prefixed types resolve correctly during rendering.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str)

###### Parameters

**`body`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The function body (e.g. `'...'` or `'return await tool()'`).

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The function name (also used for dedup prefix resolution). Falls back to `self.name`.

**`description`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional docstring to include. Falls back to `self.description`.

**`is_async`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override async rendering. If `None`, uses `self.is_async`.

**`conflicting_type_names`** : [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] _Default:_ `frozenset()`

Set of type names that need tool-name prefixes (from `get_conflicting_type_names`).

##### from\_schema

`@classmethod`

```python
def from_schema(
    cls,
    name: str,
    parameters_schema: dict[str, Any],
    return_schema: dict[str, Any] | None = None,
) -> FunctionSignature
```

Build a FunctionSignature from JSON schemas.

`name` is stored on the resulting signature and also used for generating fallback type names (e.g. `GetUserAddress`) when the schema has no `title`.

Parameter and return schemas are processed independently -- each resolves `$ref`s against its own `$defs`. Name collisions between parameter and return types (e.g. both define a `User` `$def` with different structures) are handled by `get_conflicting_type_names` at a later stage.

###### Returns

`FunctionSignature`

##### get\_conflicting\_type\_names

`@staticmethod`

```python
def get_conflicting_type_names(signatures: list[FunctionSignature]) -> frozenset[str]
```

Identify TypedDict name conflicts across multiple tool signatures.

Each signature keeps all its referenced types (so it remains self-contained), but identical types (same name and structure) are unified to the same object instance.

Returns the set of type names that have conflicts (same name, different structure) and need tool-name prefixes at render time. Pass this set to `FunctionSignature.render(conflicting_type_names=...)`.

Use `collect_unique_referenced_types()` when rendering to emit each definition once.

###### Returns

[`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### collect\_unique\_referenced\_types

`@staticmethod`

```python
def collect_unique_referenced_types(
    signatures: list[FunctionSignature],
) -> list[TypeSignature]
```

Collect unique TypeSignature objects from signatures, deduplicating by identity.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`TypeSignature`\]

##### render\_type\_definitions

`@staticmethod`

```python
def render_type_definitions(
    signatures: list[FunctionSignature],
    conflicting_type_names: frozenset[str],
) -> list[str]
```

Render unique TypedDict definitions for a set of function signatures.

For types whose names conflict across signatures (as identified by `get_conflicting_type_names`), each definition is rendered with a tool-name prefix (e.g. `get_user_Address`).

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] -- A list of rendered TypedDict class definitions as strings.

###### Parameters

**`signatures`** : [`list`](https://docs.python.org/3/glossary.html#term-list)\[`FunctionSignature`\]

The function signatures (after `get_conflicting_type_names`).

**`conflicting_type_names`** : [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

The set returned by `get_conflicting_type_names`.

### TypeExpr

A type expression node in the signature's type tree.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `'TypeSignature | SimpleTypeExpr | LiteralTypeExpr | GenericTypeExpr | UnionTypeExpr'`

---

# [pydantic_ai.mcp](https://pydantic.dev/docs/ai/api/pydantic-ai/mcp/)

# pydantic\_ai.mcp

### MCPError

**Bases:** [`RuntimeError`](https://docs.python.org/3/library/exceptions.html#RuntimeError)

Raised when an MCP server returns an error response.

This exception wraps error responses from MCP servers, following the ErrorData schema from the MCP specification.

#### Attributes

##### message

The error message.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `message`

##### code

The error code returned by the server.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `code`

##### data

Additional information about the error, if provided by the server.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `data`

#### Methods

##### from\_mcp\_sdk

`@classmethod`

```python
def from_mcp_sdk(cls, error: mcp_exceptions.McpError) -> MCPError
```

Create an MCPError from an MCP SDK McpError.

###### Returns

`MCPError`

###### Parameters

**`error`** : `mcp_exceptions.McpError`

An McpError from the MCP SDK.

### ResourceAnnotations

Additional properties describing MCP entities.

See the [resource annotations in the MCP specification](https://modelcontextprotocol.io/specification/2025-06-18/server/resources#annotations).

#### Attributes

##### audience

Intended audience for this entity.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`mcp_types.Role`\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### priority

Priority level for this entity, ranging from 0.0 to 1.0.

**Type:** [`Annotated`](https://docs.python.org/3/library/typing.html#typing.Annotated)\[[`float`](https://docs.python.org/3/library/functions.html#float), `Field`(`ge`\=0.0, `le`\=1.0)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

#### Methods

##### from\_mcp\_sdk

`@classmethod`

```python
def from_mcp_sdk(cls, mcp_annotations: mcp_types.Annotations) -> ResourceAnnotations
```

Convert from MCP SDK Annotations to ResourceAnnotations.

###### Returns

`ResourceAnnotations`

###### Parameters

**`mcp_annotations`** : `mcp_types.Annotations`

The MCP SDK annotations object.

### BaseResource

**Bases:** `ABC`

Base class for MCP resources.

#### Attributes

##### name

The programmatic name of the resource.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### title

Human-readable title for UI contexts.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### description

A description of what this resource represents.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### mime\_type

The MIME type of the resource, if known.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### annotations

Optional annotations for the resource.

**Type:** `ResourceAnnotations` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### metadata

Optional metadata for the resource.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

### Resource

**Bases:** `BaseResource`

A resource that can be read from an MCP server.

See the [resources in the MCP specification](https://modelcontextprotocol.io/specification/2025-06-18/server/resources).

#### Attributes

##### uri

The URI of the resource.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### size

The size of the raw resource content in bytes (before base64 encoding), if known.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

#### Methods

##### from\_mcp\_sdk

`@classmethod`

```python
def from_mcp_sdk(cls, mcp_resource: mcp_types.Resource) -> Resource
```

Convert from MCP SDK Resource to PydanticAI Resource.

###### Returns

`Resource`

###### Parameters

**`mcp_resource`** : `mcp_types.Resource`

The MCP SDK Resource object.

### ResourceTemplate

**Bases:** `BaseResource`

A template for parameterized resources on an MCP server.

See the [resource templates in the MCP specification](https://modelcontextprotocol.io/specification/2025-06-18/server/resources#resource-templates).

#### Attributes

##### uri\_template

URI template (RFC 6570) for constructing resource URIs.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### from\_mcp\_sdk

`@classmethod`

```python
def from_mcp_sdk(cls, mcp_template: mcp_types.ResourceTemplate) -> ResourceTemplate
```

Convert from MCP SDK ResourceTemplate to PydanticAI ResourceTemplate.

###### Returns

`ResourceTemplate`

###### Parameters

**`mcp_template`** : `mcp_types.ResourceTemplate`

The MCP SDK ResourceTemplate object.

### ServerCapabilities

Capabilities that an MCP server supports.

#### Attributes

##### experimental

Experimental, non-standard capabilities that the server supports.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### logging

Whether the server supports sending log messages to the client.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### prompts

Whether the server offers any prompt templates.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### prompts\_list\_changed

Whether the server will emit notifications when the list of prompts changes.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### resources

Whether the server offers any resources to read.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### resources\_list\_changed

Whether the server will emit notifications when the list of resources changes.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### tools

Whether the server offers any tools to call.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### tools\_list\_changed

Whether the server will emit notifications when the list of tools changes.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### completions

Whether the server offers autocompletion suggestions for prompts and resources.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

#### Methods

##### from\_mcp\_sdk

`@classmethod`

```python
def from_mcp_sdk(
    cls,
    mcp_capabilities: mcp_types.ServerCapabilities,
) -> ServerCapabilities
```

Convert from MCP SDK ServerCapabilities to PydanticAI ServerCapabilities.

###### Returns

`ServerCapabilities`

###### Parameters

**`mcp_capabilities`** : `mcp_types.ServerCapabilities`

The MCP SDK ServerCapabilities object.

### MCPServer

**Bases:** `AbstractToolset[Any]`, `ABC`

Base class for attaching agents to MCP servers.

See [https://modelcontextprotocol.io](https://modelcontextprotocol.io) for more information.

#### Attributes

##### tool\_prefix

A prefix to add to all tools that are registered with the server.

If not empty, will include a trailing underscore(`_`).

e.g. if `tool_prefix='foo'`, then a tool named `bar` will be registered as `foo_bar`

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `tool_prefix`

##### log\_level

The log level to set when connecting to the server, if any.

See [https://modelcontextprotocol.io/specification/2025-03-26/server/utilities/logging#logging](https://modelcontextprotocol.io/specification/2025-03-26/server/utilities/logging#logging) for more details.

If `None`, no log level will be set.

**Type:** `mcp_types.LoggingLevel` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `log_level`

##### log\_handler

A handler for logging messages from the server.

**Type:** `LoggingFnT` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `log_handler`

##### timeout

The timeout in seconds to wait for the client to initialize.

**Type:** [`float`](https://docs.python.org/3/library/functions.html#float) **Default:** `timeout`

##### read\_timeout

Maximum time in seconds to wait for new messages before timing out.

This timeout applies to the long-lived connection after it's established. If no new messages are received within this time, the connection will be considered stale and may be closed. Defaults to 5 minutes (300 seconds).

**Type:** [`float`](https://docs.python.org/3/library/functions.html#float) **Default:** `read_timeout`

##### process\_tool\_call

Hook to customize tool calling and optionally pass extra metadata.

**Type:** `ProcessToolCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `process_tool_call`

##### allow\_sampling

Whether to allow MCP sampling through this client.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `allow_sampling`

##### sampling\_model

The model to use for sampling.

**Type:** [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `sampling_model`

##### max\_retries

The maximum number of times to retry a tool call.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `max_retries`

##### elicitation\_callback

Callback function to handle elicitation requests from the server.

**Type:** `ElicitationFnT` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `elicitation_callback`

##### cache\_tools

Whether to cache the list of tools.

When enabled (default), tools are fetched once and cached until either:

-   The server sends a `notifications/tools/list_changed` notification
-   `MCPServer.__aexit__` is called (when the last context exits)

Set to `False` for servers that change tools dynamically without sending notifications.

Note: When using durable execution (Temporal, DBOS), tool definitions are additionally cached at the wrapper level across activities/steps, to avoid redundant MCP connections. This wrapper-level cache is not invalidated by `tools/list_changed` notifications. Set to `False` to disable all caching if tools may change during a workflow.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `cache_tools`

##### cache\_resources

Whether to cache the list of resources.

When enabled (default), resources are fetched once and cached until either:

-   The server sends a `notifications/resources/list_changed` notification
-   `MCPServer.__aexit__` is called (when the last context exits)

Set to `False` for servers that change resources dynamically without sending notifications.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `cache_resources`

##### include\_instructions

Whether to include the server's instructions in the agent's instructions.

Defaults to `False` for backward compatibility.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `include_instructions`

##### include\_return\_schema

Whether to include return schemas in tool definitions sent to the model.

When `None` (default), defaults to `False` unless the [`IncludeToolReturnSchemas`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.IncludeToolReturnSchemas) capability is used.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `include_return_schema`

##### server\_info

Access the information send by the MCP server during initialization.

**Type:** `mcp_types.Implementation`

##### capabilities

Access the capabilities advertised by the MCP server during initialization.

**Type:** `ServerCapabilities`

##### instructions

Access the instructions sent by the MCP server during initialization.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### is\_running

Check if the MCP server is running.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

#### Methods

##### client\_streams

`@abstractmethod`

`@async`

```python
def client_streams(

) -> AsyncIterator[tuple[MemoryObjectReceiveStream[SessionMessage | Exception], MemoryObjectSendStream[SessionMessage]]]
```

Create the streams for the MCP server.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[`MemoryObjectReceiveStream`\[`SessionMessage` | [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception)\], `MemoryObjectSendStream`\[`SessionMessage`\]\]\]

##### get\_instructions

`@async`

```python
def get_instructions(ctx: RunContext[Any]) -> messages.InstructionPart | None
```

Return the MCP server's instructions for how to use its tools.

If [`include_instructions`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.include_instructions) is `True`, returns the [`instructions`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.instructions) sent by the MCP server during initialization. Otherwise, returns `None`.

Instructions from external servers are marked as dynamic since they may change between connections.

###### Returns

[`messages.InstructionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.InstructionPart) | [`None`](https://docs.python.org/3/library/constants.html#None) -- An `InstructionPart` with the server's instructions if `include_instructions` is enabled, otherwise `None`.

###### Parameters

**`ctx`** : [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

The run context for this agent run.

##### list\_tools

`@async`

```python
def list_tools() -> list[mcp_types.Tool]
```

Retrieve tools that are currently active on the server.

Tools are cached by default, with cache invalidation on:

-   `notifications/tools/list_changed` notifications from the server
-   `__aexit__` when the last context exits

Set `cache_tools=False` for servers that change tools without sending notifications.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`mcp_types.Tool`\]

##### direct\_call\_tool

`@async`

```python
def direct_call_tool(
    name: str,
    args: dict[str, Any],
    metadata: dict[str, Any] | None = None,
) -> ToolResult
```

Call a tool on the server.

###### Returns

`ToolResult` -- The result of the tool call.

###### Parameters

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The name of the tool to call.

**`args`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

The arguments to pass to the tool.

**`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Request-level metadata (optional)

###### Raises

-   `ModelRetry` -- If the tool call fails.

##### list\_resources

`@async`

```python
def list_resources() -> list[Resource]
```

Retrieve resources that are currently present on the server.

Resources are cached by default, with cache invalidation on:

-   `notifications/resources/list_changed` notifications from the server
-   `__aexit__` when the last context exits

Set `cache_resources=False` for servers that change resources without sending notifications.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`Resource`\]

###### Raises

-   `MCPError` -- If the server returns an error.

##### list\_resource\_templates

`@async`

```python
def list_resource_templates() -> list[ResourceTemplate]
```

Retrieve resource templates that are currently present on the server.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`ResourceTemplate`\]

###### Raises

-   `MCPError` -- If the server returns an error.

##### read\_resource

`@async`

```python
def read_resource(
    uri: str,
) -> str | messages.BinaryContent | list[str | messages.BinaryContent]
def read_resource(
    uri: Resource,
) -> str | messages.BinaryContent | list[str | messages.BinaryContent]
```

Read the contents of a specific resource by URI.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`messages.BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent) | [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`messages.BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent)\] -- The resource contents. If the resource has a single content item, returns that item directly. [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`messages.BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent) | [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`messages.BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent)\] -- If the resource has multiple content items, returns a list of items.

###### Parameters

**`uri`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `Resource`

The URI of the resource to read, or a Resource object.

###### Raises

-   `MCPError` -- If the server returns an error.

##### \_\_aenter\_\_

`@async`

```python
def __aenter__() -> Self
```

Enter the MCP server context.

The first call starts the connection (spawning a subprocess for stdio servers, opening an HTTP connection for HTTP servers). Subsequent calls -- from any task -- share the same connection via reference counting. The connection is torn down when the last `async with` scope exits.

Because the session runs in a dedicated background task, entering and exiting from different tasks (e.g. `asyncio.gather` children, fasta2a workers, or graph node tasks) is safe: the underlying transport's cancel scopes never cross task boundaries.

###### Returns

[`Self`](https://docs.python.org/3/library/typing.html#typing.Self)

### MCPServerStdio

**Bases:** `MCPServer`

Runs an MCP server in a subprocess and communicates with it over stdin/stdout.

This class implements the stdio transport from the MCP specification. See [https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#stdio](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#stdio) for more information.

Note

Using this class as an async context manager will start the server as a subprocess when entering the context, and stop it when exiting the context.

Example:

```python
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerStdio

server = MCPServerStdio(  # (1)
  'uv', args=['run', 'mcp-run-python', 'stdio'], timeout=10
)
agent = Agent('openai:gpt-5.2', toolsets=[server])
```

See [MCP Run Python](https://github.com/pydantic/mcp-run-python) for more information.

#### Attributes

##### command

The command to run.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `command`

##### args

The arguments to pass to the command.

**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] **Default:** `args`

##### env

The environment variables the CLI server will have access to.

By default the subprocess will not inherit any environment variables from the parent process. If you want to inherit the environment variables from the parent process, use `env=os.environ`.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `env`

##### cwd

The working directory to use when spawning the process.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `Path` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `cwd`

#### Methods

##### \_\_init\_\_

```python
def __init__(
    command: str,
    args: Sequence[str],
    env: dict[str, str] | None = None,
    cwd: str | Path | None = None,
    tool_prefix: str | None = None,
    log_level: mcp_types.LoggingLevel | None = None,
    log_handler: LoggingFnT | None = None,
    timeout: float = 5,
    read_timeout: float = 5 * 60,
    process_tool_call: ProcessToolCallback | None = None,
    allow_sampling: bool = True,
    sampling_model: models.Model | None = None,
    max_retries: int = 1,
    elicitation_callback: ElicitationFnT | None = None,
    cache_tools: bool = True,
    cache_resources: bool = True,
    include_instructions: bool = False,
    include_return_schema: bool | None = None,
    id: str | None = None,
    client_info: mcp_types.Implementation | None = None,
)
```

Build a new MCP server.

###### Parameters

**`command`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The command to run.

**`args`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

The arguments to pass to the command.

**`env`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The environment variables to set in the subprocess.

**`cwd`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `Path` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The working directory to use when spawning the process.

**`tool_prefix`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A prefix to add to all tools that are registered with the server.

**`log_level`** : `mcp_types.LoggingLevel` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The log level to set when connecting to the server, if any.

**`log_handler`** : `LoggingFnT` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A handler for logging messages from the server.

**`timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) _Default:_ `5`

The timeout in seconds to wait for the client to initialize.

**`read_timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) _Default:_ `5 * 60`

Maximum time in seconds to wait for new messages before timing out.

**`process_tool_call`** : `ProcessToolCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Hook to customize tool calling and optionally pass extra metadata.

**`allow_sampling`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to allow MCP sampling through this client.

**`sampling_model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The model to use for sampling.

**`max_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) _Default:_ `1`

The maximum number of times to retry a tool call.

**`elicitation_callback`** : `ElicitationFnT` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Callback function to handle elicitation requests from the server.

**`cache_tools`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to cache the list of tools. See [`MCPServer.cache_tools`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.cache_tools).

**`cache_resources`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to cache the list of resources. See [`MCPServer.cache_resources`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.cache_resources).

**`include_instructions`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether to include the server's instructions in the agent's instructions. See [`MCPServer.include_instructions`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.include_instructions).

**`include_return_schema`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether to include return schemas in tool definitions. See [`MCPServer.include_return_schema`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.include_return_schema).

**`id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An optional unique ID for the MCP server. An MCP server needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the server's activities within the workflow.

**`client_info`** : `mcp_types.Implementation` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Information describing the MCP client implementation.

### MCPServerSSE

**Bases:** `_MCPServerHTTP`

An MCP server that connects over streamable HTTP connections.

This class implements the SSE transport from the MCP specification. See [https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#http-with-sse](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#http-with-sse) for more information.

Note

Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running.

Example:

```python
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerSSE

server = MCPServerSSE('http://localhost:3001/sse')
agent = Agent('openai:gpt-5.2', toolsets=[server])
```

### MCPServerHTTP

**Bases:** `MCPServerSSE`

An MCP server that connects over HTTP using the old SSE transport.

This class implements the SSE transport from the MCP specification. See [https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#http-with-sse](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#http-with-sse) for more information.

Note

Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running.

Example:

```python
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerHTTP

server = MCPServerHTTP('http://localhost:3001/sse')
agent = Agent('openai:gpt-5.2', toolsets=[server])
```

### MCPServerStreamableHTTP

**Bases:** `_MCPServerHTTP`

An MCP server that connects over HTTP using the Streamable HTTP transport.

This class implements the Streamable HTTP transport from the MCP specification. See [https://modelcontextprotocol.io/introduction#streamable-http](https://modelcontextprotocol.io/introduction#streamable-http) for more information.

Note

Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running.

Example:

```python
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerStreamableHTTP

server = MCPServerStreamableHTTP('http://localhost:8000/mcp')
agent = Agent('openai:gpt-5.2', toolsets=[server])
```

### MCPServerConfig

**Bases:** `BaseModel`

Configuration for MCP servers.

### load\_mcp\_servers

```python
def load_mcp_servers(
    config_path: str | Path,
) -> list[MCPServerStdio | MCPServerStreamableHTTP | MCPServerSSE]
```

Load MCP servers from a configuration file.

Environment variables can be referenced in the configuration file using:

-   `${VAR_NAME}` syntax - expands to the value of VAR\_NAME, raises error if not defined
-   `${VAR_NAME:-default}` syntax - expands to VAR\_NAME if set, otherwise uses the default value

#### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`MCPServerStdio` | `MCPServerStreamableHTTP` | `MCPServerSSE`\] -- A list of MCP servers.

#### Parameters

**`config_path`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `Path`

The path to the configuration file.

#### Raises

-   `FileNotFoundError` -- If the configuration file does not exist.
-   `ValidationError` -- If the configuration file does not match the schema.
-   `ValueError` -- If an environment variable referenced in the configuration is not defined and no default value is provided.

### ToolResult

The result type of an MCP tool call.

**Default:** `str | messages.BinaryContent | dict[str, Any] | list[Any] | Sequence[str | messages.BinaryContent | dict[str, Any] | list[Any]]`

### CallToolFunc

A function type that represents a tool call.

**Default:** `Callable[[str, dict[str, Any], dict[str, Any] | None], Awaitable[ToolResult]]`

### ProcessToolCallback

A process tool callback.

It accepts a run context, the original tool call function, a tool name, and arguments.

Allows wrapping an MCP server tool call to customize it, including adding extra request metadata.

**Default:** `Callable[[RunContext[Any], CallToolFunc, str, dict[str, Any]], Awaitable[ToolResult]]`

---

# [pydantic_ai.messages](https://pydantic.dev/docs/ai/api/pydantic-ai/messages/)

# pydantic\_ai.messages

The structure of [`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage) can be shown as a graph:

```mermaid
graph RL
    SystemPromptPart(SystemPromptPart) --- ModelRequestPart
    UserPromptPart(UserPromptPart) --- ModelRequestPart
    ToolReturnPart(ToolReturnPart) --- ModelRequestPart
    RetryPromptPart(RetryPromptPart) --- ModelRequestPart
    TextPart(TextPart) --- ModelResponsePart
    ToolCallPart(ToolCallPart) --- ModelResponsePart
    ThinkingPart(ThinkingPart) --- ModelResponsePart
    ModelRequestPart("ModelRequestPart<br>(Union)") --- ModelRequest
    ModelRequest("ModelRequest(parts=list[...])") --- ModelMessage
    ModelResponsePart("ModelResponsePart<br>(Union)") --- ModelResponse
    ModelResponse("ModelResponse(parts=list[...])") --- ModelMessage("ModelMessage<br>(Union)")
```

### SystemPromptPart

A system prompt, generally written by the application developer.

This gives the model context and guidance on how to respond.

#### Attributes

##### content

The content of the prompt.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### timestamp

The timestamp of the prompt.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) **Default:** `field(default_factory=_now_utc)`

##### dynamic\_ref

The ref of the dynamic system prompt function that generated this part.

Only set if system prompt is dynamic, see [`system_prompt`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.system_prompt) for more information.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### part\_kind

Part type identifier, this is available on all parts as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['system-prompt'\] **Default:** `'system-prompt'`

### FileUrl

**Bases:** `ABC`

Abstract base class for any URL-based file.

#### Attributes

##### url

The URL of the file.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### force\_download

Controls whether the file is downloaded and how SSRF protection is applied:

-   If `False`, the URL is sent directly to providers that support it. For providers that don't, the file is downloaded with SSRF protection (blocks private IPs and cloud metadata).
-   If `True`, the file is always downloaded with SSRF protection (blocks private IPs and cloud metadata).
-   If `'allow-local'`, the file is always downloaded, allowing private IPs but still blocking cloud metadata.

**Type:** `ForceDownloadMode` **Default:** `False`

##### vendor\_metadata

Vendor-specific metadata for the file.

Supported by:

-   `GoogleModel`: `VideoUrl.vendor_metadata` is used as `video_metadata`: [https://ai.google.dev/gemini-api/docs/video-understanding#customize-video-processing](https://ai.google.dev/gemini-api/docs/video-understanding#customize-video-processing)
-   `OpenAIChatModel`, `OpenAIResponsesModel`: `ImageUrl.vendor_metadata['detail']` is used as `detail` setting for images
-   `XaiModel`: `ImageUrl.vendor_metadata['detail']` is used as `detail` setting for images

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### media\_type

Return the media type of the file, based on the URL or the provided `media_type`.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### identifier

The identifier of the file, such as a unique ID.

This identifier can be provided to the model in a message to allow it to refer to this file in a tool call argument, and the tool can look up the file in question by iterating over the message history and finding the matching `FileUrl`.

This identifier is only automatically passed to the model when the `FileUrl` is returned by a tool. If you're passing the `FileUrl` as a user message, it's up to you to include a separate text part with the identifier, e.g. "This is file <identifier>:" preceding the `FileUrl`.

It's also included in inline-text delimiters for providers that require inlining text documents, so the model can distinguish multiple files.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### format

The file format.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

### VideoUrl

**Bases:** [`FileUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FileUrl)

A URL to a video.

#### Attributes

##### url

The URL of the video.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### kind

Type identifier, this is available on all parts as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['video-url'\] **Default:** `'video-url'`

##### is\_youtube

True if the URL has a YouTube domain.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### format

The file format of the video.

The choice of supported formats were based on the Bedrock Converse API. Other APIs don't require to use a format.

**Type:** `VideoFormat`

### AudioUrl

**Bases:** [`FileUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FileUrl)

A URL to an audio file.

#### Attributes

##### url

The URL of the audio file.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### kind

Type identifier, this is available on all parts as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['audio-url'\] **Default:** `'audio-url'`

##### format

The file format of the audio file.

**Type:** `AudioFormat`

### ImageUrl

**Bases:** [`FileUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FileUrl)

A URL to an image.

#### Attributes

##### url

The URL of the image.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### kind

Type identifier, this is available on all parts as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['image-url'\] **Default:** `'image-url'`

##### format

The file format of the image.

The choice of supported formats were based on the Bedrock Converse API. Other APIs don't require to use a format.

**Type:** `ImageFormat`

### DocumentUrl

**Bases:** [`FileUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FileUrl)

The URL of the document.

#### Attributes

##### url

The URL of the document.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### kind

Type identifier, this is available on all parts as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['document-url'\] **Default:** `'document-url'`

##### format

The file format of the document.

The choice of supported formats were based on the Bedrock Converse API. Other APIs don't require to use a format.

**Type:** `DocumentFormat`

### TextContent

String content that is tagged with additional metadata.

This is useful for including metadata that can be accessed programmatically by the application, but is not sent to the LLM.

#### Attributes

##### content

The content that is sent to the LLM.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### metadata

Additional data that can be accessed programmatically by the application but is not sent to the LLM.

**Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) **Default:** `None`

##### kind

Type identifier, this is available on all parts as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['text-content'\] **Default:** `'text-content'`

### BinaryContent

Binary content, e.g. an audio or image file.

#### Attributes

##### data

The binary file data.

Use `.base64` to get the base64-encoded string.

**Type:** [`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes)

##### media\_type

The media type of the binary data.

**Type:** `AudioMediaType` | `ImageMediaType` | `DocumentMediaType` | [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### vendor\_metadata

Vendor-specific metadata for the file.

Supported by:

-   `GoogleModel`: `BinaryContent.vendor_metadata` is used as `video_metadata`: [https://ai.google.dev/gemini-api/docs/video-understanding#customize-video-processing](https://ai.google.dev/gemini-api/docs/video-understanding#customize-video-processing)
-   `OpenAIChatModel`, `OpenAIResponsesModel`: `BinaryContent.vendor_metadata['detail']` is used as `detail` setting for images
-   `XaiModel`: `BinaryContent.vendor_metadata['detail']` is used as `detail` setting for images

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### kind

Type identifier, this is available on all parts as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['binary'\] **Default:** `'binary'`

##### identifier

Identifier for the binary content, such as a unique ID.

This identifier can be provided to the model in a message to allow it to refer to this file in a tool call argument, and the tool can look up the file in question by iterating over the message history and finding the matching `BinaryContent`.

This identifier is only automatically passed to the model when the `BinaryContent` is returned by a tool. If you're passing the `BinaryContent` as a user message, it's up to you to include a separate text part with the identifier, e.g. "This is file <identifier>:" preceding the `BinaryContent`.

It's also included in inline-text delimiters for providers that require inlining text documents, so the model can distinguish multiple files.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### data\_uri

Convert the `BinaryContent` to a data URI.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### base64

Return the binary data as a base64-encoded string. Default encoding is UTF-8.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### is\_audio

Return `True` if the media type is an audio type.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### is\_image

Return `True` if the media type is an image type.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### is\_video

Return `True` if the media type is a video type.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### is\_document

Return `True` if the media type is a document type.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### format

The file format of the binary content.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### narrow\_type

`@staticmethod`

```python
def narrow_type(bc: BinaryContent) -> BinaryContent | BinaryImage
```

Narrow the type of the `BinaryContent` to `BinaryImage` if it's an image.

###### Returns

[`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent) | [`BinaryImage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryImage)

##### from\_data\_uri

`@classmethod`

```python
def from_data_uri(cls, data_uri: str) -> BinaryContent
```

Create a `BinaryContent` from a data URI.

###### Returns

[`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent)

##### from\_path

`@classmethod`

```python
def from_path(cls, path: PathLike[str]) -> BinaryContent
```

Create a `BinaryContent` from a path.

Defaults to 'application/octet-stream' if the media type cannot be inferred.

###### Returns

[`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent)

###### Raises

-   `FileNotFoundError` -- if the file does not exist.
-   `PermissionError` -- if the file cannot be read.

### BinaryImage

**Bases:** [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent)

Binary content that's guaranteed to be an image.

### CachePoint

A cache point marker for prompt caching.

Can be inserted into UserPromptPart.content to mark cache boundaries. Models that don't support caching will filter these out.

Supported by:

-   Anthropic
-   Amazon Bedrock (Converse API)

#### Attributes

##### kind

Type identifier, this is available on all parts as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['cache-point'\] **Default:** `'cache-point'`

##### ttl

The cache time-to-live, either "5m" (5 minutes) or "1h" (1 hour).

Supported by:

-   Anthropic -- see [https://docs.claude.com/en/docs/build-with-claude/prompt-caching#1-hour-cache-duration](https://docs.claude.com/en/docs/build-with-claude/prompt-caching#1-hour-cache-duration) for more information.
-   Amazon Bedrock (Converse API) -- see [https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html](https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html) for more information.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['5m', '1h'\] **Default:** `'5m'`

### UploadedFile

A reference to a file uploaded to a provider's file storage by ID.

This allows referencing files that have been uploaded via provider-specific file APIs rather than providing the file content directly.

Supported by:

-   [`AnthropicModel`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModel)
-   [`OpenAIChatModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIChatModel)
-   [`OpenAIResponsesModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModel)
-   [`BedrockConverseModel`](/docs/ai/api/models/bedrock/#pydantic_ai.models.bedrock.BedrockConverseModel)
-   [`GoogleModel`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModel) (GLA: [Files API](https://ai.google.dev/gemini-api/docs/files) URIs, Vertex: GCS `gs://` URIs)
-   [`XaiModel`](/docs/ai/api/models/xai/#pydantic_ai.models.xai.XaiModel)

#### Attributes

##### file\_id

The provider-specific file identifier.

For most providers, this is the file ID returned by the provider's upload API. For GoogleModel (Vertex), this must be a GCS URI (`gs://bucket/path`). For GoogleModel (GLA), this must be a Google Files API URI (`https://generativelanguage.googleapis.com/...`). For BedrockConverseModel, this must be an S3 URI (`s3://bucket/key`).

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_name

The provider this file belongs to.

This is required because file IDs are not portable across providers, and using a file ID with the wrong provider will always result in an error.

Tip: Use `model.system` to get the provider name dynamically.

**Type:** `UploadedFileProviderName`

##### vendor\_metadata

Vendor-specific metadata for the file.

The expected shape of this dictionary depends on the provider:

Supported by:

-   `GoogleModel`: used as `video_metadata` for video files

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### kind

Type identifier, this is available on all parts as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['uploaded-file'\] **Default:** `'uploaded-file'`

##### media\_type

Return the media type of the file, inferred from `file_id` if not explicitly provided.

Note: Inference relies on the file extension in `file_id`. For opaque file IDs (e.g., `'file-abc123'`), the media type will default to `'application/octet-stream'`. Inference relies on Python's `mimetypes` module, whose results may vary across platforms.

Required by some providers (e.g., Bedrock) for certain file types.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### identifier

The identifier of the file, such as a unique ID.

This identifier can be provided to the model in a message to allow it to refer to this file in a tool call argument, and the tool can look up the file in question by iterating over the message history and finding the matching `UploadedFile`.

This identifier is only automatically passed to the model when the `UploadedFile` is returned by a tool. If you're passing the `UploadedFile` as a user message, it's up to you to include a separate text part with the identifier, e.g. "This is file <identifier>:" preceding the `UploadedFile`.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### format

A general-purpose media-type-to-format mapping.

Maps media types to format strings (e.g. `'image/png'` -> `'png'`). Covers image, video, audio, and document types. Currently used by Bedrock, which requires explicit format strings.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

### ToolReturn

**Bases:** `Generic[_ToolReturnValueT]`

A structured tool return that separates the tool result from additional content sent to the model.

Can be parameterized with a type to enable return schema generation:

-   `ToolReturn[User]` -- generates a return schema for `User`
-   `ToolReturn` (bare) -- no return schema generated

#### Attributes

##### return\_value

The return value to be used in the tool response.

**Type:** `ToolReturnContent`

##### content

Content sent to the model as a separate `UserPromptPart`.

Use this when you want content to appear outside the tool result message. For multimodal content that should be sent natively in the tool result, return it directly from the tool function or include it in `return_value`.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### metadata

Additional data accessible by the application but not sent to the LLM.

**Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) **Default:** `None`

### UserPromptPart

A user prompt, generally written by the end user.

Content comes from the `user_prompt` parameter of [`Agent.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), [`Agent.run_sync`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_sync), and [`Agent.run_stream`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream).

#### Attributes

##### content

The content of the prompt.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`UserContent`\]

##### timestamp

The timestamp of the prompt.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) **Default:** `field(default_factory=_now_utc)`

##### part\_kind

Part type identifier, this is available on all parts as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['user-prompt'\] **Default:** `'user-prompt'`

### BaseToolReturnPart

Base class for tool return parts.

#### Attributes

##### tool\_name

The name of the tool that was called.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### content

The tool return content, which may include multimodal files.

**Type:** `ToolReturnContent`

##### tool\_call\_id

The tool call identifier, this is used by some models including OpenAI.

In case the tool call id is not provided by the model, Pydantic AI will generate a random one.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `field(default_factory=_generate_tool_call_id)`

##### metadata

Additional data accessible by the application but not sent to the LLM.

**Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) **Default:** `None`

##### timestamp

The timestamp, when the tool returned.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) **Default:** `field(default_factory=_now_utc)`

##### outcome

The outcome of the tool call.

-   `'success'`: The tool executed successfully.
-   `'failed'`: The tool raised an error during execution.
-   `'denied'`: The tool call was denied by the approval mechanism.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['success', 'failed', 'denied'\] **Default:** `'success'`

##### files

The multimodal file parts from `content` (`ImageUrl`, `AudioUrl`, `DocumentUrl`, `VideoUrl`, `BinaryContent`).

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`MultiModalContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.MultiModalContent)\]

#### Methods

##### content\_items

```python
def content_items(mode: Literal['raw'] = 'raw') -> list[ToolReturnContent]
def content_items(mode: Literal['str']) -> list[str | MultiModalContent]
def content_items(mode: Literal['jsonable']) -> list[Any | MultiModalContent]
```

Return content as a flat list for iteration, with optional serialization.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`ToolReturnContent`\] | [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`MultiModalContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.MultiModalContent)\] | [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any) | [`MultiModalContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.MultiModalContent)\]

###### Parameters

**`mode`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['raw', 'str', 'jsonable'\] _Default:_ `'raw'`

Controls serialization of non-file items:

-   `'raw'`: No serialization. Returns items as-is.
-   `'str'`: Non-file items are serialized to strings via `tool_return_ta`. File items (`MultiModalContent`) pass through unchanged.
-   `'jsonable'`: Non-file items are serialized to JSON-compatible Python objects via `tool_return_ta`. File items pass through unchanged.

##### model\_response\_str

```python
def model_response_str() -> str
```

Return a string representation of the data content for the model.

This excludes multimodal files - use `.files` to get those separately.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### model\_response\_object

```python
def model_response_object() -> dict[str, Any]
```

Return a dictionary representation of the data content, wrapping non-dict types appropriately.

This excludes multimodal files - use `.files` to get those separately. Gemini supports JSON dict return values, but no other JSON types, hence we wrap anything else in a dict.

###### Returns

[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

##### model\_response\_str\_and\_user\_content

```python
def model_response_str_and_user_content() -> tuple[str, list[UserContent]]
```

Build a text-only tool result with multimodal files extracted for a trailing user message.

For providers whose tool result API only accepts text. Multimodal files are referenced by identifier in the tool result text ('See file {id}.') and included in full in the returned file content list ('This is file {id}:' followed by the file).

###### Returns

[`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`list`](https://docs.python.org/3/glossary.html#term-list)\[`UserContent`\]\]

##### has\_content

```python
def has_content() -> bool
```

Return `True` if the tool return has content.

###### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool)

### ToolReturnPart

**Bases:** [`BaseToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BaseToolReturnPart)

A tool return message, this encodes the result of running a tool.

#### Attributes

##### part\_kind

Part type identifier, this is available on all parts as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['tool-return'\] **Default:** `'tool-return'`

### BuiltinToolReturnPart

**Bases:** [`BaseToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BaseToolReturnPart)

A tool return message from a built-in tool.

#### Attributes

##### provider\_name

The name of the provider that generated the response.

Required to be set when `provider_details` is set.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### provider\_details

Additional data returned by the provider that can't be mapped to standard fields.

This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### part\_kind

Part type identifier, this is available on all parts as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['builtin-tool-return'\] **Default:** `'builtin-tool-return'`

### RetryPromptPart

A message back to a model asking it to try again.

This can be sent for a number of reasons:

-   Pydantic validation of tool arguments failed, here content is derived from a Pydantic [`ValidationError`](https://docs.pydantic.dev/latest/api/pydantic-core/pydantic_core/#pydantic_core.ValidationError)
-   a tool raised a [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) exception
-   no tool was found for the tool name
-   the model returned plain text when a structured response was expected
-   Pydantic validation of a structured response failed, here content is derived from a Pydantic [`ValidationError`](https://docs.pydantic.dev/latest/api/pydantic-core/pydantic_core/#pydantic_core.ValidationError)
-   an output validator raised a [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) exception

#### Attributes

##### content

Details of why and how the model should retry.

If the retry was triggered by a [`ValidationError`](https://docs.pydantic.dev/latest/api/pydantic-core/pydantic_core/#pydantic_core.ValidationError), this will be a list of error details.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`pydantic_core.ErrorDetails`](https://docs.pydantic.dev/latest/api/pydantic-core/pydantic_core/#pydantic_core.ErrorDetails)\] | [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### tool\_name

The name of the tool that was called, if any.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### tool\_call\_id

The tool call identifier, this is used by some models including OpenAI.

In case the tool call id is not provided by the model, Pydantic AI will generate a random one.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `field(default_factory=_generate_tool_call_id)`

##### timestamp

The timestamp, when the retry was triggered.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) **Default:** `field(default_factory=_now_utc)`

##### part\_kind

Part type identifier, this is available on all parts as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['retry-prompt'\] **Default:** `'retry-prompt'`

#### Methods

##### model\_response

```python
def model_response() -> str
```

Return a string message describing why the retry is requested.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str)

### InstructionPart

A single instruction block with metadata about its origin.

Instructions are composed of one or more parts, each of which can be static (from a literal string) or dynamic (from a function, template, or toolset). This distinction allows model implementations to make intelligent caching decisions -- e.g. Anthropic's prompt caching can cache the static prefix while leaving dynamic instructions uncached.

#### Attributes

##### content

The text content of this instruction block.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### dynamic

Whether this instruction came from a dynamic source (function, template, or toolset).

Static instructions (`dynamic=False`) come from literal strings passed to `Agent(instructions=...)`. Dynamic instructions (`dynamic=True`) come from `@agent.instructions` functions, `TemplateStr`, or toolset `get_instructions()` methods.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### part\_kind

Part type identifier, used as a discriminator for deserialization.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['instruction'\] **Default:** `'instruction'`

#### Methods

##### join

`@staticmethod`

```python
def join(parts: Sequence[InstructionPart]) -> str | None
```

Join instruction parts into a single string, separated by double newlines.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### sorted

`@staticmethod`

```python
def sorted(parts: Sequence[InstructionPart]) -> list[InstructionPart]
```

Sort instruction parts with static (`dynamic=False`) before dynamic, preserving relative order.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`InstructionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.InstructionPart)\]

### ModelRequest

A request generated by Pydantic AI and sent to a model, e.g. a message from the Pydantic AI app to the model.

#### Attributes

##### parts

The parts of the user message.

**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`ModelRequestPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelRequestPart)\]

##### timestamp

The timestamp when the request was sent to the model.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### instructions

The instructions string for this request, rendered from structured instruction parts.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### kind

Message type identifier, this is available on all parts as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['request'\] **Default:** `'request'`

##### run\_id

The unique identifier of the agent run in which this message originated.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### conversation\_id

The unique identifier of the conversation this message belongs to.

A conversation spans potentially multiple agent runs that share message history. Emitted as the `gen_ai.conversation.id` OpenTelemetry span attribute on the agent run.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### metadata

Additional data that can be accessed programmatically by the application but is not sent to the LLM.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

#### Methods

##### user\_text\_prompt

`@classmethod`

```python
def user_text_prompt(
    cls,
    user_prompt: str,
    instructions: str | None = None,
) -> ModelRequest
```

Create a `ModelRequest` with a single user prompt as text.

###### Returns

[`ModelRequest`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelRequest)

### TextPart

A plain text response from a model.

#### Attributes

##### content

The text content of the response.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### id

An optional identifier of the text part.

When this field is set, `provider_name` is required to identify the provider that generated this data.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### provider\_name

The name of the provider that generated the response.

Required to be set when `provider_details` or `id` is set.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### provider\_details

Additional data returned by the provider that can't be mapped to standard fields.

This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### part\_kind

Part type identifier, this is available on all parts as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['text'\] **Default:** `'text'`

#### Methods

##### has\_content

```python
def has_content() -> bool
```

Return `True` if the text content is non-empty.

###### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool)

### ThinkingPart

A thinking response from a model.

#### Attributes

##### content

The thinking content of the response.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### id

The identifier of the thinking part.

When this field is set, `provider_name` is required to identify the provider that generated this data.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### signature

The signature of the thinking.

Supported by:

-   Anthropic (corresponds to the `signature` field)
-   Bedrock (corresponds to the `signature` field)
-   Google (corresponds to the `thought_signature` field)
-   OpenAI (corresponds to the `encrypted_content` field)

When this field is set, `provider_name` is required to identify the provider that generated this data.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### provider\_name

The name of the provider that generated the response.

Signatures are only sent back to the same provider. Required to be set when `provider_details`, `id` or `signature` is set.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### provider\_details

Additional data returned by the provider that can't be mapped to standard fields.

This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### part\_kind

Part type identifier, this is available on all parts as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['thinking'\] **Default:** `'thinking'`

#### Methods

##### has\_content

```python
def has_content() -> bool
```

Return `True` if the thinking content is non-empty.

###### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool)

### CompactionPart

A compaction part that summarizes previous conversation history.

Compaction parts contain an opaque or readable summary of prior messages, produced by provider-specific compaction mechanisms. They must be round-tripped back to the same provider in subsequent requests.

For Anthropic, `content` contains a readable text summary. For OpenAI, `content` is `None` and the encrypted data is stored in `provider_details`.

#### Attributes

##### content

The compaction summary text, if available.

For Anthropic: a readable text summary of compacted messages. For OpenAI: `None` (the compacted content is encrypted and stored in `provider_details`).

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### id

The identifier of the compaction part.

When this field is set, `provider_name` is required to identify the provider that generated this data.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### provider\_name

The name of the provider that generated the compaction.

Compaction data is only sent back to the same provider. Required to be set when `provider_details` or `id` is set.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### provider\_details

Additional data returned by the provider that can't be mapped to standard fields.

For OpenAI: contains `encrypted_content` and other fields from `ResponseCompactionItem`. When this field is set, `provider_name` is required to identify the provider that generated this data.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### part\_kind

Part type identifier, this is available on all parts as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['compaction'\] **Default:** `'compaction'`

#### Methods

##### has\_content

```python
def has_content() -> bool
```

Return `True` if the compaction content is non-empty.

###### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool)

### FilePart

A file response from a model.

#### Attributes

##### content

The file content of the response.

**Type:** [`Annotated`](https://docs.python.org/3/library/typing.html#typing.Annotated)\[[`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent), [`pydantic.AfterValidator`](https://docs.pydantic.dev/latest/api/pydantic/functional_validators/#pydantic.functional_validators.AfterValidator)(`BinaryImage.narrow_type`)\]

##### id

The identifier of the file part.

When this field is set, `provider_name` is required to identify the provider that generated this data.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### provider\_name

The name of the provider that generated the response.

Required to be set when `provider_details` or `id` is set.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### provider\_details

Additional data returned by the provider that can't be mapped to standard fields.

This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### part\_kind

Part type identifier, this is available on all parts as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['file'\] **Default:** `'file'`

#### Methods

##### has\_content

```python
def has_content() -> bool
```

Return `True` if the file content is non-empty.

###### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool)

### BaseToolCallPart

A tool call from a model.

#### Attributes

##### tool\_name

The name of the tool to call.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### args

The arguments to pass to the tool.

This is stored either as a JSON string or a Python dictionary depending on how data was received.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### tool\_call\_id

The tool call identifier, this is used by some models including OpenAI.

In case the tool call id is not provided by the model, Pydantic AI will generate a random one.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `field(default_factory=_generate_tool_call_id)`

##### id

An optional identifier of the tool call part, separate from the tool call ID.

This is used by some APIs like OpenAI Responses. When this field is set, `provider_name` is required to identify the provider that generated this data.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### provider\_name

The name of the provider that generated the response.

Builtin tool calls are only sent back to the same provider. Required to be set when `provider_details` or `id` is set.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### provider\_details

Additional data returned by the provider that can't be mapped to standard fields.

This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

#### Methods

##### args\_as\_dict

```python
def args_as_dict(raise_if_invalid: bool = False) -> dict[str, Any]
```

Return the arguments as a Python dictionary.

This is just for convenience with models that require dicts as input.

###### Returns

[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

###### Parameters

**`raise_if_invalid`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

If `True`, a `ValueError` or `AssertionError` caused by malformed JSON in `args` will be re-raised. When `False` (the default), malformed JSON is handled gracefully by returning `{'INVALID_JSON': '<raw args>'}` so that the value can still be sent to a model API (e.g. during a retry flow) without crashing.

##### args\_as\_json\_str

```python
def args_as_json_str() -> str
```

Return the arguments as a JSON string.

This is just for convenience with models that require JSON strings as input.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### has\_content

```python
def has_content() -> bool
```

Return `True` if the tool call has content.

###### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool)

### ToolCallPart

**Bases:** [`BaseToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BaseToolCallPart)

A tool call from a model.

#### Attributes

##### part\_kind

Part type identifier, this is available on all parts as a discriminator. Note that this is different from `ToolCallPartDelta.part_delta_kind`.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['tool-call'\] **Default:** `'tool-call'`

### BuiltinToolCallPart

**Bases:** [`BaseToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BaseToolCallPart)

A tool call to a built-in tool.

#### Attributes

##### part\_kind

Part type identifier, this is available on all parts as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['builtin-tool-call'\] **Default:** `'builtin-tool-call'`

### ModelResponse

A response from a model, e.g. a message from the model to the Pydantic AI app.

#### Attributes

##### parts

The parts of the model message.

**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`ModelResponsePart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponsePart)\]

##### usage

Usage information for the request.

This has a default to make tests easier, and to support loading old messages where usage will be missing.

**Type:** [`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage) **Default:** `field(default_factory=RequestUsage)`

##### model\_name

The name of the model that generated the response.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### timestamp

The timestamp when the response was received locally.

This is always a high-precision local datetime. Provider-specific timestamps (if available) are stored in `provider_details['timestamp']`.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) **Default:** `field(default_factory=_now_utc)`

##### kind

Message type identifier, this is available on all parts as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['response'\] **Default:** `'response'`

##### provider\_name

The name of the LLM provider that generated the response.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### provider\_url

The base URL of the LLM provider that generated the response.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### provider\_details

Additional data returned by the provider that can't be mapped to standard fields.

**Type:** [`Annotated`](https://docs.python.org/3/library/typing.html#typing.Annotated)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None), [`pydantic.Field`](https://docs.pydantic.dev/latest/api/pydantic/fields/#pydantic.fields.Field)(`validation_alias`\=([`pydantic.AliasChoices`](https://docs.pydantic.dev/latest/api/pydantic/aliases/#pydantic.aliases.AliasChoices)(`provider_details`, `vendor_details`)))\] **Default:** `None`

##### provider\_response\_id

request ID as specified by the model provider. This can be used to track the specific request to the model.

**Type:** [`Annotated`](https://docs.python.org/3/library/typing.html#typing.Annotated)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None), [`pydantic.Field`](https://docs.pydantic.dev/latest/api/pydantic/fields/#pydantic.fields.Field)(`validation_alias`\=([`pydantic.AliasChoices`](https://docs.pydantic.dev/latest/api/pydantic/aliases/#pydantic.aliases.AliasChoices)(`provider_response_id`, `vendor_id`)))\] **Default:** `None`

##### finish\_reason

Reason the model finished generating the response, normalized to OpenTelemetry values.

**Type:** [`FinishReason`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FinishReason) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### run\_id

The unique identifier of the agent run in which this message originated.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### conversation\_id

The unique identifier of the conversation this message belongs to.

A conversation spans potentially multiple agent runs that share message history. Emitted as the `gen_ai.conversation.id` OpenTelemetry span attribute on the agent run.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### metadata

Additional data that can be accessed programmatically by the application but is not sent to the LLM.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### state

Lifecycle state of the response.

**Type:** [`ModelResponseState`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponseState) **Default:** `'complete'`

##### text

Get the text in the response.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### thinking

Get the thinking in the response.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### files

Get the files in the response.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent)\]

##### images

Get the images in the response.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`BinaryImage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryImage)\]

##### tool\_calls

Get the tool calls in the response.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart)\]

##### builtin\_tool\_calls

Get the builtin tool calls and results in the response.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[[`BuiltinToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolCallPart), [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart)\]\]

#### Methods

##### price

`@deprecated`

```python
def price() -> genai_types.PriceCalculation
```

###### Returns

`genai_types.PriceCalculation`

##### cost

```python
def cost() -> genai_types.PriceCalculation
```

Calculate the cost of the usage.

Uses [`genai-prices`](https://github.com/pydantic/genai-prices).

###### Returns

`genai_types.PriceCalculation`

##### otel\_events

```python
def otel_events(settings: InstrumentationSettings) -> list[LogRecord]
```

Return OpenTelemetry events for the response.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`LogRecord`\]

### TextPartDelta

A partial update (delta) for a `TextPart` to append new text content.

#### Attributes

##### content\_delta

The incremental text content to add to the existing `TextPart` content.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_name

The name of the provider that generated the response.

This is required to be set when `provider_details` is set and the initial TextPart does not have a `provider_name` or it has changed.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### provider\_details

Additional data returned by the provider that can't be mapped to standard fields.

This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically.

When this field is set, `provider_name` is required to identify the provider that generated this data.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### part\_delta\_kind

Part delta type identifier, used as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['text'\] **Default:** `'text'`

#### Methods

##### apply

```python
def apply(part: ModelResponsePart) -> TextPart
```

Apply this text delta to an existing `TextPart`.

###### Returns

[`TextPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.TextPart) -- A new `TextPart` with updated text content.

###### Parameters

**`part`** : [`ModelResponsePart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponsePart)

The existing model response part, which must be a `TextPart`.

###### Raises

-   `ValueError` -- If `part` is not a `TextPart`.

### ThinkingPartDelta

A partial update (delta) for a `ThinkingPart` to append new thinking content.

#### Attributes

##### content\_delta

The incremental thinking content to add to the existing `ThinkingPart` content.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### signature\_delta

Optional signature delta.

Note this is never treated as a delta -- it can replace None.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### provider\_name

Optional provider name for the thinking part.

Signatures are only sent back to the same provider. Required to be set when `provider_details` is set and the initial ThinkingPart does not have a `provider_name` or it has changed.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### provider\_details

Additional data returned by the provider that can't be mapped to standard fields.

Can be a dict to merge with existing details, or a callable that takes the existing details and returns updated details.

This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically.

When this field is set, `provider_name` is required to identify the provider that generated this data.

**Type:** `ProviderDetailsDelta` **Default:** `None`

##### part\_delta\_kind

Part delta type identifier, used as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['thinking'\] **Default:** `'thinking'`

#### Methods

##### apply

```python
def apply(part: ModelResponsePart) -> ThinkingPart
def apply(
    part: ModelResponsePart | ThinkingPartDelta,
) -> ThinkingPart | ThinkingPartDelta
```

Apply this thinking delta to an existing `ThinkingPart`.

###### Returns

[`ThinkingPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPart) | [`ThinkingPartDelta`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPartDelta) -- A new `ThinkingPart` with updated thinking content.

###### Parameters

**`part`** : [`ModelResponsePart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponsePart) | [`ThinkingPartDelta`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPartDelta)

The existing model response part, which must be a `ThinkingPart`.

###### Raises

-   `ValueError` -- If `part` is not a `ThinkingPart`.

### ToolCallPartDelta

A partial update (delta) for a `ToolCallPart` to modify tool name, arguments, or tool call ID.

#### Attributes

##### tool\_name\_delta

Incremental text to add to the existing tool name, if any.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### args\_delta

Incremental data to add to the tool arguments.

If this is a string, it will be appended to existing JSON arguments. If this is a dict, it will be merged with existing dict arguments.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### tool\_call\_id

Optional tool call identifier, this is used by some models including OpenAI.

Note this is never treated as a delta -- it can replace None, but otherwise if a non-matching value is provided an error will be raised.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### provider\_name

The name of the provider that generated the response.

This is required to be set when `provider_details` is set and the initial ToolCallPart does not have a `provider_name` or it has changed.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### provider\_details

Additional data returned by the provider that can't be mapped to standard fields.

This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically.

When this field is set, `provider_name` is required to identify the provider that generated this data.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### part\_delta\_kind

Part delta type identifier, used as a discriminator. Note that this is different from `ToolCallPart.part_kind`.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['tool\_call'\] **Default:** `'tool_call'`

#### Methods

##### as\_part

```python
def as_part() -> ToolCallPart | None
```

Convert this delta to a fully formed `ToolCallPart` if possible, otherwise return `None`.

###### Returns

[`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart) | [`None`](https://docs.python.org/3/library/constants.html#None) -- A `ToolCallPart` if `tool_name_delta` is set, otherwise `None`.

##### apply

```python
def apply(part: ModelResponsePart) -> ToolCallPart | BuiltinToolCallPart
def apply(
    part: ModelResponsePart | ToolCallPartDelta,
) -> ToolCallPart | BuiltinToolCallPart | ToolCallPartDelta
```

Apply this delta to a part or delta, returning a new part or delta with the changes applied.

###### Returns

[`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart) | [`BuiltinToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolCallPart) | [`ToolCallPartDelta`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPartDelta) -- Either a new `ToolCallPart` or `BuiltinToolCallPart`, or an updated `ToolCallPartDelta`.

###### Parameters

**`part`** : [`ModelResponsePart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponsePart) | [`ToolCallPartDelta`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPartDelta)

The existing model response part or delta to update.

###### Raises

-   `ValueError` -- If `part` is neither a `ToolCallPart`, `BuiltinToolCallPart`, nor a `ToolCallPartDelta`.
-   `UnexpectedModelBehavior` -- If applying JSON deltas to dict arguments or vice versa.

### PartStartEvent

An event indicating that a new part has started.

If multiple `PartStartEvent`s are received with the same index, the new one should fully replace the old one.

#### Attributes

##### index

The index of the part within the overall response parts list.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

##### part

The newly started `ModelResponsePart`.

**Type:** [`ModelResponsePart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponsePart)

##### previous\_part\_kind

The kind of the previous part, if any.

This is useful for UI event streams to know whether to group parts of the same kind together when emitting events.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['text', 'thinking', 'tool-call', 'builtin-tool-call', 'builtin-tool-return', 'compaction', 'file'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### event\_kind

Event type identifier, used as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['part\_start'\] **Default:** `'part_start'`

### PartDeltaEvent

An event indicating a delta update for an existing part.

#### Attributes

##### index

The index of the part within the overall response parts list.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

##### delta

The delta to apply to the specified part.

**Type:** [`ModelResponsePartDelta`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponsePartDelta)

##### event\_kind

Event type identifier, used as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['part\_delta'\] **Default:** `'part_delta'`

### PartEndEvent

An event indicating that a part is complete.

#### Attributes

##### index

The index of the part within the overall response parts list.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

##### part

The complete `ModelResponsePart`.

**Type:** [`ModelResponsePart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponsePart)

##### next\_part\_kind

The kind of the next part, if any.

This is useful for UI event streams to know whether to group parts of the same kind together when emitting events.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['text', 'thinking', 'tool-call', 'builtin-tool-call', 'builtin-tool-return', 'compaction', 'file'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### event\_kind

Event type identifier, used as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['part\_end'\] **Default:** `'part_end'`

### FinalResultEvent

An event indicating the response to the current model request matches the output schema and will produce a result.

#### Attributes

##### tool\_name

The name of the output tool that was called. `None` if the result is from text content and not from a tool.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### tool\_call\_id

The tool call ID, if any, that this result is associated with.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### event\_kind

Event type identifier, used as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['final\_result'\] **Default:** `'final_result'`

### ToolCallEvent

Base class for events emitted when a tool call is about to be invoked.

Match against this in a `case` to handle [`FunctionToolCallEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FunctionToolCallEvent) and [`OutputToolCallEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.OutputToolCallEvent) together.

#### Attributes

##### part

The tool call to make.

**Type:** [`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart)

##### args\_valid

Whether the tool arguments passed validation. See the [custom validation docs](https://ai.pydantic.dev/tools-advanced/#args-validator) for more info.

-   `True`: Schema validation and custom validation (if configured) both passed; args are guaranteed valid.
-   `False`: Validation was performed and failed.
-   `None`: Validation was not performed.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### tool\_call\_id

An ID used for matching details about the call to its result.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

### FunctionToolCallEvent

**Bases:** [`ToolCallEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallEvent)

An event indicating the start to a call to a function tool.

#### Attributes

##### event\_kind

Event type identifier, used as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['function\_tool\_call'\] **Default:** `'function_tool_call'`

##### call\_id

An ID used for matching details about the call to its result.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

### OutputToolCallEvent

**Bases:** [`ToolCallEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallEvent)

An event indicating the start of a call to an output tool (the model's "submit final answer" call).

#### Attributes

##### event\_kind

Event type identifier, used as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['output\_tool\_call'\] **Default:** `'output_tool_call'`

### ToolResultEvent

Base class for events emitted when a tool call has been completed.

Match against this in a `case` to handle [`FunctionToolResultEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FunctionToolResultEvent) and [`OutputToolResultEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.OutputToolResultEvent) together.

#### Attributes

##### part

The tool result part that will be sent back to the model.

**Type:** [`ToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolReturnPart) | [`RetryPromptPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.RetryPromptPart)

##### tool\_call\_id

An ID used to match the result to its original call.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

### FunctionToolResultEvent

**Bases:** [`ToolResultEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolResultEvent)

An event indicating the result of a function tool call.

#### Attributes

##### event\_kind

Event type identifier, used as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['function\_tool\_result'\] **Default:** `'function_tool_result'`

##### content

The content that will be sent to the model as a UserPromptPart following the result.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `content`

##### result

The tool result part that will be sent back to the model.

**Type:** [`ToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolReturnPart) | [`RetryPromptPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.RetryPromptPart)

### OutputToolResultEvent

**Bases:** [`ToolResultEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolResultEvent)

An event indicating the result of an output tool call.

#### Attributes

##### event\_kind

Event type identifier, used as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['output\_tool\_result'\] **Default:** `'output_tool_result'`

### BuiltinToolCallEvent

An event indicating the start to a call to a built-in tool.

#### Attributes

##### part

The built-in tool call to make.

**Type:** [`BuiltinToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolCallPart)

##### event\_kind

Event type identifier, used as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['builtin\_tool\_call'\] **Default:** `'builtin_tool_call'`

### BuiltinToolResultEvent

An event indicating the result of a built-in tool call.

#### Attributes

##### result

The result of the call to the built-in tool.

**Type:** [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart)

##### event\_kind

Event type identifier, used as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['builtin\_tool\_result'\] **Default:** `'builtin_tool_result'`

### is\_multi\_modal\_content

```python
def is_multi_modal_content(obj: Any) -> TypeGuard[MultiModalContent]
```

Check if obj is a MultiModalContent type, enabling type narrowing.

#### Returns

[`TypeGuard`](https://docs.python.org/3/library/typing.html#typing.TypeGuard)\[[`MultiModalContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.MultiModalContent)\]

### FinishReason

Reason the model finished generating the response, normalized to OpenTelemetry values.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Literal['stop', 'length', 'content_filter', 'tool_call', 'error']`

### ModelResponseState

Lifecycle state of a model response.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Literal['complete', 'interrupted']`

### ForceDownloadMode

Type for the force\_download parameter on FileUrl subclasses.

-   `False`: The URL is sent directly to providers that support it. For providers that don't, the file is downloaded with SSRF protection (blocks private IPs and cloud metadata).
-   `True`: The file is always downloaded with SSRF protection (blocks private IPs and cloud metadata).
-   `'allow-local'`: The file is always downloaded, allowing private IPs but still blocking cloud metadata.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `bool | Literal['allow-local']`

### ProviderDetailsDelta

Type for provider\_details input: can be a static dict, a callback to update existing details, or None.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `dict[str, Any] | Callable[[dict[str, Any] | None], dict[str, Any]] | None`

### UploadedFileProviderName

Provider names supported by [`UploadedFile`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.UploadedFile).

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Literal['anthropic', 'openai', 'google-gla', 'google-vertex', 'bedrock', 'xai']`

### MultiModalContent

Union of all multi-modal content types with a discriminator for Pydantic validation.

**Default:** `Annotated[ImageUrl | AudioUrl | DocumentUrl | VideoUrl | BinaryContent | UploadedFile, pydantic.Discriminator('kind')]`

### RETURN\_VALUE\_KEY

Key used to wrap non-dict tool return values in `model_response_object()`.

**Default:** `'return_value'`

### ModelRequestPart

A message part sent by Pydantic AI to a model.

**Default:** `Annotated[SystemPromptPart | UserPromptPart | ToolReturnPart | RetryPromptPart, pydantic.Discriminator('part_kind')]`

### ModelResponsePart

A message part returned by a model.

**Default:** `Annotated[TextPart | ToolCallPart | BuiltinToolCallPart | BuiltinToolReturnPart | ThinkingPart | CompactionPart | FilePart, pydantic.Discriminator('part_kind')]`

### ModelMessage

Any message sent to or returned by a model.

**Default:** `Annotated[ModelRequest | ModelResponse, pydantic.Discriminator('kind')]`

### ModelMessagesTypeAdapter

Pydantic [`TypeAdapter`](https://docs.pydantic.dev/latest/api/pydantic/type_adapter/#pydantic.type_adapter.TypeAdapter) for (de)serializing messages.

**Default:** `pydantic.TypeAdapter(list[ModelMessage], config=(pydantic.ConfigDict(defer_build=True, ser_json_bytes='base64', val_json_bytes='base64')))`

### ModelResponsePartDelta

A partial update (delta) for any model response part.

**Default:** `Annotated[TextPartDelta | ThinkingPartDelta | ToolCallPartDelta, pydantic.Discriminator('part_delta_kind')]`

### ModelResponseStreamEvent

An event in the model response stream, starting a new part, applying a delta to an existing one, indicating a part is complete, or indicating the final result.

**Default:** `Annotated[PartStartEvent | PartDeltaEvent | PartEndEvent | FinalResultEvent, pydantic.Discriminator('event_kind')]`

### HandleResponseEvent

An event yielded when handling a model response, indicating tool calls and results.

**Default:** `Annotated[FunctionToolCallEvent | FunctionToolResultEvent | OutputToolCallEvent | OutputToolResultEvent | BuiltinToolCallEvent | BuiltinToolResultEvent, pydantic.Discriminator('event_kind')]`

### AgentStreamEvent

An event in the agent stream: model response stream events and response-handling events.

**Default:** `Annotated[ModelResponseStreamEvent | HandleResponseEvent, pydantic.Discriminator('event_kind')]`

---

# [pydantic_ai.output](https://pydantic.dev/docs/ai/api/pydantic-ai/output/)

# pydantic\_ai.output

### ToolOutput

**Bases:** `Generic[OutputDataT]`

Marker class to use a tool for output and optionally customize the tool.

Example:

tool\_output.py

```python
from pydantic import BaseModel

from pydantic_ai import Agent, ToolOutput


class Fruit(BaseModel):
    name: str
    color: str


class Vehicle(BaseModel):
    name: str
    wheels: int


agent = Agent(
    'openai:gpt-5.2',
    output_type=[
        ToolOutput(Fruit, name='return_fruit'),
        ToolOutput(Vehicle, name='return_vehicle'),
    ],
)
result = agent.run_sync('What is a banana?')
print(repr(result.output))
#> Fruit(name='banana', color='yellow')
```

#### Attributes

##### output

An output type or function.

**Type:** `OutputTypeOrFunction`\[`OutputDataT`\] **Default:** `type_`

##### name

The name of the tool that will be passed to the model. If not specified and only one output is provided, `final_result` will be used. If multiple outputs are provided, the name of the output type or function will be added to the tool name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `name`

##### description

The description of the tool that will be passed to the model. If not specified, the docstring of the output type or function will be used.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `description`

##### max\_retries

Per-tool retry limit for this output tool.

Overrides the agent-level `output_retries` default (which itself acts as the per-tool default for output tools that do not specify their own limit). If not set, the agent-level value is used.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `max_retries`

##### strict

Whether to use strict mode for the tool.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `strict`

### NativeOutput

**Bases:** `Generic[OutputDataT]`

Marker class to use the model's native structured outputs functionality for outputs and optionally customize the name and description.

Example:

native\_output.py

```python
from pydantic_ai import Agent, NativeOutput

from tool_output import Fruit, Vehicle

agent = Agent(
    'openai:gpt-5.2',
    output_type=NativeOutput(
        [Fruit, Vehicle],
        name='Fruit or vehicle',
        description='Return a fruit or vehicle.'
    ),
)
result = agent.run_sync('What is a Ford Explorer?')
print(repr(result.output))
#> Vehicle(name='Ford Explorer', wheels=4)
```

#### Attributes

##### outputs

The output types or functions.

**Type:** `OutputTypeOrFunction`\[`OutputDataT`\] | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`OutputTypeOrFunction`\[`OutputDataT`\]\] **Default:** `outputs`

##### name

The name of the structured output that will be passed to the model. If not specified and only one output is provided, the name of the output type or function will be used.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `name`

##### description

The description of the structured output that will be passed to the model. If not specified and only one output is provided, the docstring of the output type or function will be used.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `description`

##### strict

Whether to use strict mode for the output, if the model supports it.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `strict`

##### template

Template for the prompt passed to the model. The '{schema}' placeholder will be replaced with the output JSON schema. If no template is specified but the model's profile indicates that it requires the schema to be sent as a prompt, the default template specified on the profile will be used. Set to `False` to disable the schema prompt entirely.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\[[`False`](https://docs.python.org/3/library/constants.html#False)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `template`

### PromptedOutput

**Bases:** `Generic[OutputDataT]`

Marker class to use a prompt to tell the model what to output and optionally customize the prompt.

Example:

prompted\_output.py

```python
from pydantic import BaseModel

from pydantic_ai import Agent, PromptedOutput

from tool_output import Vehicle


class Device(BaseModel):
    name: str
    kind: str


agent = Agent(
    'openai:gpt-5.2',
    output_type=PromptedOutput(
        [Vehicle, Device],
        name='Vehicle or device',
        description='Return a vehicle or device.'
    ),
)
result = agent.run_sync('What is a MacBook?')
print(repr(result.output))
#> Device(name='MacBook', kind='laptop')

agent = Agent(
    'openai:gpt-5.2',
    output_type=PromptedOutput(
        [Vehicle, Device],
        template='Gimme some JSON: {schema}'
    ),
)
result = agent.run_sync('What is a Ford Explorer?')
print(repr(result.output))
#> Vehicle(name='Ford Explorer', wheels=4)
```

#### Attributes

##### outputs

The output types or functions.

**Type:** `OutputTypeOrFunction`\[`OutputDataT`\] | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`OutputTypeOrFunction`\[`OutputDataT`\]\] **Default:** `outputs`

##### name

The name of the structured output that will be passed to the model. If not specified and only one output is provided, the name of the output type or function will be used.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `name`

##### description

The description that will be passed to the model. If not specified and only one output is provided, the docstring of the output type or function will be used.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `description`

##### template

Template for the prompt passed to the model. The '{schema}' placeholder will be replaced with the output JSON schema. If not specified, the default template specified on the model's profile will be used. Set to `False` to disable the schema prompt entirely.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\[[`False`](https://docs.python.org/3/library/constants.html#False)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `template`

### TextOutput

**Bases:** `Generic[OutputDataT]`

Marker class to use text output for an output function taking a string argument.

Example:

```python
from pydantic_ai import Agent, TextOutput


def split_into_words(text: str) -> list[str]:
    return text.split()


agent = Agent(
    'openai:gpt-5.2',
    output_type=TextOutput(split_into_words),
)
result = agent.run_sync('Who was Albert Einstein?')
print(result.output)
#> ['Albert', 'Einstein', 'was', 'a', 'German-born', 'theoretical', 'physicist.']
```

#### Attributes

##### output\_function

The function that will be called to process the model's plain text output. The function must take a single string argument.

**Type:** `TextOutputFunc`\[`OutputDataT`\]

### DeferredToolRequests

Tool calls that require approval or external execution.

This can be used as an agent's `output_type` and will be used as the output of the agent run if the model called any deferred tools.

Results can be passed to the next agent run using a [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) object with the same tool call IDs.

See [deferred tools docs](/docs/ai/tools-toolsets/deferred-tools#deferred-tools) for more information.

#### Attributes

##### calls

Tool calls that require external execution.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart)\] **Default:** `field(default_factory=(list[ToolCallPart]))`

##### approvals

Tool calls that require human-in-the-loop approval.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart)\] **Default:** `field(default_factory=(list[ToolCallPart]))`

##### metadata

Metadata for deferred tool calls, keyed by `tool_call_id`.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] **Default:** `field(default_factory=(dict[str, dict[str, Any]]))`

#### Methods

##### build\_results

```python
def build_results(
    approvals: dict[str, bool | DeferredToolApprovalResult] | None = None,
    calls: dict[str, DeferredToolCallResult | Any] | None = None,
    metadata: dict[str, dict[str, Any]] | None = None,
    approve_all: bool = False,
) -> DeferredToolResults
```

Create a [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) for these requests.

###### Returns

[`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults)

###### Parameters

**`approvals`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`bool`](https://docs.python.org/3/library/functions.html#bool) | `DeferredToolApprovalResult`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Results for tool calls that required approval. Keys must match `tool_call_id`s in `self.approvals`.

**`calls`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `DeferredToolCallResult` | [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Results for tool calls that required external execution. Keys must match `tool_call_id`s in `self.calls`.

**`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Per-call metadata, keyed by `tool_call_id`.

**`approve_all`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

If `True`, every approval-requesting call not already listed in `approvals` is approved (with default `ToolApproved()`).

###### Raises

-   `ValueError` -- If a key in `approvals`/`calls` doesn't match a pending request of the appropriate kind.

##### remaining

```python
def remaining(results: DeferredToolResults) -> DeferredToolRequests | None
```

Return unresolved requests after applying results, or `None` if all resolved.

###### Returns

[`DeferredToolRequests`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolRequests) | [`None`](https://docs.python.org/3/library/constants.html#None)

### StructuredDict

```python
def StructuredDict(
    json_schema: JsonSchemaValue,
    name: str | None = None,
    description: str | None = None,
) -> type[JsonSchemaValue]
```

Returns a `dict[str, Any]` subclass with a JSON schema attached that will be used for structured output.

Example:

structured\_dict.py

```python
from pydantic_ai import Agent, StructuredDict

schema = {
    'type': 'object',
    'properties': {
        'name': {'type': 'string'},
        'age': {'type': 'integer'}
    },
    'required': ['name', 'age']
}

agent = Agent('openai:gpt-5.2', output_type=StructuredDict(schema))
result = agent.run_sync('Create a person')
print(result.output)
#> {'name': 'John Doe', 'age': 30}
```

#### Returns

[`type`](https://docs.python.org/3/glossary.html#term-type)\[`JsonSchemaValue`\]

#### Parameters

**`json_schema`** : `JsonSchemaValue`

A JSON schema of type `object` defining the structure of the dictionary content.

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional name of the structured output. If not provided, the `title` field of the JSON schema will be used if it's present.

**`description`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional description of the structured output. If not provided, the `description` field of the JSON schema will be used if it's present.

### OutputDataT

Covariant type variable for the output data type of a run.

**Default:** `TypeVar('OutputDataT', default=str, covariant=True)`

---

# [pydantic_ai.profiles](https://pydantic.dev/docs/ai/api/pydantic-ai/profiles/)

# pydantic\_ai.profiles

### ModelProfile

Describes how requests to and responses from specific models or families of models need to be constructed and processed to get the best results, independent of the model and provider classes used.

#### Attributes

##### supports\_tools

Whether the model supports tools.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `True`

##### supports\_tool\_return\_schema

Whether the model natively supports tool return schemas.

When True, the model's API accepts a structured return schema alongside each tool definition. When False, return schemas are injected as JSON text into tool descriptions as a fallback.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### supports\_json\_schema\_output

Whether the model supports JSON schema output.

This is also referred to as 'native' support for structured output. Relates to the `NativeOutput` output type.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### supports\_json\_object\_output

Whether the model supports a dedicated mode to enforce JSON output, without necessarily sending a schema.

E.g. [OpenAI's JSON mode](https://platform.openai.com/docs/guides/structured-outputs#json-mode) Relates to the `PromptedOutput` output type.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### supports\_image\_output

Whether the model supports image output.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### default\_structured\_output\_mode

The default structured output mode to use for the model.

**Type:** `StructuredOutputMode` **Default:** `'tool'`

##### prompted\_output\_template

The instructions template to use for prompted structured output. The '{schema}' placeholder will be replaced with the JSON schema for the output.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `dedent("\n Always respond with a JSON object that's compatible with this schema:\n\n {schema}\n\n Don't include any text or Markdown fencing before or after.\n ")`

##### native\_output\_requires\_schema\_in\_instructions

Whether to add prompted output template in native structured output mode

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### json\_schema\_transformer

The transformer to use to make JSON schemas for tools and structured output compatible with the model.

**Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[`JsonSchemaTransformer`\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### supports\_thinking

Whether the model supports thinking/reasoning configuration.

When False, the unified `thinking` setting in `ModelSettings` is silently ignored.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### thinking\_always\_enabled

Whether the model always uses thinking/reasoning (e.g., OpenAI o-series, DeepSeek R1).

When True, `thinking=False` is silently ignored since the model cannot disable thinking. Implies `supports_thinking=True`.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### thinking\_tags

The tags used to indicate thinking parts in the model's output. Defaults to ('<think>', '</think>').

**Type:** [`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`str`](https://docs.python.org/3/library/stdtypes.html#str)\] **Default:** `('<think>', '</think>')`

##### ignore\_streamed\_leading\_whitespace

Whether to ignore leading whitespace when streaming a response.

This is a workaround for models that emit \`<think> </think>

`or an empty text part ahead of tool calls (e.g. Ollama + Qwen3), which we don't want to end up treating as a final result when using`run\_stream`with`str`a valid`output\_type\`.

This is currently only used by `OpenAIChatModel`, `HuggingFaceModel`, and `GroqModel`.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### supported\_builtin\_tools

The set of builtin tool types that this model/profile supports.

Defaults to ALL builtin tools. Profile functions should explicitly restrict this based on model capabilities.

**Type:** [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`AbstractBuiltinTool`\]\] **Default:** `field(default_factory=(lambda: SUPPORTED_BUILTIN_TOOLS))`

#### Methods

##### from\_profile

`@classmethod`

```python
def from_profile(cls, profile: ModelProfile | None) -> Self
```

Build a ModelProfile subclass instance from a ModelProfile instance.

###### Returns

[`Self`](https://docs.python.org/3/library/typing.html#typing.Self)

##### update

```python
def update(profile: ModelProfile | None) -> Self
```

Update this ModelProfile (subclass) instance with the non-default values from another ModelProfile instance.

###### Returns

[`Self`](https://docs.python.org/3/library/typing.html#typing.Self)

### OpenAIModelProfile

**Bases:** [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile)

Profile for models used with `OpenAIChatModel`.

ALL FIELDS MUST BE `openai_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS.

#### Attributes

##### openai\_chat\_thinking\_field

Non-standard field name used by some providers for model thinking content in Chat Completions API responses.

Plenty of providers use custom field names for thinking content. Ollama and newer versions of vLLM use `reasoning`, while DeepSeek, older vLLM and some others use `reasoning_content`.

Notice that the thinking field configured here is currently limited to `str` type content.

If `openai_chat_send_back_thinking_parts` is set to `'field'`, this field must be set to a non-None value.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### openai\_chat\_send\_back\_thinking\_parts

Whether the model includes thinking content in requests.

This can be:

-   `'auto'` (default): Automatically detects how to send thinking content. If thinking was received in a custom field (tracked via `ThinkingPart.id` and `ThinkingPart.provider_name`), it's sent back in that same field. Otherwise, it's sent using tags. Only the `reasoning` and `reasoning_content` fields are checked by default when receiving responses. If your provider uses a different field name, you must explicitly set `openai_chat_thinking_field` to that field name.
-   `'tags'`: The thinking content is included in the main `content` field, enclosed within thinking tags as specified in `thinking_tags` profile option.
-   `'field'`: The thinking content is included in a separate field specified by `openai_chat_thinking_field`.
-   `False`: No thinking content is sent in the request.

Defaults to `'auto'` to ensure thinking is sent back in the format expected by the model/provider.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['auto', 'tags', 'field', [`False`](https://docs.python.org/3/library/constants.html#False)\] **Default:** `'auto'`

##### openai\_supports\_strict\_tool\_definition

This can be set by a provider or user if the OpenAI-"compatible" API doesn't support strict tool definitions.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `True`

##### openai\_supports\_sampling\_settings

Turn off to don't send sampling settings like `temperature` and `top_p` to models that don't support them, like OpenAI's o-series reasoning models.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `True`

##### openai\_unsupported\_model\_settings

A list of model settings that are not supported by this model.

**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] **Default:** `()`

##### openai\_supports\_tool\_choice\_required

Whether the provider accepts the value `tool_choice='required'` in the request payload.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `True`

##### openai\_system\_prompt\_role

The role to use for the system prompt message. If not provided, defaults to `'system'`.

**Type:** `OpenAISystemPromptRole` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### openai\_chat\_supports\_web\_search

Whether the model supports web search in Chat Completions API.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### openai\_chat\_audio\_input\_encoding

The encoding to use for audio input in Chat Completions requests.

-   `'base64'`: Raw base64 encoded string. (Default, used by OpenAI)
-   `'uri'`: Data URI (e.g. `data:audio/wav;base64,...`).

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['base64', 'uri'\] **Default:** `'base64'`

##### openai\_chat\_supports\_file\_urls

Whether the Chat API supports file URLs directly in the `file_data` field.

OpenAI's native Chat API only supports base64-encoded data, but some providers like OpenRouter support passing URLs directly.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### openai\_supports\_encrypted\_reasoning\_content

Whether the model supports including encrypted reasoning content in the response.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### openai\_supports\_reasoning

Whether the model supports reasoning (o-series, GPT-5+).

When True, sampling parameters may need to be dropped depending on reasoning\_effort setting.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### openai\_supports\_reasoning\_effort\_none

Whether the model supports sampling parameters (temperature, top\_p, etc.) when reasoning\_effort='none'.

Models like GPT-5.1 and GPT-5.2 default to reasoning\_effort='none' and support sampling params in that mode. When reasoning is enabled (low/medium/high/xhigh), sampling params are not supported.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### openai\_responses\_requires\_function\_call\_status\_none

Whether the Responses API requires the `status` field on function tool calls to be `None`.

This is required by vLLM Responses API versions before [https://github.com/vllm-project/vllm/pull/26706](https://github.com/vllm-project/vllm/pull/26706). See [https://github.com/pydantic/pydantic-ai/issues/3245](https://github.com/pydantic/pydantic-ai/issues/3245) for more details.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### openai\_supports\_phase

Whether the Responses API supports the `phase` field on assistant messages.

`phase` labels an assistant message as intermediate `commentary` or the `final_answer`. When the model supports it, OpenAI recommends preserving and sending it back unchanged on every assistant message in follow-up requests; dropping it can cause preambles to be interpreted as final answers and degrade behavior in long-running or tool-heavy flows.

Supported by `gpt-5.3-codex`, `gpt-5.4` and later mainline models. The official OpenAI Responses API silently ignores the field on older models, but defaults to `False` so we don't risk sending an unrecognized field to OpenAI-compatible APIs (vLLM, Bifrost, ...) that haven't been verified to accept it.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### openai\_chat\_supports\_document\_input

Whether the Chat Completions API supports document content parts (`type='file'`).

Some OpenAI-compatible providers (e.g. Azure) do not support document input via the Chat Completions API.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `True`

### OpenAIJsonSchemaTransformer

**Bases:** `JsonSchemaTransformer`

Recursively handle the schema to make it compatible with OpenAI strict mode.

See [https://platform.openai.com/docs/guides/function-calling?api-mode=responses#strict-mode](https://platform.openai.com/docs/guides/function-calling?api-mode=responses#strict-mode) for more details, but this basically just requires:

-   `additionalProperties` must be set to false for each object in the parameters
-   all fields in properties must be marked as required

### openai\_model\_profile

```python
def openai_model_profile(model_name: str) -> ModelProfile
```

Get the model profile for an OpenAI model.

#### Returns

[`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile)

### OPENAI\_REASONING\_EFFORT\_MAP

Maps unified thinking values to OpenAI reasoning\_effort strings.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`ThinkingLevel`, [`str`](https://docs.python.org/3/library/stdtypes.html#str)\] **Default:** `{True: 'medium', False: 'none', 'minimal': 'minimal', 'low': 'low', 'medium': 'medium', 'high': 'high', 'xhigh': 'xhigh'}`

### SAMPLING\_PARAMS

Sampling parameter names that are incompatible with reasoning.

These parameters are not supported when reasoning is enabled (reasoning\_effort != 'none'). See [https://platform.openai.com/docs/guides/reasoning](https://platform.openai.com/docs/guides/reasoning) for details.

**Default:** `('temperature', 'top_p', 'presence_penalty', 'frequency_penalty', 'logit_bias', 'openai_logprobs', 'openai_top_logprobs')`

### AnthropicModelProfile

**Bases:** [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile)

Profile for models used with `AnthropicModel`.

ALL FIELDS MUST BE `anthropic_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS.

#### Attributes

##### anthropic\_supports\_fast\_speed

Whether the model supports fast inference speed (`anthropic_speed='fast'`).

Currently only Claude Opus 4.6 supports fast mode. See the Anthropic docs for the latest list.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### anthropic\_supports\_adaptive\_thinking

Whether the model supports adaptive thinking (Sonnet 4.6+, Opus 4.6+).

When True, unified `thinking` translates to `{'type': 'adaptive'}`. When False, it translates to `{'type': 'enabled', 'budget_tokens': N}`.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### anthropic\_supports\_effort

Whether the model supports the `effort` parameter in `output_config` (Opus 4.5+, Sonnet 4.6+).

When True and the unified thinking level is a string (e.g. 'high'), it is also mapped to `output_config.effort`.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### anthropic\_supports\_xhigh\_effort

Whether the model supports the `xhigh` effort value in `output_config`.

Claude Opus 4.7 adds `xhigh`; older Anthropic models should use `max` instead.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### anthropic\_disallows\_budget\_thinking

Whether the model rejects budget-based thinking settings.

Claude Opus 4.7+ requires adaptive thinking and returns a 400 for `{'type': 'enabled', 'budget_tokens': ...}`.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### anthropic\_disallows\_sampling\_settings

Whether the model rejects sampling settings like `temperature` and `top_p`.

Claude Opus 4.7+ requires these settings to be omitted from request payloads.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### anthropic\_default\_code\_execution\_tool\_version

The Anthropic code execution tool version used when `anthropic_code_execution_tool_version='auto'`.

**Type:** `AnthropicCodeExecutionToolVersion` **Default:** `'20250825'`

##### anthropic\_supported\_code\_execution\_tool\_versions

The Anthropic code execution tool versions supported by the model.

**Type:** [`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[`AnthropicCodeExecutionToolVersion`, ...\] **Default:** `('20250825',)`

##### anthropic\_supports\_task\_budgets

Whether the model supports `output_config.task_budget`.

Anthropic currently documents task budgets as a Claude Opus 4.7 beta feature.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

### anthropic\_model\_profile

```python
def anthropic_model_profile(model_name: str) -> ModelProfile | None
```

Get the model profile for an Anthropic model.

#### Returns

[`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None)

### AnthropicCodeExecutionToolVersion

Concrete Anthropic code execution tool version to send for `CodeExecutionTool`.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Literal['20250825', '20260120']`

### ANTHROPIC\_THINKING\_BUDGET\_MAP

Maps unified thinking values to Anthropic budget\_tokens for non-adaptive models.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`ThinkingLevel`, [`int`](https://docs.python.org/3/library/functions.html#int)\] **Default:** `{True: 10000, 'minimal': 1024, 'low': 2048, 'medium': 10000, 'high': 16384, 'xhigh': 32768}`

### GoogleModelProfile

**Bases:** [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile)

Profile for models used with `GoogleModel`.

ALL FIELDS MUST BE `google_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS.

#### Attributes

##### google\_supports\_native\_output\_with\_builtin\_tools

Whether the model supports native output with builtin tools. See [https://ai.google.dev/gemini-api/docs/structured-output?example=recipe#structured\_outputs\_with\_tools](https://ai.google.dev/gemini-api/docs/structured-output?example=recipe#structured_outputs_with_tools)

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### google\_supported\_mime\_types\_in\_tool\_returns

MIME types supported in native FunctionResponseDict.parts. See [https://ai.google.dev/gemini-api/docs/function-calling#multimodal-function-responses](https://ai.google.dev/gemini-api/docs/function-calling#multimodal-function-responses)

**Type:** [`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), ...\] **Default:** `()`

##### google\_supports\_thinking\_level

Whether the model uses `thinking_level` (enum: LOW/MEDIUM/HIGH) instead of `thinking_budget` (int).

Gemini 3+ models use `thinking_level`; Gemini 2.5 uses `thinking_budget`.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

### GoogleJsonSchemaTransformer

**Bases:** `JsonSchemaTransformer`

Transforms the JSON Schema from Pydantic to be suitable for Gemini.

Gemini supports [a subset of OpenAPI v3.0.3](https://ai.google.dev/gemini-api/docs/function-calling#function_declarations).

### google\_model\_profile

```python
def google_model_profile(model_name: str) -> ModelProfile | None
```

Get the model profile for a Google model.

#### Returns

[`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None)

### meta\_model\_profile

```python
def meta_model_profile(model_name: str) -> ModelProfile | None
```

Get the model profile for a Meta model.

#### Returns

[`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None)

### amazon\_model\_profile

```python
def amazon_model_profile(model_name: str) -> ModelProfile | None
```

Get the model profile for an Amazon model.

#### Returns

[`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None)

### deepseek\_model\_profile

```python
def deepseek_model_profile(model_name: str) -> ModelProfile | None
```

Get the model profile for a DeepSeek model.

#### Returns

[`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None)

### GrokModelProfile

**Bases:** [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile)

Profile for Grok models (used with both GrokProvider and XaiProvider).

ALL FIELDS MUST BE `grok_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS.

#### Attributes

##### grok\_supports\_builtin\_tools

Whether the model supports builtin tools (web\_search, x\_search, code\_execution, mcp).

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### grok\_supports\_tool\_choice\_required

Whether the provider accepts the value `tool_choice='required'` in the request payload.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `True`

### grok\_model\_profile

```python
def grok_model_profile(model_name: str) -> ModelProfile | None
```

Get the model profile for a Grok model.

#### Returns

[`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None)

### mistral\_model\_profile

```python
def mistral_model_profile(model_name: str) -> ModelProfile | None
```

Get the model profile for a Mistral model.

#### Returns

[`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None)

### qwen\_model\_profile

```python
def qwen_model_profile(model_name: str) -> ModelProfile | None
```

Get the model profile for a Qwen model.

#### Returns

[`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None)

---

# [pydantic_ai.providers](https://pydantic.dev/docs/ai/api/pydantic-ai/providers/)

# pydantic\_ai.providers

### Provider

**Bases:** `ABC`, `Generic[InterfaceClient]`

Abstract class for a provider.

The provider is in charge of providing an authenticated client to the API.

Each provider only supports a specific interface. An interface can be supported by multiple providers.

For example, the `OpenAIChatModel` interface can be supported by the `OpenAIProvider` and the `DeepSeekProvider`.

When used as an async context manager, providers that create their own HTTP client will close it on exit. This is handled automatically when using [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) as a context manager.

#### Attributes

##### name

The provider name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### base\_url

The base URL for the provider API.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### client

The client for the provider.

**Type:** `InterfaceClient`

#### Methods

##### model\_profile

`@staticmethod`

```python
def model_profile(model_name: str) -> ModelProfile | None
```

The model profile for the named model, if available.

###### Returns

[`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None)

### gateway\_provider

```python
def gateway_provider(
    upstream_provider: Literal['openai', 'openai-chat', 'openai-responses', 'chat', 'responses'],
    route: str | None = None,
    api_key: str | None = None,
    base_url: str | None = None,
    http_client: httpx.AsyncClient | None = None,
) -> Provider[AsyncOpenAI]
def gateway_provider(
    upstream_provider: Literal['groq'],
    route: str | None = None,
    api_key: str | None = None,
    base_url: str | None = None,
    http_client: httpx.AsyncClient | None = None,
) -> Provider[AsyncGroq]
def gateway_provider(
    upstream_provider: Literal['anthropic'],
    route: str | None = None,
    api_key: str | None = None,
    base_url: str | None = None,
    http_client: httpx.AsyncClient | None = None,
) -> Provider[AsyncAnthropicClient]
def gateway_provider(
    upstream_provider: Literal['bedrock', 'converse'],
    route: str | None = None,
    api_key: str | None = None,
    base_url: str | None = None,
) -> Provider[BaseClient]
def gateway_provider(
    upstream_provider: Literal['gemini', 'google-vertex'],
    route: str | None = None,
    api_key: str | None = None,
    base_url: str | None = None,
    http_client: httpx.AsyncClient | None = None,
) -> Provider[GoogleClient]
def gateway_provider(
    upstream_provider: str,
    route: str | None = None,
    api_key: str | None = None,
    base_url: str | None = None,
) -> Provider[Any]
```

Create a new Gateway provider.

#### Returns

`Provider`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

#### Parameters

**`upstream_provider`** : `UpstreamProvider` | [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The upstream provider to use.

**`route`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The name of the provider or routing group to use to handle the request. If not provided, the default routing group for the API format will be used.

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The API key to use for authentication. If not provided, the `PYDANTIC_AI_GATEWAY_API_KEY` environment variable will be used if available.

**`base_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The base URL to use for the Gateway. If not provided, the `PYDANTIC_AI_GATEWAY_BASE_URL` environment variable will be used if available. Otherwise, defaults to `https://gateway.pydantic.dev/proxy`.

**`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The HTTP client to use for the Gateway.

### AnthropicProvider

**Bases:** `Provider[AsyncAnthropicClient]`

Provider for Anthropic API.

#### Methods

##### \_\_init\_\_

```python
def __init__(anthropic_client: AsyncAnthropicClient | None = None) -> None
def __init__(
    api_key: str | None = None,
    base_url: str | None = None,
    http_client: httpx.AsyncClient | None = None,
) -> None
```

Create a new Anthropic provider.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The API key to use for authentication, if not provided, the `ANTHROPIC_API_KEY` environment variable will be used if available.

**`base_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The base URL to use for the Anthropic API.

**`anthropic_client`** : `AsyncAnthropicClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing Anthropic client to use. Accepts [`AsyncAnthropic`](https://github.com/anthropics/anthropic-sdk-python), [`AsyncAnthropicBedrock`](https://docs.anthropic.com/en/api/claude-on-amazon-bedrock), [`AsyncAnthropicFoundry`](https://platform.claude.com/docs/en/build-with-claude/claude-in-microsoft-foundry), or [`AsyncAnthropicVertex`](https://docs.anthropic.com/en/api/claude-on-vertex-ai). If provided, the `api_key` and `http_client` arguments will be ignored.

**`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing `httpx.AsyncClient` to use for making HTTP requests.

### GoogleProvider

**Bases:** `Provider[Client]`

Provider for Google.

#### Methods

##### \_\_init\_\_

```python
def __init__(
    api_key: str,
    http_client: httpx.AsyncClient | None = None,
    base_url: str | None = None,
) -> None
def __init__(
    credentials: Credentials | None = None,
    project: str | None = None,
    location: VertexAILocation | Literal['global'] | str | None = None,
    http_client: httpx.AsyncClient | None = None,
    base_url: str | None = None,
) -> None
def __init__(client: Client) -> None
def __init__(
    vertexai: bool = False,
    api_key: str | None = None,
    http_client: httpx.AsyncClient | None = None,
    base_url: str | None = None,
) -> None
```

Create a new Google provider.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The `API key <https://ai.google.dev/gemini-api/docs/api-key>`\_ to use for authentication. It can also be set via the `GOOGLE_API_KEY` environment variable.

**`credentials`** : `Credentials` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The credentials to use for authentication when calling the Vertex AI APIs. Credentials can be obtained from environment variables and default credentials. For more information, see Set up Application Default Credentials. Applies to the Vertex AI API only.

**`project`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The Google Cloud project ID to use for quota. Can be obtained from environment variables (for example, GOOGLE\_CLOUD\_PROJECT). Applies to the Vertex AI API only.

**`location`** : `VertexAILocation` | [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['global'\] | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The location to send API requests to (for example, us-central1). Can be obtained from environment variables. Applies to the Vertex AI API only.

**`vertexai`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Force the use of the Vertex AI API. If `False`, the Google Generative Language API will be used. Defaults to `False` unless `location`, `project`, or `credentials` are provided.

**`client`** : `Client` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A pre-initialized client to use.

**`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing `httpx.AsyncClient` to use for making HTTP requests.

**`base_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The base URL for the Google API.

### VertexAILocation

Regions available for Vertex AI. More details [here](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations#genai-locations).

**Default:** `Literal['asia-east1', 'asia-east2', 'asia-northeast1', 'asia-northeast3', 'asia-south1', 'asia-southeast1', 'australia-southeast1', 'europe-central2', 'europe-north1', 'europe-southwest1', 'europe-west1', 'europe-west2', 'europe-west3', 'europe-west4', 'europe-west6', 'europe-west8', 'europe-west9', 'me-central1', 'me-central2', 'me-west1', 'northamerica-northeast1', 'southamerica-east1', 'us-central1', 'us-east1', 'us-east4', 'us-east5', 'us-south1', 'us-west1', 'us-west4']`

### OpenAIProvider

**Bases:** `Provider[AsyncOpenAI]`

Provider for OpenAI API.

#### Methods

##### \_\_init\_\_

```python
def __init__(openai_client: AsyncOpenAI) -> None
def __init__(
    base_url: str | None = None,
    api_key: str | None = None,
    openai_client: None = None,
    http_client: httpx.AsyncClient | None = None,
) -> None
```

Create a new OpenAI provider.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`base_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The base url for the OpenAI requests. If not provided, the `OPENAI_BASE_URL` environment variable will be used if available. Otherwise, defaults to OpenAI's base url.

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The API key to use for authentication, if not provided, the `OPENAI_API_KEY` environment variable will be used if available.

**`openai_client`** : `AsyncOpenAI` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing [`AsyncOpenAI`](https://github.com/openai/openai-python?tab=readme-ov-file#async-usage) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`.

**`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing `httpx.AsyncClient` to use for making HTTP requests.

### XaiProvider

**Bases:** `Provider[AsyncClient]`

Provider for xAI API (native xAI SDK).

#### Methods

##### \_\_init\_\_

```python
def __init__() -> None
def __init__(api_key: str) -> None
def __init__(xai_client: AsyncClient) -> None
```

Create a new xAI provider.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The API key to use for authentication, if not provided, the `XAI_API_KEY` environment variable will be used if available.

**`xai_client`** : `AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing `xai_sdk.AsyncClient` to use. This takes precedence over `api_key`.

### DeepSeekProvider

**Bases:** `Provider[AsyncOpenAI]`

Provider for DeepSeek API.

### BedrockModelProfile

**Bases:** [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile)

Profile for models used with BedrockModel.

ALL FIELDS MUST BE `bedrock_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS.

#### Attributes

##### bedrock\_thinking\_variant

Which thinking API shape to use for unified thinking translation.

-   `'anthropic'`: Uses `{'thinking': {'type': 'enabled', 'budget_tokens': N}}`
-   `'openai'`: Uses `{'reasoning_effort': 'low'|'medium'|'high'}`
-   `'qwen'`: Uses `{'reasoning_config': 'low'|'high'}`
-   `None`: No unified thinking support.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['anthropic', 'openai', 'qwen'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

### BedrockProvider

**Bases:** `Provider[BaseClient]`

Provider for AWS Bedrock.

#### Methods

##### \_\_init\_\_

```python
def __init__(bedrock_client: BaseClient) -> None
def __init__(
    api_key: str,
    base_url: str | None = None,
    region_name: str | None = None,
    profile_name: str | None = None,
    aws_read_timeout: float | None = None,
    aws_connect_timeout: float | None = None,
) -> None
def __init__(
    aws_access_key_id: str | None = None,
    aws_secret_access_key: str | None = None,
    aws_session_token: str | None = None,
    base_url: str | None = None,
    region_name: str | None = None,
    profile_name: str | None = None,
    aws_read_timeout: float | None = None,
    aws_connect_timeout: float | None = None,
) -> None
```

Initialize the Bedrock provider.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`bedrock_client`** : `BaseClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A boto3 client for Bedrock Runtime. If provided, other arguments are ignored.

**`aws_access_key_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The AWS access key ID. If not set, the `AWS_ACCESS_KEY_ID` environment variable will be used if available.

**`aws_secret_access_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The AWS secret access key. If not set, the `AWS_SECRET_ACCESS_KEY` environment variable will be used if available.

**`aws_session_token`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The AWS session token. If not set, the `AWS_SESSION_TOKEN` environment variable will be used if available.

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The API key for Bedrock client. Can be used instead of `aws_access_key_id`, `aws_secret_access_key`, and `aws_session_token`. If not set, the `AWS_BEARER_TOKEN_BEDROCK` environment variable will be used if available.

**`base_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The base URL for the Bedrock client.

**`region_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The AWS region name. If not set, the `AWS_DEFAULT_REGION` environment variable will be used if available.

**`profile_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The AWS profile name.

**`aws_read_timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The read timeout for Bedrock client.

**`aws_connect_timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The connect timeout for Bedrock client.

### bedrock\_amazon\_model\_profile

```python
def bedrock_amazon_model_profile(model_name: str) -> ModelProfile | None
```

Get the model profile for an Amazon model used via Bedrock.

#### Returns

[`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None)

### bedrock\_deepseek\_model\_profile

```python
def bedrock_deepseek_model_profile(model_name: str) -> ModelProfile | None
```

Get the model profile for a DeepSeek model used via Bedrock.

#### Returns

[`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None)

### remove\_bedrock\_geo\_prefix

```python
def remove_bedrock_geo_prefix(model_name: str) -> str
```

Remove inference geographic prefix from model ID if present.

Bedrock supports cross-region inference using geographic prefixes like 'us.', 'eu.', 'apac.', etc. This function strips those prefixes.

#### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str)

### GroqProvider

**Bases:** `Provider[AsyncGroq]`

Provider for Groq API.

#### Methods

##### \_\_init\_\_

```python
def __init__(groq_client: AsyncGroq | None = None) -> None
def __init__(
    api_key: str | None = None,
    base_url: str | None = None,
    http_client: httpx.AsyncClient | None = None,
) -> None
```

Create a new Groq provider.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The API key to use for authentication, if not provided, the `GROQ_API_KEY` environment variable will be used if available.

**`base_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The base url for the Groq requests. If not provided, the `GROQ_BASE_URL` environment variable will be used if available. Otherwise, defaults to Groq's base url.

**`groq_client`** : `AsyncGroq` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing [`AsyncGroq`](https://github.com/groq/groq-python?tab=readme-ov-file#async-usage) client to use. If provided, `api_key` and `http_client` must be `None`.

**`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing `AsyncClient` to use for making HTTP requests.

### groq\_moonshotai\_model\_profile

```python
def groq_moonshotai_model_profile(model_name: str) -> ModelProfile | None
```

Get the model profile for an MoonshotAI model used with the Groq provider.

#### Returns

[`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None)

### meta\_groq\_model\_profile

```python
def meta_groq_model_profile(model_name: str) -> ModelProfile | None
```

Get the model profile for a Meta model used with the Groq provider.

#### Returns

[`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None)

### AzureProvider

**Bases:** `Provider[AsyncOpenAI]`

Provider for Azure OpenAI API.

See [https://azure.microsoft.com/en-us/products/ai-foundry](https://azure.microsoft.com/en-us/products/ai-foundry) for more information.

#### Methods

##### \_\_init\_\_

```python
def __init__(openai_client: AsyncAzureOpenAI) -> None
def __init__(
    azure_endpoint: str | None = None,
    api_version: str | None = None,
    api_key: str | None = None,
    http_client: httpx.AsyncClient | None = None,
) -> None
```

Create a new Azure provider.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`azure_endpoint`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The Azure endpoint to use for authentication, if not provided, the `AZURE_OPENAI_ENDPOINT` environment variable will be used if available.

**`api_version`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The API version to use for authentication, if not provided, the `OPENAI_API_VERSION` environment variable will be used if available. Not required (and not sent) when `azure_endpoint` targets the [Azure OpenAI v1 GA API](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/api-version-lifecycle) (i.e. a path ending in `/v1`, such as `https://<resource>.openai.azure.com/openai/v1/`) or an Azure AI Foundry serverless model endpoint (`*.models.ai.azure.com`), both of which reject the `api-version` query parameter.

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The API key to use for authentication, if not provided, the `AZURE_OPENAI_API_KEY` environment variable will be used if available.

**`openai_client`** : `AsyncAzureOpenAI` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing [`AsyncAzureOpenAI`](https://github.com/openai/openai-python#microsoft-azure-openai) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`.

**`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing `httpx.AsyncClient` to use for making HTTP requests.

### CohereProvider

**Bases:** `Provider[AsyncClientV2]`

Provider for Cohere API.

#### Methods

##### \_\_init\_\_

```python
def __init__(
    api_key: str | None = None,
    cohere_client: AsyncClientV2 | None = None,
    http_client: httpx.AsyncClient | None = None,
) -> None
```

Create a new Cohere provider.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The API key to use for authentication, if not provided, the `CO_API_KEY` environment variable will be used if available.

**`cohere_client`** : `AsyncClientV2` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing [AsyncClientV2](https://github.com/cohere-ai/cohere-python) client to use. If provided, `api_key` and `http_client` must be `None`.

**`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing `httpx.AsyncClient` to use for making HTTP requests.

### VoyageAIProvider

**Bases:** `Provider[AsyncClient]`

Provider for VoyageAI API.

#### Methods

##### \_\_init\_\_

```python
def __init__(voyageai_client: AsyncClient) -> None
def __init__(api_key: str | None = None) -> None
```

Create a new VoyageAI provider.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The API key to use for authentication, if not provided, the `VOYAGE_API_KEY` environment variable will be used if available.

**`voyageai_client`** : `AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing [AsyncClient](https://github.com/voyage-ai/voyageai-python) client to use. If provided, `api_key` must be `None`.

### CerebrasProvider

**Bases:** `Provider[AsyncOpenAI]`

Provider for Cerebras API.

#### Methods

##### \_\_init\_\_

```python
def __init__() -> None
def __init__(api_key: str) -> None
def __init__(api_key: str, http_client: httpx.AsyncClient) -> None
def __init__(http_client: httpx.AsyncClient) -> None
def __init__(openai_client: AsyncOpenAI | None = None) -> None
```

Create a new Cerebras provider.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The API key to use for authentication, if not provided, the `CEREBRAS_API_KEY` environment variable will be used if available.

**`openai_client`** : `AsyncOpenAI` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing `AsyncOpenAI` client to use. If provided, `api_key` and `http_client` must be `None`.

**`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing `httpx.AsyncClient` to use for making HTTP requests.

### MistralProvider

**Bases:** `Provider[Mistral]`

Provider for Mistral API.

#### Methods

##### \_\_init\_\_

```python
def __init__(mistral_client: Mistral | None = None) -> None
def __init__(
    api_key: str | None = None,
    http_client: httpx.AsyncClient | None = None,
) -> None
```

Create a new Mistral provider.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The API key to use for authentication, if not provided, the `MISTRAL_API_KEY` environment variable will be used if available.

**`mistral_client`** : `Mistral` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing `Mistral` client to use, if provided, `api_key` and `http_client` must be `None`.

**`base_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The base url for the Mistral requests.

**`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing async client to use for making HTTP requests.

### FireworksProvider

**Bases:** `Provider[AsyncOpenAI]`

Provider for Fireworks AI API.

### GrokProvider

**Bases:** `Provider[AsyncOpenAI]`

Provider for Grok API (OpenAI-compatible interface).

### TogetherProvider

**Bases:** `Provider[AsyncOpenAI]`

Provider for Together AI API.

### HerokuProvider

**Bases:** `Provider[AsyncOpenAI]`

Provider for Heroku API.

### GitHubProvider

**Bases:** `Provider[AsyncOpenAI]`

Provider for GitHub Models API.

GitHub Models provides access to various AI models through an OpenAI-compatible API. See [https://docs.github.com/en/github-models](https://docs.github.com/en/github-models) for more information.

#### Methods

##### \_\_init\_\_

```python
def __init__() -> None
def __init__(api_key: str) -> None
def __init__(api_key: str, http_client: httpx.AsyncClient) -> None
def __init__(openai_client: AsyncOpenAI | None = None) -> None
```

Create a new GitHub Models provider.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The GitHub token to use for authentication. If not provided, the `GITHUB_API_KEY` environment variable will be used if available.

**`openai_client`** : `AsyncOpenAI` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing `AsyncOpenAI` client to use. If provided, `api_key` and `http_client` must be `None`.

**`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing `httpx.AsyncClient` to use for making HTTP requests.

### OpenRouterProvider

**Bases:** `Provider[AsyncOpenAI]`

Provider for OpenRouter API.

#### Methods

##### \_\_init\_\_

```python
def __init__(openai_client: AsyncOpenAI) -> None
def __init__(
    api_key: str | None = None,
    app_url: str | None = None,
    app_title: str | None = None,
    openai_client: None = None,
    http_client: httpx.AsyncClient | None = None,
) -> None
```

Configure the provider with either an API key or prebuilt client.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

OpenRouter API key. Falls back to `OPENROUTER_API_KEY` when omitted and required unless `openai_client` is provided.

**`app_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional url for app attribution. Falls back to `OPENROUTER_APP_URL` when omitted.

**`app_title`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional title for app attribution. Falls back to `OPENROUTER_APP_TITLE` when omitted.

**`openai_client`** : `AsyncOpenAI` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Existing `AsyncOpenAI` client to reuse instead of creating one internally.

**`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom `httpx.AsyncClient` to pass into the `AsyncOpenAI` constructor when building a client.

###### Raises

-   `UserError` -- If no API key is available and no `openai_client` is provided.

### VercelProvider

**Bases:** `Provider[AsyncOpenAI]`

Provider for Vercel AI Gateway API.

### HuggingFaceProvider

**Bases:** `Provider[AsyncInferenceClient]`

Provider for Hugging Face.

#### Methods

##### \_\_init\_\_

```python
def __init__(base_url: str, api_key: str | None = None) -> None
def __init__(provider_name: str, api_key: str | None = None) -> None
def __init__(hf_client: AsyncInferenceClient, api_key: str | None = None) -> None
def __init__(
    hf_client: AsyncInferenceClient,
    base_url: str,
    api_key: str | None = None,
) -> None
def __init__(
    hf_client: AsyncInferenceClient,
    provider_name: str,
    api_key: str | None = None,
) -> None
def __init__(api_key: str | None = None) -> None
```

Create a new Hugging Face provider.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`base_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The base url for the Hugging Face requests.

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The API key to use for authentication, if not provided, the `HF_TOKEN` environment variable will be used if available.

**`hf_client`** : `AsyncInferenceClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing [`AsyncInferenceClient`](https://huggingface.co/docs/huggingface_hub/en/package_reference/inference_client#huggingface_hub.AsyncInferenceClient) client to use. If not provided, a new instance will be created.

**`http_client`** : `AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

(currently ignored) An existing `httpx.AsyncClient` to use for making HTTP requests.

**`provider_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Name of the provider to use for inference. available providers can be found in the [HF Inference Providers documentation](https://huggingface.co/docs/inference-providers/index#partners). defaults to "auto", which will select the first available provider for the model, the first of the providers available for the model, sorted by the user's order in [https://hf.co/settings/inference-providers](https://hf.co/settings/inference-providers). If `base_url` is passed, then `provider_name` is not used.

### MoonshotAIProvider

**Bases:** `Provider[AsyncOpenAI]`

Provider for MoonshotAI platform (Kimi models).

### OllamaProvider

**Bases:** `Provider[AsyncOpenAI]`

Provider for local or remote Ollama API.

#### Methods

##### \_\_init\_\_

```python
def __init__(
    base_url: str | None = None,
    api_key: str | None = None,
    openai_client: AsyncOpenAI | None = None,
    http_client: httpx.AsyncClient | None = None,
) -> None
```

Create a new Ollama provider.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`base_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The base url for the Ollama requests. If not provided, the `OLLAMA_BASE_URL` environment variable will be used if available.

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The API key to use for authentication, if not provided, the `OLLAMA_API_KEY` environment variable will be used if available.

**`openai_client`** : `AsyncOpenAI` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing [`AsyncOpenAI`](https://github.com/openai/openai-python?tab=readme-ov-file#async-usage) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`.

**`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An existing `httpx.AsyncClient` to use for making HTTP requests.

### LiteLLMProvider

**Bases:** `Provider[AsyncOpenAI]`

Provider for LiteLLM API.

#### Methods

##### \_\_init\_\_

```python
def __init__(api_key: str | None = None, api_base: str | None = None) -> None
def __init__(
    api_key: str | None = None,
    api_base: str | None = None,
    http_client: AsyncHTTPClient,
) -> None
def __init__(openai_client: AsyncOpenAI) -> None
```

Initialize a LiteLLM provider.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

API key for the model provider. If None, LiteLLM will try to get it from environment variables.

**`api_base`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Base URL for the model provider. Use this for custom endpoints or self-hosted models.

**`openai_client`** : `AsyncOpenAI` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Pre-configured OpenAI client. If provided, other parameters are ignored.

**`http_client`** : `AsyncHTTPClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom HTTP client to use.

### NebiusProvider

**Bases:** `Provider[AsyncOpenAI]`

Provider for Nebius AI Studio API.

### OVHcloudProvider

**Bases:** `Provider[AsyncOpenAI]`

Provider for OVHcloud AI Endpoints.

### AlibabaProvider

**Bases:** `Provider[AsyncOpenAI]`

Provider for Alibaba Cloud Model Studio (DashScope) OpenAI-compatible API.

### SambaNovaProvider

**Bases:** `Provider[AsyncOpenAI]`

Provider for SambaNova AI models.

SambaNova uses an OpenAI-compatible API.

#### Attributes

##### name

Return the provider name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### base\_url

Return the base URL.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### client

Return the AsyncOpenAI client.

**Type:** `AsyncOpenAI`

#### Methods

##### model\_profile

`@staticmethod`

```python
def model_profile(model_name: str) -> ModelProfile | None
```

Get model profile for SambaNova models.

SambaNova serves models from multiple families including Meta Llama, DeepSeek, Qwen, and Mistral. Model profiles are matched based on model name prefixes.

###### Returns

[`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### \_\_init\_\_

```python
def __init__(
    api_key: str | None = None,
    base_url: str | None = None,
    openai_client: AsyncOpenAI | None = None,
    http_client: httpx.AsyncClient | None = None,
) -> None
```

Initialize SambaNova provider.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`api_key`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

SambaNova API key. If not provided, reads from SAMBANOVA\_API\_KEY env var.

**`base_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom API base URL. Defaults to [https://api.sambanova.ai/v1](https://api.sambanova.ai/v1)

**`openai_client`** : `AsyncOpenAI` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional pre-configured OpenAI client

**`http_client`** : `httpx.AsyncClient` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional custom httpx.AsyncClient for making HTTP requests

###### Raises

-   `UserError` -- If API key is not provided and SAMBANOVA\_API\_KEY env var is not set

---

# [pydantic_ai.result](https://pydantic.dev/docs/ai/api/pydantic-ai/result/)

# pydantic\_ai.result

### StreamedRunResult

**Bases:** `Generic[AgentDepsT, OutputDataT]`

Result of a streamed run that returns structured data via a tool call.

#### Attributes

##### is\_complete

Whether the stream has all been received.

This is set to `True` when one of [`stream_output`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_output), [`stream_text`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_text), [`stream_responses`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_responses) or [`get_output`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.get_output) completes.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `field(default=False, init=False)`

##### response

Return the current state of the response.

**Type:** `_messages.ModelResponse`

##### metadata

Metadata associated with this agent run, if configured.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### run\_id

The unique identifier for the agent run.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### conversation\_id

The unique identifier for the conversation this run belongs to.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### cancelled

Whether the stream has been cancelled via `cancel()`.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

#### Methods

##### all\_messages

```python
def all_messages(
    output_tool_return_content: str | None = None,
) -> list[_messages.ModelMessage]
```

Return the history of \_messages.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\] -- List of messages.

###### Parameters

**`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified.

##### all\_messages\_json

```python
def all_messages_json(output_tool_return_content: str | None = None) -> bytes
```

Return all messages from [`all_messages`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.all_messages) as JSON bytes.

###### Returns

[`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the messages.

###### Parameters

**`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified.

##### new\_messages

```python
def new_messages(
    output_tool_return_content: str | None = None,
) -> list[_messages.ModelMessage]
```

Return the messages produced during this run.

Messages provided via `message_history` and messages from older runs are excluded.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\] -- List of new messages.

###### Parameters

**`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified.

##### new\_messages\_json

```python
def new_messages_json(output_tool_return_content: str | None = None) -> bytes
```

Return new messages from [`new_messages`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.new_messages) as JSON bytes.

###### Returns

[`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the new messages.

###### Parameters

**`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified.

##### stream

`@async`

`@deprecated`

```python
def stream(debounce_by: float | None = 0.1) -> AsyncIterator[OutputDataT]
```

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`OutputDataT`\]

##### stream\_output

`@async`

```python
def stream_output(debounce_by: float | None = 0.1) -> AsyncIterator[OutputDataT]
```

Stream the output as an async iterable.

The pydantic validator for structured data will be called in [partial mode](https://docs.pydantic.dev/dev/concepts/experimental/#partial-validation) on each iteration.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`OutputDataT`\] -- An async iterable of the response data.

###### Parameters

**`debounce_by`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `0.1`

by how much (if at all) to debounce/group the output chunks by. `None` means no debouncing. Debouncing is particularly important for long structured outputs to reduce the overhead of performing validation as each token is received.

##### stream\_text

`@async`

```python
def stream_text(
    delta: bool = False,
    debounce_by: float | None = 0.1,
) -> AsyncIterator[str]
```

Stream the text result as an async iterable.

Note

Result validators will NOT be called on the text result if `delta=True`.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

###### Parameters

**`delta`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

if `True`, yield each chunk of text as it is received, if `False` (default), yield the full text up to the current point.

**`debounce_by`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `0.1`

by how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received.

##### stream\_structured

`@async`

`@deprecated`

```python
def stream_structured(
    debounce_by: float | None = 0.1,
) -> AsyncIterator[tuple[_messages.ModelResponse, bool]]
```

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[`_messages.ModelResponse`, [`bool`](https://docs.python.org/3/library/functions.html#bool)\]\]

##### stream\_responses

`@async`

```python
def stream_responses(
    debounce_by: float | None = 0.1,
) -> AsyncIterator[tuple[_messages.ModelResponse, bool]]
```

Stream the response as an async iterable of Structured LLM Messages.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[`_messages.ModelResponse`, [`bool`](https://docs.python.org/3/library/functions.html#bool)\]\] -- An async iterable of the structured response message and whether that is the last message.

###### Parameters

**`debounce_by`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `0.1`

by how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received.

##### get\_output

`@async`

```python
def get_output() -> OutputDataT
```

Stream the whole response, validate and return it.

###### Returns

`OutputDataT`

##### usage

```python
def usage() -> RunUsage
```

Return the usage of the whole run.

Note

This won't return the full usage until the stream is finished.

###### Returns

[`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage)

##### timestamp

```python
def timestamp() -> datetime
```

Get the timestamp of the response.

###### Returns

[`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime)

##### validate\_structured\_output

`@async`

`@deprecated`

```python
def validate_structured_output(
    message: _messages.ModelResponse,
    allow_partial: bool = False,
) -> OutputDataT
```

###### Returns

`OutputDataT`

##### validate\_response\_output

`@async`

```python
def validate_response_output(
    message: _messages.ModelResponse,
    allow_partial: bool = False,
) -> OutputDataT
```

Validate a structured result message.

###### Returns

`OutputDataT`

##### cancel

`@async`

```python
def cancel() -> None
```

Cancel the stream, stopping token generation and closing the underlying connection.

The interrupted response state is recorded in the message history so that `all_messages()` includes it.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

### StreamedRunResultSync

**Bases:** `Generic[AgentDepsT, OutputDataT]`

Synchronous wrapper for [`StreamedRunResult`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult) that only exposes sync methods.

#### Attributes

##### response

Return the current state of the response.

**Type:** `_messages.ModelResponse`

##### run\_id

The unique identifier for the agent run.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### conversation\_id

The unique identifier for the conversation this run belongs to.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### metadata

Metadata associated with this agent run, if configured.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### is\_complete

Whether the stream has all been received.

This is set to `True` when one of [`stream_output`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResultSync.stream_output), [`stream_text`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResultSync.stream_text), [`stream_responses`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResultSync.stream_responses) or [`get_output`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResultSync.get_output) completes.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

#### Methods

##### all\_messages

```python
def all_messages(
    output_tool_return_content: str | None = None,
) -> list[_messages.ModelMessage]
```

Return the history of messages.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\] -- List of messages.

###### Parameters

**`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified.

##### all\_messages\_json

```python
def all_messages_json(output_tool_return_content: str | None = None) -> bytes
```

Return all messages from [`all_messages`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResultSync.all_messages) as JSON bytes.

###### Returns

[`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the messages.

###### Parameters

**`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified.

##### new\_messages

```python
def new_messages(
    output_tool_return_content: str | None = None,
) -> list[_messages.ModelMessage]
```

Return the messages produced during this run.

Messages provided via `message_history` and messages from older runs are excluded.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\] -- List of new messages.

###### Parameters

**`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified.

##### new\_messages\_json

```python
def new_messages_json(output_tool_return_content: str | None = None) -> bytes
```

Return new messages from [`new_messages`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResultSync.new_messages) as JSON bytes.

###### Returns

[`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the new messages.

###### Parameters

**`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified.

##### stream\_output

```python
def stream_output(debounce_by: float | None = 0.1) -> Iterator[OutputDataT]
```

Stream the output as an iterable.

The pydantic validator for structured data will be called in [partial mode](https://docs.pydantic.dev/dev/concepts/experimental/#partial-validation) on each iteration.

###### Returns

[`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[`OutputDataT`\] -- An iterable of the response data.

###### Parameters

**`debounce_by`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `0.1`

by how much (if at all) to debounce/group the output chunks by. `None` means no debouncing. Debouncing is particularly important for long structured outputs to reduce the overhead of performing validation as each token is received.

##### stream\_text

```python
def stream_text(delta: bool = False, debounce_by: float | None = 0.1) -> Iterator[str]
```

Stream the text result as an iterable.

Note

Result validators will NOT be called on the text result if `delta=True`.

###### Returns

[`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

###### Parameters

**`delta`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

if `True`, yield each chunk of text as it is received, if `False` (default), yield the full text up to the current point.

**`debounce_by`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `0.1`

by how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received.

##### stream\_responses

```python
def stream_responses(
    debounce_by: float | None = 0.1,
) -> Iterator[tuple[_messages.ModelResponse, bool]]
```

Stream the response as an iterable of Structured LLM Messages.

###### Returns

[`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[`_messages.ModelResponse`, [`bool`](https://docs.python.org/3/library/functions.html#bool)\]\] -- An iterable of the structured response message and whether that is the last message.

###### Parameters

**`debounce_by`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `0.1`

by how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received.

##### get\_output

```python
def get_output() -> OutputDataT
```

Stream the whole response, validate and return it.

###### Returns

`OutputDataT`

##### usage

```python
def usage() -> RunUsage
```

Return the usage of the whole run.

Note

This won't return the full usage until the stream is finished.

###### Returns

[`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage)

##### timestamp

```python
def timestamp() -> datetime
```

Get the timestamp of the response.

###### Returns

[`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime)

##### validate\_response\_output

```python
def validate_response_output(
    message: _messages.ModelResponse,
    allow_partial: bool = False,
) -> OutputDataT
```

Validate a structured result message.

###### Returns

`OutputDataT`

---

# [pydantic_ai.retries](https://pydantic.dev/docs/ai/api/pydantic-ai/retries/)

# pydantic\_ai.retries

Retries utilities based on tenacity, especially for HTTP requests.

This module provides HTTP transport wrappers and wait strategies that integrate with the tenacity library to add retry capabilities to HTTP requests. The transports can be used with HTTP clients that support custom transports (such as httpx), while the wait strategies can be used with any tenacity retry decorator.

The module includes:

-   TenacityTransport: Synchronous HTTP transport with retry capabilities
-   AsyncTenacityTransport: Asynchronous HTTP transport with retry capabilities
-   wait\_retry\_after: Wait strategy that respects HTTP Retry-After headers

### RetryConfig

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

The configuration for tenacity-based retrying.

These are precisely the arguments to the tenacity `retry` decorator, and they are generally used internally by passing them to that decorator via `@retry(**config)` or similar.

All fields are optional, and if not provided, the default values from the `tenacity.retry` decorator will be used.

#### Attributes

##### sleep

A sleep strategy to use for sleeping between retries.

Tenacity's default for this argument is `tenacity.nap.sleep`.

**Type:** [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`int`](https://docs.python.org/3/library/functions.html#int) | [`float`](https://docs.python.org/3/library/functions.html#float)\], [`None`](https://docs.python.org/3/library/constants.html#None) | [`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]\]

##### stop

A stop strategy to determine when to stop retrying.

Tenacity's default for this argument is `tenacity.stop.stop_never`.

**Type:** `StopBaseT`

##### wait

A wait strategy to determine how long to wait between retries.

Tenacity's default for this argument is `tenacity.wait.wait_none`.

**Type:** `WaitBaseT`

##### retry

A retry strategy to determine which exceptions should trigger a retry.

Tenacity's default for this argument is `tenacity.retry.retry_if_exception_type()`.

**Type:** `SyncRetryBaseT` | `RetryBaseT`

##### before

A callable that is called before each retry attempt.

Tenacity's default for this argument is `tenacity.before.before_nothing`.

**Type:** [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`RetryCallState`\], [`None`](https://docs.python.org/3/library/constants.html#None) | [`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]\]

##### after

A callable that is called after each retry attempt.

Tenacity's default for this argument is `tenacity.after.after_nothing`.

**Type:** [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`RetryCallState`\], [`None`](https://docs.python.org/3/library/constants.html#None) | [`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]\]

##### before\_sleep

An optional callable that is called before sleeping between retries.

Tenacity's default for this argument is `None`.

**Type:** [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`RetryCallState`\], [`None`](https://docs.python.org/3/library/constants.html#None) | [`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### reraise

Whether to reraise the last exception if the retry attempts are exhausted, or raise a RetryError instead.

Tenacity's default for this argument is `False`.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### retry\_error\_cls

The exception class to raise when the retry attempts are exhausted and `reraise` is False.

Tenacity's default for this argument is `tenacity.RetryError`.

**Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[`RetryError`\]

##### retry\_error\_callback

An optional callable that is called when the retry attempts are exhausted and `reraise` is False.

Tenacity's default for this argument is `None`.

**Type:** [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`RetryCallState`\], [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) | [`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None)

### TenacityTransport

**Bases:** `BaseTransport`

Synchronous HTTP transport with tenacity-based retry functionality.

This transport wraps another BaseTransport and adds retry capabilities using the tenacity library. It can be configured to retry requests based on various conditions such as specific exception types, response status codes, or custom validation logic.

The transport works by intercepting HTTP requests and responses, allowing the tenacity controller to determine when and how to retry failed requests. The validate\_response function can be used to convert HTTP responses into exceptions that trigger retries.

#### Constructor Parameters

**`wrapped`** : `BaseTransport` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The underlying transport to wrap and add retry functionality to.

**`config`** : `RetryConfig`

The arguments to use for the tenacity `retry` decorator, including retry conditions, wait strategy, stop conditions, etc. See the tenacity docs for more info.

**`validate_response`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`Response`\], [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional callable that takes a Response and can raise an exception to be handled by the controller if the response should trigger a retry. Common use case is to raise exceptions for certain HTTP status codes. If None, no response validation is performed.

#### Methods

##### handle\_request

```python
def handle_request(request: Request) -> Response
```

Handle an HTTP request with retry logic.

###### Returns

`Response` -- The HTTP response.

###### Parameters

**`request`** : `Request`

The HTTP request to handle.

###### Raises

-   `RuntimeError` -- If the retry controller did not make any attempts.
-   `Exception` -- Any exception raised by the wrapped transport or validation function.

### AsyncTenacityTransport

**Bases:** `AsyncBaseTransport`

Asynchronous HTTP transport with tenacity-based retry functionality.

This transport wraps another AsyncBaseTransport and adds retry capabilities using the tenacity library. It can be configured to retry requests based on various conditions such as specific exception types, response status codes, or custom validation logic.

The transport works by intercepting HTTP requests and responses, allowing the tenacity controller to determine when and how to retry failed requests. The validate\_response function can be used to convert HTTP responses into exceptions that trigger retries.

#### Constructor Parameters

**`wrapped`** : `AsyncBaseTransport` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The underlying async transport to wrap and add retry functionality to.

**`config`** : `RetryConfig`

The arguments to use for the tenacity `retry` decorator, including retry conditions, wait strategy, stop conditions, etc. See the tenacity docs for more info.

**`validate_response`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`Response`\], [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional callable that takes a Response and can raise an exception to be handled by the controller if the response should trigger a retry. Common use case is to raise exceptions for certain HTTP status codes. If None, no response validation is performed.

#### Methods

##### handle\_async\_request

`@async`

```python
def handle_async_request(request: Request) -> Response
```

Handle an async HTTP request with retry logic.

###### Returns

`Response` -- The HTTP response.

###### Parameters

**`request`** : `Request`

The HTTP request to handle.

###### Raises

-   `RuntimeError` -- If the retry controller did not make any attempts.
-   `Exception` -- Any exception raised by the wrapped transport or validation function.

### wait\_retry\_after

```python
def wait_retry_after(
    fallback_strategy: Callable[[RetryCallState], float] | None = None,
    max_wait: float = 300,
) -> Callable[[RetryCallState], float]
```

Create a tenacity-compatible wait strategy that respects HTTP Retry-After headers.

This wait strategy checks if the exception contains an HTTPStatusError with a Retry-After header, and if so, waits for the time specified in the header. If no header is present or parsing fails, it falls back to the provided strategy.

The Retry-After header can be in two formats:

-   An integer representing seconds to wait
-   An HTTP date string representing when to retry

#### Returns

[`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`RetryCallState`\], [`float`](https://docs.python.org/3/library/functions.html#float)\] -- A wait function that can be used with tenacity retry decorators.

#### Parameters

**`fallback_strategy`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`RetryCallState`\], [`float`](https://docs.python.org/3/library/functions.html#float)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Wait strategy to use when no Retry-After header is present or parsing fails. Defaults to exponential backoff with max 60s.

**`max_wait`** : [`float`](https://docs.python.org/3/library/functions.html#float) _Default:_ `300`

Maximum time to wait in seconds, regardless of header value. Defaults to 300 (5 minutes).

---

# [pydantic_ai.run](https://pydantic.dev/docs/ai/api/pydantic-ai/run/)

# pydantic\_ai.run

### AgentRun

**Bases:** `Generic[AgentDepsT, OutputDataT]`

A stateful, async-iterable run of an [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent).

You generally obtain an `AgentRun` instance by calling `async with my_agent.iter(...) as agent_run:`.

Once you have an instance, you can use it to iterate through the run's nodes as they execute. When an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) is reached, the run finishes and [`result`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun.result) becomes available.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

async def main():
    nodes = []
    # Iterate through the run, recording each node along the way:
    async with agent.iter('What is the capital of France?') as agent_run:
        async for node in agent_run:
            nodes.append(node)
    print(nodes)
    '''
    [
        UserPromptNode(
            user_prompt='What is the capital of France?',
            instructions_functions=[],
            system_prompts=(),
            system_prompt_functions=[],
            system_prompt_dynamic_functions={},
        ),
        ModelRequestNode(
            request=ModelRequest(
                parts=[
                    UserPromptPart(
                        content='What is the capital of France?',
                        timestamp=datetime.datetime(...),
                    )
                ],
                timestamp=datetime.datetime(...),
                run_id='...',
                conversation_id='...',
            )
        ),
        CallToolsNode(
            model_response=ModelResponse(
                parts=[TextPart(content='The capital of France is Paris.')],
                usage=RequestUsage(input_tokens=56, output_tokens=7),
                model_name='gpt-5.2',
                timestamp=datetime.datetime(...),
                run_id='...',
                conversation_id='...',
            )
        ),
        End(data=FinalResult(output='The capital of France is Paris.')),
    ]
    '''
    print(agent_run.result.output)
    #> The capital of France is Paris.
```

You can also manually drive the iteration using the [`next`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun.next) method for more granular control.

#### Attributes

##### ctx

The current context of the agent run.

**Type:** `GraphRunContext`\[`_agent_graph.GraphAgentState`, `_agent_graph.GraphAgentDeps`\[`AgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]

##### next\_node

The next node that will be run in the agent graph.

This is the next node that will be used during async iteration, or if a node is not passed to `self.next(...)`.

**Type:** `_agent_graph.AgentNode`\[`AgentDepsT`, `OutputDataT`\] | `End`\[`FinalResult`\[`OutputDataT`\]\]

##### result

The final result of the run if it has ended, otherwise `None`.

Once the run returns an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) node, `result` is populated with an [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult).

**Type:** [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[`OutputDataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### metadata

Metadata associated with this agent run, if configured.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### run\_id

The unique identifier for the agent run.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### conversation\_id

The unique identifier for the conversation this run belongs to.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### all\_messages

```python
def all_messages() -> list[_messages.ModelMessage]
```

Return all messages for the run so far.

Messages from older runs are included.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\]

##### all\_messages\_json

```python
def all_messages_json(output_tool_return_content: str | None = None) -> bytes
```

Return all messages from [`all_messages`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun.all_messages) as JSON bytes.

###### Returns

[`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the messages.

##### new\_messages

```python
def new_messages() -> list[_messages.ModelMessage]
```

Return the messages produced during this run so far.

Messages provided via `message_history` and messages from older runs are excluded.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\]

##### new\_messages\_json

```python
def new_messages_json() -> bytes
```

Return new messages from [`new_messages`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun.new_messages) as JSON bytes.

###### Returns

[`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the new messages.

##### \_\_aiter\_\_

```python
def __aiter__(

) -> AsyncIterator[_agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]]
```

Provide async-iteration over the nodes in the agent run.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`_agent_graph.AgentNode`\[`AgentDepsT`, `OutputDataT`\] | `End`\[`FinalResult`\[`OutputDataT`\]\]\]

##### \_\_anext\_\_

`@async`

```python
def __anext__(

) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]
```

Advance to the next node automatically based on the last returned node.

Note: this uses the graph run's internal iteration which does NOT call node hooks (`before_node_run`, `wrap_node_run`, `after_node_run`, `on_node_run_error`). Use `next()` for capability-hooked iteration, or use `agent.run()` which drives via `next()` automatically.

###### Returns

`_agent_graph.AgentNode`\[`AgentDepsT`, `OutputDataT`\] | `End`\[`FinalResult`\[`OutputDataT`\]\]

##### next

`@async`

```python
def next(
    node: _agent_graph.AgentNode[AgentDepsT, OutputDataT],
) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]
```

Manually drive the agent run by passing in the node you want to run next.

This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The agent run should be stopped when you return an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) node.

Example:

```python
from pydantic_ai import Agent
from pydantic_graph import End

agent = Agent('openai:gpt-5.2')

async def main():
    async with agent.iter('What is the capital of France?') as agent_run:
        next_node = agent_run.next_node  # start with the first node
        nodes = [next_node]
        while not isinstance(next_node, End):
            next_node = await agent_run.next(next_node)
            nodes.append(next_node)
        # Once `next_node` is an End, we've finished:
        print(nodes)
        '''
        [
            UserPromptNode(
                user_prompt='What is the capital of France?',
                instructions_functions=[],
                system_prompts=(),
                system_prompt_functions=[],
                system_prompt_dynamic_functions={},
            ),
            ModelRequestNode(
                request=ModelRequest(
                    parts=[
                        UserPromptPart(
                            content='What is the capital of France?',
                            timestamp=datetime.datetime(...),
                        )
                    ],
                    timestamp=datetime.datetime(...),
                    run_id='...',
                    conversation_id='...',
                )
            ),
            CallToolsNode(
                model_response=ModelResponse(
                    parts=[TextPart(content='The capital of France is Paris.')],
                    usage=RequestUsage(input_tokens=56, output_tokens=7),
                    model_name='gpt-5.2',
                    timestamp=datetime.datetime(...),
                    run_id='...',
                    conversation_id='...',
                )
            ),
            End(data=FinalResult(output='The capital of France is Paris.')),
        ]
        '''
        print('Final result:', agent_run.result.output)
        #> Final result: The capital of France is Paris.
```

###### Returns

`_agent_graph.AgentNode`\[`AgentDepsT`, `OutputDataT`\] | `End`\[`FinalResult`\[`OutputDataT`\]\] -- The next node returned by the graph logic, or an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) node if `_agent_graph.AgentNode`\[`AgentDepsT`, `OutputDataT`\] | `End`\[`FinalResult`\[`OutputDataT`\]\] -- the run has completed.

###### Parameters

**`node`** : `_agent_graph.AgentNode`\[`AgentDepsT`, `OutputDataT`\]

The node to run next in the graph.

##### usage

```python
def usage() -> _usage.RunUsage
```

Get usage statistics for the run so far, including token usage, model requests, and so on.

###### Returns

`_usage.RunUsage`

### AgentRunResult

**Bases:** `Generic[OutputDataT]`

The final result of an agent run.

#### Attributes

##### output

The output data from the agent run.

**Type:** `OutputDataT`

##### response

Return the last response from the message history.

**Type:** `_messages.ModelResponse`

##### metadata

Metadata associated with this agent run, if configured.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### run\_id

The unique identifier for the agent run.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### conversation\_id

The unique identifier for the conversation this run belongs to.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### all\_messages

```python
def all_messages(
    output_tool_return_content: str | None = None,
) -> list[_messages.ModelMessage]
```

Return the history of \_messages.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\] -- List of messages.

###### Parameters

**`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified.

##### all\_messages\_json

```python
def all_messages_json(output_tool_return_content: str | None = None) -> bytes
```

Return all messages from [`all_messages`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.all_messages) as JSON bytes.

###### Returns

[`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the messages.

###### Parameters

**`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified.

##### new\_messages

```python
def new_messages(
    output_tool_return_content: str | None = None,
) -> list[_messages.ModelMessage]
```

Return the messages produced during this run.

Messages provided via `message_history` and messages from older runs are excluded.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\] -- List of new messages.

###### Parameters

**`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified.

##### new\_messages\_json

```python
def new_messages_json(output_tool_return_content: str | None = None) -> bytes
```

Return new messages from [`new_messages`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.new_messages) as JSON bytes.

###### Returns

[`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- JSON bytes representing the new messages.

###### Parameters

**`output_tool_return_content`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified.

##### usage

```python
def usage() -> _usage.RunUsage
```

Return the usage of the whole run.

###### Returns

`_usage.RunUsage`

##### timestamp

```python
def timestamp() -> datetime
```

Return the timestamp of last response.

###### Returns

[`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime)

### AgentRunResultEvent

**Bases:** `Generic[OutputDataT]`

An event indicating the agent run ended and containing the final result of the agent run.

#### Attributes

##### result

The result of the run.

**Type:** [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)\[`OutputDataT`\]

##### event\_kind

Event type identifier, used as a discriminator.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['agent\_run\_result'\] **Default:** `'agent_run_result'`

---

# [pydantic_ai.settings](https://pydantic.dev/docs/ai/api/pydantic-ai/settings/)

# pydantic\_ai.settings

### ModelSettings

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Settings to configure an LLM.

Includes only settings which apply to multiple models / model providers, though not all of these settings are supported by all models.

All types must be serializable using Pydantic.

#### Attributes

##### max\_tokens

The maximum number of tokens to generate before stopping.

Supported by:

-   Gemini
-   Anthropic
-   OpenAI
-   Groq
-   Cohere
-   Mistral
-   Bedrock
-   MCP Sampling
-   Outlines (all providers)
-   xAI

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

##### temperature

Amount of randomness injected into the response.

Use `temperature` closer to `0.0` for analytical / multiple choice, and closer to a model's maximum `temperature` for creative and generative tasks.

Note that even with `temperature` of `0.0`, the results will not be fully deterministic.

Supported by:

-   Gemini
-   Anthropic
-   OpenAI
-   Groq
-   Cohere
-   Mistral
-   Bedrock
-   Outlines (Transformers, LlamaCpp, SgLang, VLLMOffline)
-   xAI

**Type:** [`float`](https://docs.python.org/3/library/functions.html#float)

##### top\_p

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top\_p probability mass.

So 0.1 means only the tokens comprising the top 10% probability mass are considered.

You should either alter `temperature` or `top_p`, but not both.

Supported by:

-   Gemini
-   Anthropic
-   OpenAI
-   Groq
-   Cohere
-   Mistral
-   Bedrock
-   Outlines (Transformers, LlamaCpp, SgLang, VLLMOffline)
-   xAI

**Type:** [`float`](https://docs.python.org/3/library/functions.html#float)

##### timeout

Override the client-level default timeout for a request, in seconds.

Supported by:

-   Gemini
-   Anthropic
-   OpenAI
-   Groq
-   Mistral
-   xAI

**Type:** [`float`](https://docs.python.org/3/library/functions.html#float) | `Timeout`

##### parallel\_tool\_calls

Whether to allow parallel tool calls.

Supported by:

-   OpenAI (some models, not o1)
-   Groq
-   Anthropic
-   xAI

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### tool\_choice

Control which function tools the model can use.

See the [Tool Choice guide](/docs/ai/tools-toolsets/tools-advanced#tool-choice) for detailed documentation and examples.

-   `None` (default): Defaults to `'auto'` behavior
-   `'auto'`: All tools available, model decides whether to use them
-   `'none'`: Disables function tools; model responds with text only (output tools remain for structured output)
-   `'required'`: Forces tool use; excludes output tools so the agent cannot produce a final response when set statically
-   `list[str]`: Only specified tools; excludes output tools so the agent cannot produce a final response when set statically
-   [`ToolOrOutput`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ToolOrOutput): Specified function tools plus output tools/text/image

Note: setting `'required'` or `list[str]` _statically_ (via the `model_settings` argument of `Agent.run` or the agent's own `model_settings`) raises a `UserError`, because it would force a tool call on every step and prevent the agent from producing a final response. To vary `tool_choice` per step (e.g. force a tool on the first step only), return a callable from a capability's [`get_model_settings`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_model_settings) -- those values are trusted to adapt across steps. For single API calls without an agent loop, use [`pydantic_ai.direct.model_request`](/docs/ai/api/pydantic-ai/direct/#pydantic_ai.direct.model_request).

Supported by:

-   OpenAI
-   Anthropic (`'required'` and specific tools not supported with thinking enabled)
-   Google
-   Groq
-   Mistral
-   HuggingFace
-   Bedrock
-   xAI

**Type:** `ToolChoice`

##### seed

The random seed to use for the model, theoretically allowing for deterministic results.

Supported by:

-   OpenAI
-   Groq
-   Cohere
-   Mistral
-   Gemini
-   Outlines (LlamaCpp, VLLMOffline)

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

##### presence\_penalty

Penalize new tokens based on whether they have appeared in the text so far.

Supported by:

-   OpenAI
-   Groq
-   Cohere
-   Gemini
-   Mistral
-   Outlines (LlamaCpp, SgLang, VLLMOffline)
-   xAI

**Type:** [`float`](https://docs.python.org/3/library/functions.html#float)

##### frequency\_penalty

Penalize new tokens based on their existing frequency in the text so far.

Supported by:

-   OpenAI
-   Groq
-   Cohere
-   Gemini
-   Mistral
-   Outlines (LlamaCpp, SgLang, VLLMOffline)
-   xAI

**Type:** [`float`](https://docs.python.org/3/library/functions.html#float)

##### logit\_bias

Modify the likelihood of specified tokens appearing in the completion.

Supported by:

-   OpenAI
-   Groq
-   Outlines (Transformers, LlamaCpp, VLLMOffline)

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`int`](https://docs.python.org/3/library/functions.html#int)\]

##### stop\_sequences

Sequences that will cause the model to stop generating.

Supported by:

-   OpenAI
-   Anthropic
-   Bedrock
-   Mistral
-   Groq
-   Cohere
-   Google
-   xAI

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### extra\_headers

Extra headers to send to the model.

Supported by:

-   OpenAI
-   Anthropic
-   Gemini
-   Groq
-   xAI

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### thinking

Enable or configure thinking/reasoning for the model.

-   `True`: Enable thinking with the provider's default effort level.
-   `False`: Disable thinking (silently ignored if the model always thinks).
-   `'minimal'`/`'low'`/`'medium'`/`'high'`/`'xhigh'`: Enable thinking at a specific effort level.

When omitted, the model uses its default behavior (which may include thinking for reasoning models).

Provider-specific thinking settings (e.g., `anthropic_thinking`, `openai_reasoning_effort`) take precedence over this unified field.

Supported by:

-   Anthropic
-   OpenAI
-   Gemini
-   Groq
-   Bedrock
-   OpenRouter
-   Cerebras
-   xAI

**Type:** `ThinkingLevel`

##### service\_tier

The cross-provider service tier to use for the model request.

See `ServiceTier` for the value semantics and the per-provider mapping table. Provider-specific settings (`openai_service_tier`, `anthropic_service_tier`, `bedrock_service_tier`, `google_vertex_service_tier`) take precedence over this unified field when set.

Supported by:

-   OpenAI
-   Anthropic
-   Bedrock
-   Google (Gemini API and Vertex AI)

**Type:** `ServiceTier`

##### extra\_body

Extra body to send to the model.

Supported by:

-   OpenAI
-   Anthropic
-   Groq
-   Outlines (all providers)

**Type:** [`object`](https://docs.python.org/3/glossary.html#term-object)

### ToolOrOutput

Restricts function tools while keeping output tools and direct text/image output available.

Use this when you want to control which function tools the model can use in an agent run while still allowing the agent to complete with structured output, text, or images.

See the [Tool Choice guide](/docs/ai/tools-toolsets/tools-advanced#tool-choice) for examples.

#### Attributes

##### function\_tools

The names of function tools available to the model.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

---

# [pydantic_ai.tools](https://pydantic.dev/docs/ai/api/pydantic-ai/tools/)

# pydantic\_ai.tools

### RunContext

**Bases:** `Generic[RunContextAgentDepsT]`

Information about the current call.

#### Attributes

##### deps

Dependencies for the agent.

**Type:** `RunContextAgentDepsT`

##### model

The model used in this run.

**Type:** `Model`

##### usage

LLM usage associated with the run.

**Type:** [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage)

##### agent

The agent running this context, or `None` if not set.

**Type:** `AbstractAgent`\[`RunContextAgentDepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `field(default=None, repr=False)`

##### prompt

The original user prompt passed to the run.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`_messages.UserContent`\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### messages

Messages exchanged in the conversation so far.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`_messages.ModelMessage`\] **Default:** `field(default_factory=(list[_messages.ModelMessage]))`

##### validation\_context

Pydantic [validation context](https://docs.pydantic.dev/latest/concepts/validators/#validation-context) for tool args and run outputs.

**Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) **Default:** `None`

##### tracer

The tracer to use for tracing the run.

**Type:** `Tracer` **Default:** `field(default_factory=NoOpTracer)`

##### trace\_include\_content

Whether to include the content of the messages in the trace.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### instrumentation\_version

Instrumentation settings version, if instrumentation is enabled.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `DEFAULT_INSTRUMENTATION_VERSION`

##### retries

Number of retries for each tool so far.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`int`](https://docs.python.org/3/library/functions.html#int)\] **Default:** `field(default_factory=(dict[str, int]))`

##### tool\_call\_id

The ID of the tool call.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### tool\_name

Name of the tool being called.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### retry

Number of retries so far.

For tool calls, this is the number of retries of the specific tool. For output validation, this is the number of output validation retries.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0`

##### max\_retries

The maximum number of retries allowed.

For tool calls, this is the maximum retries for the specific tool. For output validation, this is the maximum output validation retries.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0`

##### run\_step

The current step in the run.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0`

##### tool\_call\_approved

Whether a tool call that required approval has now been approved.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### tool\_call\_metadata

Metadata from `DeferredToolResults.metadata[tool_call_id]`, available when `tool_call_approved=True`.

**Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) **Default:** `None`

##### partial\_output

Whether the output passed to an output validator is partial.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### run\_id

"Unique identifier for the agent run.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### conversation\_id

Unique identifier for the conversation this run belongs to.

A conversation spans potentially multiple agent runs that share message history. Resolved at the start of `Agent.run` (etc.) from the explicit `conversation_id` argument, the most recent `conversation_id` on `message_history`, or a fresh UUID7.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### metadata

Metadata associated with this agent run, if configured.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### model\_settings

The resolved model settings for the current run step.

Populated before each model request, after all model settings layers (model defaults, agent-level, capability, and run-level) have been merged. Available in model request hooks (`before_model_request`, `wrap_model_request`, `after_model_request`). Currently `None` in tool hooks, output validators, and during agent construction.

**Type:** [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### tool\_manager

The tool manager for the current run step.

Provides access to tool validation and execution, including tracing and capability hooks. Useful for toolsets that need to dispatch tool calls programmatically (e.g. code execution sandboxes).

Not available in `TemporalRunContext` -- it is not serializable across Temporal activity boundaries.

**Type:** `ToolManager`\[`RunContextAgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### last\_attempt

Whether this is the last attempt at running this tool before an error is raised.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

### DeferredToolRequests

Tool calls that require approval or external execution.

This can be used as an agent's `output_type` and will be used as the output of the agent run if the model called any deferred tools.

Results can be passed to the next agent run using a [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) object with the same tool call IDs.

See [deferred tools docs](/docs/ai/tools-toolsets/deferred-tools#deferred-tools) for more information.

#### Attributes

##### calls

Tool calls that require external execution.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart)\] **Default:** `field(default_factory=(list[ToolCallPart]))`

##### approvals

Tool calls that require human-in-the-loop approval.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart)\] **Default:** `field(default_factory=(list[ToolCallPart]))`

##### metadata

Metadata for deferred tool calls, keyed by `tool_call_id`.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] **Default:** `field(default_factory=(dict[str, dict[str, Any]]))`

#### Methods

##### build\_results

```python
def build_results(
    approvals: dict[str, bool | DeferredToolApprovalResult] | None = None,
    calls: dict[str, DeferredToolCallResult | Any] | None = None,
    metadata: dict[str, dict[str, Any]] | None = None,
    approve_all: bool = False,
) -> DeferredToolResults
```

Create a [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) for these requests.

###### Returns

[`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults)

###### Parameters

**`approvals`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`bool`](https://docs.python.org/3/library/functions.html#bool) | `DeferredToolApprovalResult`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Results for tool calls that required approval. Keys must match `tool_call_id`s in `self.approvals`.

**`calls`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `DeferredToolCallResult` | [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Results for tool calls that required external execution. Keys must match `tool_call_id`s in `self.calls`.

**`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Per-call metadata, keyed by `tool_call_id`.

**`approve_all`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

If `True`, every approval-requesting call not already listed in `approvals` is approved (with default `ToolApproved()`).

###### Raises

-   `ValueError` -- If a key in `approvals`/`calls` doesn't match a pending request of the appropriate kind.

##### remaining

```python
def remaining(results: DeferredToolResults) -> DeferredToolRequests | None
```

Return unresolved requests after applying results, or `None` if all resolved.

###### Returns

[`DeferredToolRequests`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolRequests) | [`None`](https://docs.python.org/3/library/constants.html#None)

### ToolApproved

Indicates that a tool call has been approved and that the tool function should be executed.

#### Attributes

##### override\_args

Optional tool call arguments to use instead of the original arguments.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

### ToolDenied

Indicates that a tool call has been denied and that a denial message should be returned to the model.

#### Attributes

##### message

The message to return to the model.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `'The tool call was denied.'`

### DeferredToolResults

Results for deferred tool calls from a previous run that required approval or external execution.

The tool call IDs need to match those from the [`DeferredToolRequests`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolRequests) output object from the previous run.

See [deferred tools docs](/docs/ai/tools-toolsets/deferred-tools#deferred-tools) for more information.

#### Attributes

##### calls

Map of tool call IDs to results for tool calls that required external execution.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `DeferredToolCallResult` | [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] **Default:** `field(default_factory=(dict[str, DeferredToolCallResult | Any]))`

##### approvals

Map of tool call IDs to results for tool calls that required human-in-the-loop approval.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`bool`](https://docs.python.org/3/library/functions.html#bool) | `DeferredToolApprovalResult`\] **Default:** `field(default_factory=(dict[str, bool | DeferredToolApprovalResult]))`

##### metadata

Metadata for deferred tool calls, keyed by `tool_call_id`. Each value will be available in the tool's RunContext as `tool_call_metadata`.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] **Default:** `field(default_factory=(dict[str, dict[str, Any]]))`

#### Methods

##### update

```python
def update(other: DeferredToolResults) -> None
```

Update this `DeferredToolResults` with entries from another, in-place.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

##### to\_tool\_call\_results

```python
def to_tool_call_results() -> dict[str, DeferredToolResult]
```

Convert results into the internal per-call format used by the tool-execution pipeline.

Normalizes `True`/`False` approvals to `ToolApproved`/`ToolDenied`, and wraps plain external-call values in `ToolReturn`.

###### Returns

[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `DeferredToolResult`\]

### Tool

**Bases:** `Generic[ToolAgentDepsT]`

A tool function for an agent.

#### Attributes

##### function\_schema

The base JSON schema for the tool's parameters.

This schema may be modified by the `prepare` function or by the Model class prior to including it in an API request.

**Type:** `_function_schema.FunctionSchema` **Default:** `function_schema or _function_schema.function_schema(function, schema_generator, tool_name=(self.name), takes_ctx=takes_ctx, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions)`

#### Methods

##### \_\_init\_\_

```python
def __init__(
    function: ToolFuncEither[ToolAgentDepsT, ToolParams],
    takes_ctx: bool | None = None,
    max_retries: int | None = None,
    name: str | None = None,
    description: str | None = None,
    prepare: ToolPrepareFunc[ToolAgentDepsT] | None = None,
    args_validator: ArgsValidatorFunc[ToolAgentDepsT, ToolParams] | None = None,
    docstring_format: DocstringFormat = 'auto',
    require_parameter_descriptions: bool = False,
    schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema,
    strict: bool | None = None,
    sequential: bool = False,
    requires_approval: bool = False,
    metadata: dict[str, Any] | None = None,
    timeout: float | None = None,
    defer_loading: bool = False,
    include_return_schema: bool | None = None,
    function_schema: _function_schema.FunctionSchema | None = None,
)
```

Create a new tool instance.

Example usage:

```python
from pydantic_ai import Agent, RunContext, Tool

async def my_tool(ctx: RunContext[int], x: int, y: int) -> str:
    return f'{ctx.deps} {x} {y}'

agent = Agent('test', tools=[Tool(my_tool)])
```

or with a custom prepare method:

```python

from pydantic_ai import Agent, RunContext, Tool
from pydantic_ai.tools import ToolDefinition

async def my_tool(ctx: RunContext[int], x: int, y: int) -> str:
    return f'{ctx.deps} {x} {y}'

async def prep_my_tool(
    ctx: RunContext[int], tool_def: ToolDefinition
) -> ToolDefinition | None:
    # only register the tool if `deps == 42`
    if ctx.deps == 42:
        return tool_def

agent = Agent('test', tools=[Tool(my_tool, prepare=prep_my_tool)])
```

###### Parameters

**`function`** : `ToolFuncEither`\[`ToolAgentDepsT`, `ToolParams`\]

The Python function to call as the tool.

**`takes_ctx`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether the function takes a [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) first argument, this is inferred if unset.

**`max_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Maximum number of retries allowed for this tool, set to the agent default if `None`.

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Name of the tool, inferred from the function if `None`.

**`description`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Description of the tool, inferred from the function if `None`.

**`prepare`** : `ToolPrepareFunc`\[`ToolAgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolPrepareFunc).

**`args_validator`** : `ArgsValidatorFunc`\[`ToolAgentDepsT`, `ToolParams`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

custom method to validate tool arguments after schema validation has passed, before execution. The validator receives the already-validated and type-converted parameters, with `RunContext` as the first argument. Should raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) on validation failure, return `None` on success. See [`ArgsValidatorFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ArgsValidatorFunc).

**`docstring_format`** : `DocstringFormat` _Default:_ `'auto'`

The format of the docstring, see [`DocstringFormat`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DocstringFormat). Defaults to `'auto'`, such that the format is inferred from the structure of the docstring.

**`require_parameter_descriptions`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

If True, raise an error if a parameter description is missing. Defaults to False.

**`schema_generator`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`GenerateJsonSchema`\] _Default:_ `GenerateToolJsonSchema`

The JSON schema generator class to use. Defaults to `GenerateToolJsonSchema`.

**`strict`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) for more info.

**`sequential`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether the function requires a sequential/serial execution environment. Defaults to False.

**`requires_approval`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) for more info.

**`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization.

**`timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Defaults to None (no timeout).

**`defer_loading`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether to hide this tool until it's discovered via tool search. Defaults to False. See [Tool Search](/docs/ai/tools-toolsets/tools-advanced#tool-search) for more info.

**`include_return_schema`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether to include the return schema in the tool definition sent to the model. If `None`, defaults to `False` unless the [`IncludeToolReturnSchemas`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.IncludeToolReturnSchemas) capability is used.

**`function_schema`** : `_function_schema.FunctionSchema` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The function schema to use for the tool. If not provided, it will be generated.

##### from\_schema

`@classmethod`

```python
def from_schema(
    cls,
    function: Callable[..., Any],
    name: str,
    description: str | None,
    json_schema: JsonSchemaValue,
    takes_ctx: bool = False,
    sequential: bool = False,
    args_validator: ArgsValidatorFunc[Any, ...] | None = None,
) -> Self
```

Creates a Pydantic tool from a function and a JSON schema.

###### Returns

[`Self`](https://docs.python.org/3/library/typing.html#typing.Self) -- A Pydantic tool that calls the function

###### Parameters

**`function`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[..., [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

The function to call. This will be called with keywords only. Schema validation of the arguments is skipped, but a custom `args_validator` will still run if provided.

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The unique name of the tool that clearly communicates its purpose

**`description`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

Used to tell the model how/when/why to use the tool. You can provide few-shot examples as a part of the description.

**`json_schema`** : `JsonSchemaValue`

The schema for the function arguments

**`takes_ctx`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

An optional boolean parameter indicating whether the function accepts the context object as an argument.

**`sequential`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether the function requires a sequential/serial execution environment. Defaults to False.

**`args_validator`** : `ArgsValidatorFunc`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any), ...\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

custom method to validate tool arguments after schema validation has passed, before execution. The validator receives the already-validated and type-converted parameters, with `RunContext` as the first argument. Should raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) on validation failure, return `None` on success. See [`ArgsValidatorFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ArgsValidatorFunc).

##### prepare\_tool\_def

`@async`

```python
def prepare_tool_def(ctx: RunContext[ToolAgentDepsT]) -> ToolDefinition | None
```

Get the tool definition.

By default, this method creates a tool definition, then either returns it, or calls `self.prepare` if it's set.

###### Returns

[`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) | [`None`](https://docs.python.org/3/library/constants.html#None) -- return a `ToolDefinition` or `None` if the tools should not be registered for this run.

### ToolDefinition

Definition of a tool passed to a model.

This is used for both function tools and output tools.

#### Attributes

##### name

The name of the tool.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### parameters\_json\_schema

The JSON schema for the tool's parameters.

**Type:** `ObjectJsonSchema` **Default:** `field(default_factory=(lambda: {'type': 'object', 'properties': {}}))`

##### description

The description of the tool.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### outer\_typed\_dict\_key

The key in the outer \[TypedDict\] that wraps an output tool.

This will only be set for output tools which don't have an `object` JSON schema.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### strict

Whether to enforce (vendor-specific) strict JSON schema validation for tool calls.

Setting this to `True` while using a supported model generally imposes some restrictions on the tool's JSON schema in exchange for guaranteeing the API responses strictly match that schema.

When `False`, the model may be free to generate other properties or types (depending on the vendor). When `None` (the default), the value will be inferred based on the compatibility of the parameters\_json\_schema.

Note: this is currently supported by OpenAI and Anthropic models.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### sequential

Whether this tool requires a sequential/serial execution environment.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### kind

The kind of tool:

-   `'function'`: a tool that will be executed by Pydantic AI during an agent run and has its result returned to the model
-   `'output'`: a tool that passes through an output value that ends the run
-   `'external'`: a tool whose result will be produced outside of the Pydantic AI agent run in which it was called, because it depends on an upstream service (or user) or could take longer to generate than it's reasonable to keep the agent process running. See the [tools documentation](/docs/ai/tools-toolsets/deferred-tools#deferred-tools) for more info.
-   `'unapproved'`: a tool that requires human-in-the-loop approval. See the [tools documentation](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) for more info.

**Type:** `ToolKind` **Default:** `field(default='function')`

##### metadata

Tool metadata that can be set by the toolset this tool came from. It is not sent to the model, but can be used for filtering and tool behavior customization.

For MCP tools, this contains the `meta`, `annotations`, and `output_schema` fields from the tool definition.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### timeout

Timeout in seconds for tool execution.

If the tool takes longer than this, a retry prompt is returned to the model. Defaults to None (no timeout).

**Type:** [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### defer\_loading

Whether this tool should be hidden from the model until discovered via tool search.

See [Tool Search](/docs/ai/tools-toolsets/tools-advanced#tool-search) for more info.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### prefer\_builtin

If set, this function tool is a local fallback for the builtin tool with the given unique\_id.

When the model supports the corresponding builtin tool natively, this function tool is removed from the request. When the model does not support the builtin, the builtin is removed and this function tool stays.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### return\_schema

The JSON schema for the tool's return value.

For models that natively support return schemas (e.g. Google Gemini), this is passed as a structured field in the API request. For other models, it is injected into the tool's description as JSON text. Only included when `include_return_schema` resolves to `True`.

**Type:** `ObjectJsonSchema` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### include\_return\_schema

Whether to include the return schema in the tool definition sent to the model.

When `True`, the `return_schema` will be preserved and sent to the model. When `False`, the `return_schema` will be cleared before sending. When `None` (default), defaults to `False` unless the [`IncludeToolReturnSchemas`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.IncludeToolReturnSchemas) capability is used.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### function\_signature

The function signature shape for this tool.

Lazily computed from `parameters_json_schema` and `return_schema` on first access. Name and description are not stored on the signature -- pass them at render time via `sig.render(body, name=td.name, description=td.description)`.

**Type:** `FunctionSignature`

##### defer

Whether calls to this tool will be deferred.

See the [tools documentation](/docs/ai/tools-toolsets/deferred-tools#deferred-tools) for more info.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

#### Methods

##### render\_signature

```python
def render_signature(body: str, kwargs: Any = {}) -> str
```

Render the function signature with this tool's name and description.

Convenience wrapper around `self.function_signature.render()` that supplies `name` and `description` from this tool definition.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str)

### matches\_tool\_selector

`@async`

```python
def matches_tool_selector(
    selector: ToolSelector[AgentDepsT],
    ctx: RunContext[AgentDepsT],
    tool_def: ToolDefinition,
) -> bool
```

Check whether a tool definition matches a [`ToolSelector`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolSelector).

#### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool) -- `True` if the tool matches the selector.

#### Parameters

**`selector`** : `ToolSelector`\[`AgentDepsT`\]

The selector to check against.

**`ctx`** : [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)\[`AgentDepsT`\]

The current run context.

**`tool_def`** : [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition)

The tool definition to test.

### AgentDepsT

Type variable for agent dependencies.

**Default:** `TypeVar('AgentDepsT', default=None, contravariant=True)`

### ToolParams

Retrieval function param spec.

**Default:** `ParamSpec('ToolParams', default=...)`

### SystemPromptFunc

A function that may or maybe not take `RunContext` as an argument, and may or may not be async.

Functions which return None are excluded from model requests.

Usage `SystemPromptFunc[AgentDepsT]`.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[RunContext[AgentDepsT]], str | None] | Callable[[RunContext[AgentDepsT]], Awaitable[str | None]] | Callable[[], str | None] | Callable[[], Awaitable[str | None]]`

### ToolFuncContext

A tool function that takes `RunContext` as the first argument.

Usage `ToolContextFunc[AgentDepsT, ToolParams]`.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[Concatenate[RunContext[AgentDepsT], ToolParams], Any]`

### ToolFuncPlain

A tool function that does not take `RunContext` as the first argument.

Usage `ToolPlainFunc[ToolParams]`.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[ToolParams, Any]`

### ToolFuncEither

Either kind of tool function.

This is just a union of [`ToolFuncContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolFuncContext) and [`ToolFuncPlain`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolFuncPlain).

Usage `ToolFuncEither[AgentDepsT, ToolParams]`.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `ToolFuncContext[AgentDepsT, ToolParams] | ToolFuncPlain[ToolParams]`

### ArgsValidatorFunc

A function that validates tool arguments before execution.

The validator receives the same typed parameters as the tool function, with [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as the first argument for dependency access.

Should raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) on validation failure.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[Concatenate[RunContext[AgentDepsT], ToolParams], Awaitable[None]] | Callable[Concatenate[RunContext[AgentDepsT], ToolParams], None]`

### ToolPrepareFunc

Definition of a function that can prepare a tool definition at call time. Both sync and async functions are accepted.

See [tool docs](/docs/ai/tools-toolsets/tools-advanced#tool-prepare) for more information.

Example -- here `only_if_42` is valid as a `ToolPrepareFunc`:

```python
from pydantic_ai import RunContext, Tool
from pydantic_ai.tools import ToolDefinition

def only_if_42(
    ctx: RunContext[int], tool_def: ToolDefinition
) -> ToolDefinition | None:
    if ctx.deps == 42:
        return tool_def

def hitchhiker(ctx: RunContext[int], answer: str) -> str:
    return f'{ctx.deps} {answer}'

hitchhiker = Tool(hitchhiker, prepare=only_if_42)
```

Usage `ToolPrepareFunc[AgentDepsT]`.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[RunContext[AgentDepsT], 'ToolDefinition'], Union[Awaitable['ToolDefinition | None'], 'ToolDefinition', None]]`

### ToolsPrepareFunc

Definition of a function that can prepare the tool definition of all tools for each step. This is useful if you want to customize the definition of multiple tools or you want to register a subset of tools for a given step. Both sync and async functions are accepted.

Example -- here `turn_on_strict_if_openai` is valid as a `ToolsPrepareFunc`:

```python
from dataclasses import replace

from pydantic_ai import Agent, RunContext
from pydantic_ai.tools import ToolDefinition


def turn_on_strict_if_openai(
    ctx: RunContext[None], tool_defs: list[ToolDefinition]
) -> list[ToolDefinition] | None:
    if ctx.model.system == 'openai':
        return [replace(tool_def, strict=True) for tool_def in tool_defs]
    return tool_defs

agent = Agent('openai:gpt-5.2', prepare_tools=turn_on_strict_if_openai)
```

Usage `ToolsPrepareFunc[AgentDepsT]`.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[RunContext[AgentDepsT], list['ToolDefinition']], Awaitable['list[ToolDefinition] | None'] | list['ToolDefinition'] | None]`

### ToolSelectorFunc

A callable that decides whether a tool matches a selection criterion.

Receives the run context and a tool definition, returns `True` if the tool is selected. Both sync and async functions are accepted.

Usage `ToolSelectorFunc[AgentDepsT]`.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[RunContext[AgentDepsT], 'ToolDefinition'], bool | Awaitable[bool]]`

### ToolSelector

Specifies which tools a capability or toolset wrapper should apply to.

-   `'all'`: matches every tool (default for most capabilities).
-   `Sequence[str]`: matches tools whose names are in the sequence.
-   `dict[str, Any]`: matches tools whose [`metadata`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition.metadata) contains all the specified key-value pairs (deep inclusion check -- nested dicts are compared recursively, and the tool's metadata may have additional keys).
-   `Callable[[RunContext, ToolDefinition], bool | Awaitable[bool]]`: custom sync or async predicate.

The first three forms are serializable for use in agent specs (YAML/JSON).

Usage `ToolSelector[AgentDepsT]`.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Literal['all'] | Sequence[str] | dict[str, Any] | ToolSelectorFunc[AgentDepsT]`

### BuiltinToolFunc

Definition of a function that can prepare a builtin tool at call time.

This is useful if you want to customize the builtin tool based on the run context (e.g. user dependencies), or omit it completely from a step.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[RunContext[AgentDepsT]], Awaitable[AbstractBuiltinTool | None] | AbstractBuiltinTool | None]`

### AgentBuiltinTool

A builtin tool or a function that dynamically produces one.

This is a convenience alias for `AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]`.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]`

### DocstringFormat

Supported docstring formats.

-   `'google'` -- [Google-style](https://google.github.io/styleguide/pyguide.html#381-docstrings) docstrings.
-   `'numpy'` -- [Numpy-style](https://numpydoc.readthedocs.io/en/latest/format.html) docstrings.
-   `'sphinx'` -- [Sphinx-style](https://sphinx-rtd-tutorial.readthedocs.io/en/latest/docstrings.html#the-sphinx-docstring-format) docstrings.
-   `'auto'` -- Automatically infer the format based on the structure of the docstring.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Literal['google', 'numpy', 'sphinx', 'auto']`

### DeferredToolApprovalResult

Result for a tool call that required human-in-the-loop approval.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Annotated[ToolApproved | ToolDenied, Discriminator('kind')]`

### DeferredToolCallResult

Result for a tool call that required external execution.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Annotated[Annotated[ToolReturn, Tag('tool-return')] | Annotated[ModelRetry, Tag('model-retry')] | Annotated[RetryPromptPart, Tag('retry-prompt')], Discriminator(_deferred_tool_call_result_discriminator)]`

### DeferredToolResult

Result for a tool call that required approval or external execution.

**Default:** `DeferredToolApprovalResult | DeferredToolCallResult`

### ToolAgentDepsT

Type variable for agent dependencies for a tool.

**Default:** `TypeVar('ToolAgentDepsT', default=object, contravariant=True)`

### ObjectJsonSchema

Type representing JSON schema of an object, e.g. where `"type": "object"`.

This type is used to define tools parameters (aka arguments) in [ToolDefinition](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition).

With PEP-728 this should be a TypedDict with `type: Literal['object']`, and `extra_parts=Any`

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `dict[str, Any]`

### ToolKind

Kind of tool.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Literal['function', 'output', 'external', 'unapproved']`

---

# [pydantic_ai.toolsets](https://pydantic.dev/docs/ai/api/pydantic-ai/toolsets/)

# pydantic\_ai.toolsets

### AbstractToolset

**Bases:** `ABC`, `Generic[AgentDepsT]`

A toolset is a collection of tools that can be used by an agent.

It is responsible for:

-   Listing the tools it contains
-   Validating the arguments of the tools
-   Calling the tools

See [toolset docs](/docs/ai/tools-toolsets/toolsets) for more information.

#### Attributes

##### id

An ID for the toolset that is unique among all toolsets registered with the same agent.

If you're implementing a concrete implementation that users can instantiate more than once, you should let them optionally pass a custom ID to the constructor and return that here.

A toolset needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the toolset's activities within the workflow.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### label

The name of the toolset for use in error messages.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### tool\_name\_conflict\_hint

A hint for how to avoid name conflicts with other toolsets for use in error messages.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### for\_run

`@async`

```python
def for_run(ctx: RunContext[AgentDepsT]) -> AbstractToolset[AgentDepsT]
```

Return the toolset to use for this agent run.

Called once per run, before `__aenter__`. Override this to return a fresh instance for per-run state isolation. Default: return `self` (shared across runs).

###### Returns

[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]

##### for\_run\_step

`@async`

```python
def for_run_step(ctx: RunContext[AgentDepsT]) -> AbstractToolset[AgentDepsT]
```

Return the toolset to use for this run step.

Called at the start of each run step. Override this to return a modified instance for per-step state transitions. If returning a new instance, you are responsible for managing any lifecycle transitions (exiting old inner toolsets, entering new ones). Default: return `self` (no per-step changes).

###### Returns

[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]

##### \_\_aenter\_\_

`@async`

```python
def __aenter__() -> Self
```

Enter the toolset context.

This is where you can set up network connections in a concrete implementation.

###### Returns

[`Self`](https://docs.python.org/3/library/typing.html#typing.Self)

##### \_\_aexit\_\_

`@async`

```python
def __aexit__(args: Any = ()) -> bool | None
```

Exit the toolset context.

This is where you can tear down network connections in a concrete implementation.

###### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### get\_instructions

`@async`

```python
def get_instructions(
    ctx: RunContext[AgentDepsT],
) -> str | InstructionPart | Sequence[str | InstructionPart] | None
```

Return instructions for how to use this toolset's tools.

Override this method to provide instructions that help the agent understand how to use the tools in this toolset effectively.

Simple implementations can return a plain `str`; advanced implementations can return [`InstructionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.InstructionPart) objects to indicate whether each instruction block is static or dynamic for caching purposes.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`InstructionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.InstructionPart) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`InstructionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.InstructionPart)\] | [`None`](https://docs.python.org/3/library/constants.html#None) -- Instruction string, `InstructionPart`, list of either, or `None`. [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`InstructionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.InstructionPart) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`InstructionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.InstructionPart)\] | [`None`](https://docs.python.org/3/library/constants.html#None) -- Plain `str` values are treated as dynamic instructions by default.

###### Parameters

**`ctx`** : [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)\[`AgentDepsT`\]

The run context for this agent run.

##### get\_tools

`@abstractmethod`

`@async`

```python
def get_tools(ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]
```

The tools that are available in this toolset.

###### Returns

[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), `ToolsetTool`\[`AgentDepsT`\]\]

##### call\_tool

`@abstractmethod`

`@async`

```python
def call_tool(
    name: str,
    tool_args: dict[str, Any],
    ctx: RunContext[AgentDepsT],
    tool: ToolsetTool[AgentDepsT],
) -> Any
```

Call a tool with the given arguments.

###### Returns

[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

###### Parameters

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The name of the tool to call.

**`tool_args`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

The arguments to pass to the tool.

**`ctx`** : [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)\[`AgentDepsT`\]

The run context.

**`tool`** : `ToolsetTool`\[`AgentDepsT`\]

The tool definition returned by [`get_tools`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.get_tools) that was called.

##### apply

```python
def apply(visitor: Callable[[AbstractToolset[AgentDepsT]], None]) -> None
```

Run a visitor function on all "leaf" toolsets (i.e. those that implement their own tool listing and calling).

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

##### visit\_and\_replace

```python
def visit_and_replace(
    visitor: Callable[[AbstractToolset[AgentDepsT]], AbstractToolset[AgentDepsT]],
) -> AbstractToolset[AgentDepsT]
```

Run a visitor function on all "leaf" toolsets (i.e. those that implement their own tool listing and calling) and replace them in the hierarchy with the result of the function.

###### Returns

[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]

##### filtered

```python
def filtered(
    filter_func: Callable[[RunContext[AgentDepsT], ToolDefinition], bool | Awaitable[bool]],
) -> FilteredToolset[AgentDepsT]
```

Returns a new toolset that filters this toolset's tools using a filter function that takes the agent context and the tool definition.

See [toolset docs](/docs/ai/tools-toolsets/toolsets#filtering-tools) for more information.

###### Returns

[`FilteredToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FilteredToolset)\[`AgentDepsT`\]

##### prefixed

```python
def prefixed(prefix: str) -> PrefixedToolset[AgentDepsT]
```

Returns a new toolset that prefixes the names of this toolset's tools.

See [toolset docs](/docs/ai/tools-toolsets/toolsets#prefixing-tool-names) for more information.

###### Returns

[`PrefixedToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.PrefixedToolset)\[`AgentDepsT`\]

##### prepared

```python
def prepared(prepare_func: ToolsPrepareFunc[AgentDepsT]) -> PreparedToolset[AgentDepsT]
```

Returns a new toolset that prepares this toolset's tools using a prepare function that takes the agent context and the original tool definitions.

See [toolset docs](/docs/ai/tools-toolsets/toolsets#preparing-tool-definitions) for more information.

###### Returns

[`PreparedToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.PreparedToolset)\[`AgentDepsT`\]

##### renamed

```python
def renamed(name_map: dict[str, str]) -> RenamedToolset[AgentDepsT]
```

Returns a new toolset that renames this toolset's tools using a dictionary mapping new names to original names.

See [toolset docs](/docs/ai/tools-toolsets/toolsets#renaming-tools) for more information.

###### Returns

[`RenamedToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.RenamedToolset)\[`AgentDepsT`\]

##### approval\_required

```python
def approval_required(
    approval_required_func: Callable[[RunContext[AgentDepsT], ToolDefinition, dict[str, Any]], bool] = lambda ctx, tool_def, tool_args: True,
) -> ApprovalRequiredToolset[AgentDepsT]
```

Returns a new toolset that requires (some) calls to tools it contains to be approved.

See [toolset docs](/docs/ai/tools-toolsets/toolsets#requiring-tool-approval) for more information.

###### Returns

[`ApprovalRequiredToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.ApprovalRequiredToolset)\[`AgentDepsT`\]

##### defer\_loading

```python
def defer_loading(
    tool_names: Sequence[str] | None = None,
) -> DeferredLoadingToolset[AgentDepsT]
```

Returns a new toolset that marks tools for deferred loading, hiding them until discovered via tool search.

See [toolset docs](/docs/ai/tools-toolsets/toolsets#deferred-loading) for more information.

###### Returns

[`DeferredLoadingToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.DeferredLoadingToolset)\[`AgentDepsT`\]

###### Parameters

**`tool_names`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional sequence of tool names to mark for deferred loading. If `None`, all tools are marked for deferred loading.

##### include\_return\_schemas

```python
def include_return_schemas() -> IncludeReturnSchemasToolset[AgentDepsT]
```

Returns a new toolset that sets `include_return_schema=True` on all tools.

This causes the model to receive return type information for the tools in this toolset. For models that natively support return schemas (e.g. Google Gemini), the schema is passed as a structured field. For other models, it is injected into the tool description as JSON text.

This is the toolset-level equivalent of the [`IncludeToolReturnSchemas`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.IncludeToolReturnSchemas) capability, which can be used to enable return schemas across all toolsets or a subset matched by a [`ToolSelector`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolSelector).

###### Returns

[`IncludeReturnSchemasToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.IncludeReturnSchemasToolset)\[`AgentDepsT`\]

##### with\_metadata

```python
def with_metadata(metadata: Any = {}) -> SetMetadataToolset[AgentDepsT]
```

Returns a new toolset that merges the given metadata onto all tools.

###### Returns

[`SetMetadataToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.SetMetadataToolset)\[`AgentDepsT`\]

### CombinedToolset

**Bases:** `AbstractToolset[AgentDepsT]`

A toolset that combines multiple toolsets.

See [toolset docs](/docs/ai/tools-toolsets/toolsets#combining-toolsets) for more information.

### ExternalToolset

**Bases:** `AbstractToolset[AgentDepsT]`

A toolset that holds tools whose results will be produced outside of the Pydantic AI agent run in which they were called.

See [toolset docs](/docs/ai/tools-toolsets/toolsets#external-toolset) for more information.

### ApprovalRequiredToolset

**Bases:** `WrapperToolset[AgentDepsT]`

A toolset that requires (some) calls to tools it contains to be approved.

See [toolset docs](/docs/ai/tools-toolsets/toolsets#requiring-tool-approval) for more information.

### FilteredToolset

**Bases:** `WrapperToolset[AgentDepsT]`

A toolset that filters the tools it contains using a filter function that takes the agent context and the tool definition.

Both sync and async filter functions are accepted.

See [toolset docs](/docs/ai/tools-toolsets/toolsets#filtering-tools) for more information.

### FunctionToolset

**Bases:** `AbstractToolset[AgentDepsT]`

A toolset that lets Python functions be used as tools.

See [toolset docs](/docs/ai/tools-toolsets/toolsets#function-toolset) for more information.

#### Methods

##### \_\_init\_\_

```python
def __init__(
    tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = [],
    max_retries: int | None = None,
    timeout: float | None = None,
    docstring_format: DocstringFormat = 'auto',
    require_parameter_descriptions: bool = False,
    schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema,
    strict: bool | None = None,
    sequential: bool = False,
    requires_approval: bool = False,
    metadata: dict[str, Any] | None = None,
    defer_loading: bool = False,
    include_return_schema: bool | None = None,
    id: str | None = None,
    instructions: str | SystemPromptFunc[AgentDepsT] | Sequence[str | SystemPromptFunc[AgentDepsT]] | None = None,
)
```

Build a new function toolset.

###### Parameters

**`tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[`AgentDepsT`\] | `ToolFuncEither`\[`AgentDepsT`, ...\]\] _Default:_ `[]`

The tools to add to the toolset.

**`max_retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The maximum number of retries for each tool during a run. If `None`, inherits the agent's default retry count at runtime. Applies to all tools, unless overridden when adding a tool.

**`timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Timeout in seconds for tool execution. If a tool takes longer than this, a retry prompt is returned to the model. Individual tools can override this with their own timeout. Defaults to None (no timeout).

**`docstring_format`** : `DocstringFormat` _Default:_ `'auto'`

Format of tool docstring, see [`DocstringFormat`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DocstringFormat). Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. Applies to all tools, unless overridden when adding a tool.

**`require_parameter_descriptions`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

If True, raise an error if a parameter description is missing. Defaults to False. Applies to all tools, unless overridden when adding a tool.

**`schema_generator`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`GenerateJsonSchema`\] _Default:_ `GenerateToolJsonSchema`

The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`. Applies to all tools, unless overridden when adding a tool.

**`strict`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) for more info.

**`sequential`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether the function requires a sequential/serial execution environment. Defaults to False. Applies to all tools, unless overridden when adding a tool.

**`requires_approval`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) for more info. Applies to all tools, unless overridden when adding a tool.

**`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. Applies to all tools, unless overridden when adding a tool, which will be merged with the toolset's metadata.

**`defer_loading`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether to hide tools from the model until discovered via tool search. Defaults to False. See [Tool Search](/docs/ai/tools-toolsets/tools-advanced#tool-search) for more info. Applies to all tools, unless overridden when adding a tool.

**`include_return_schema`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether to include return schemas in tool definitions sent to the model. If `None`, defaults to `False` unless the [`IncludeToolReturnSchemas`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.IncludeToolReturnSchemas) capability is used. Applies to all tools, unless overridden when adding a tool.

**`id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An optional unique ID for the toolset. A toolset needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the toolset's activities within the workflow.

**`instructions`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | `SystemPromptFunc`\[`AgentDepsT`\] | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str) | `SystemPromptFunc`\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Instructions for this toolset that are automatically included in the model request. Can be a string, a function (sync or async, with or without `RunContext`), or a sequence of these.

##### tool

```python
def tool(
    func: ToolFuncContext[AgentDepsT, ToolParams],
) -> ToolFuncContext[AgentDepsT, ToolParams]
def tool(
    name: str | None = None,
    description: str | None = None,
    retries: int | None = None,
    prepare: ToolPrepareFunc[AgentDepsT] | None = None,
    args_validator: ArgsValidatorFunc[AgentDepsT, ToolParams] | None = None,
    docstring_format: DocstringFormat | None = None,
    require_parameter_descriptions: bool | None = None,
    schema_generator: type[GenerateJsonSchema] | None = None,
    strict: bool | None = None,
    sequential: bool | None = None,
    requires_approval: bool | None = None,
    metadata: dict[str, Any] | None = None,
    timeout: float | None = None,
    defer_loading: bool | None = None,
    include_return_schema: bool | None = None,
) -> Callable[[ToolFuncContext[AgentDepsT, ToolParams]], ToolFuncContext[AgentDepsT, ToolParams]]
```

Decorator to register a tool function which takes [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as its first argument.

Can decorate a sync or async functions.

The docstring is inspected to extract both the tool description and description of each parameter, [learn more](/docs/ai/tools-toolsets/tools#function-tools-and-schema).

We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@toolset.tool` is obscured.

Example:

```python
from pydantic_ai import Agent, FunctionToolset, RunContext

toolset = FunctionToolset()

@toolset.tool
def foobar(ctx: RunContext[int], x: int) -> int:
    return ctx.deps + x

@toolset.tool(retries=2)
async def spam(ctx: RunContext[str], y: float) -> float:
    return ctx.deps + y

agent = Agent('test', toolsets=[toolset], deps_type=int)
result = agent.run_sync('foobar', deps=1)
print(result.output)
#> {"foobar":1,"spam":1.0}
```

###### Returns

[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

###### Parameters

**`func`** : `ToolFuncContext`\[`AgentDepsT`, `ToolParams`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The tool function to register.

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The name of the tool, defaults to the function name.

**`description`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The description of the tool,defaults to the function docstring.

**`retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1.

**`prepare`** : `ToolPrepareFunc`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolPrepareFunc).

**`args_validator`** : `ArgsValidatorFunc`\[`AgentDepsT`, `ToolParams`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

custom method to validate tool arguments after schema validation has passed, before execution. The validator receives the already-validated and type-converted parameters, with `RunContext` as the first argument. Should raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) on validation failure, return `None` on success. See [`ArgsValidatorFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ArgsValidatorFunc).

**`docstring_format`** : `DocstringFormat` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The format of the docstring, see [`DocstringFormat`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DocstringFormat). If `None`, the default value is determined by the toolset.

**`require_parameter_descriptions`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

If True, raise an error if a parameter description is missing. If `None`, the default value is determined by the toolset.

**`schema_generator`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`GenerateJsonSchema`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The JSON schema generator class to use for this tool. If `None`, the default value is determined by the toolset.

**`strict`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) for more info. If `None`, the default value is determined by the toolset.

**`sequential`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether the function requires a sequential/serial execution environment. Defaults to False. If `None`, the default value is determined by the toolset.

**`requires_approval`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) for more info. If `None`, the default value is determined by the toolset.

**`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. If `None`, the default value is determined by the toolset. If provided, it will be merged with the toolset's metadata.

**`timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Defaults to None (no timeout).

**`defer_loading`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether to hide this tool until it's discovered via tool search. See [Tool Search](/docs/ai/tools-toolsets/tools-advanced#tool-search) for more info. If `None`, the default value is determined by the toolset.

**`include_return_schema`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether to include the return schema in the tool definition sent to the model. If `None`, the default value is determined by the toolset.

##### tool\_plain

```python
def tool_plain(func: ToolFuncPlain[ToolParams]) -> ToolFuncPlain[ToolParams]
def tool_plain(
    name: str | None = None,
    description: str | None = None,
    retries: int | None = None,
    prepare: ToolPrepareFunc[AgentDepsT] | None = None,
    args_validator: ArgsValidatorFunc[AgentDepsT, ToolParams] | None = None,
    docstring_format: DocstringFormat | None = None,
    require_parameter_descriptions: bool | None = None,
    schema_generator: type[GenerateJsonSchema] | None = None,
    strict: bool | None = None,
    sequential: bool | None = None,
    requires_approval: bool | None = None,
    metadata: dict[str, Any] | None = None,
    timeout: float | None = None,
    defer_loading: bool | None = None,
    include_return_schema: bool | None = None,
) -> Callable[[ToolFuncPlain[ToolParams]], ToolFuncPlain[ToolParams]]
```

Decorator to register a tool function which DOES NOT take `RunContext` as an argument.

Can decorate a sync or async functions.

The docstring is inspected to extract both the tool description and description of each parameter, [learn more](/docs/ai/tools-toolsets/tools#function-tools-and-schema).

We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@toolset.tool_plain` is obscured.

Example:

```python
from pydantic_ai import Agent, FunctionToolset

toolset = FunctionToolset()

@toolset.tool_plain
def foobar(x: int) -> int:
    return x + 1

@toolset.tool_plain(retries=2)
async def spam(y: float) -> float:
    return y * 2.0

agent = Agent('test', toolsets=[toolset])
result = agent.run_sync('foobar')
print(result.output)
#> {"foobar":1,"spam":0.0}
```

###### Returns

[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

###### Parameters

**`func`** : `ToolFuncPlain`\[`ToolParams`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The tool function to register.

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The name of the tool, defaults to the function name.

**`description`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The description of the tool, defaults to the function docstring.

**`retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The number of retries to allow for this tool, defaults to the toolset's default retries, which defaults to the agent's default.

**`prepare`** : `ToolPrepareFunc`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolPrepareFunc).

**`args_validator`** : `ArgsValidatorFunc`\[`AgentDepsT`, `ToolParams`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

custom method to validate tool arguments after schema validation has passed, before execution. The validator receives the already-validated and type-converted parameters, with [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as the first argument -- even though the tool function itself does not take `RunContext` when using `tool_plain`. Should raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) on validation failure, return `None` on success. See [`ArgsValidatorFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ArgsValidatorFunc).

**`docstring_format`** : `DocstringFormat` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The format of the docstring, see [`DocstringFormat`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DocstringFormat). If `None`, the default value is determined by the toolset.

**`require_parameter_descriptions`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

If True, raise an error if a parameter description is missing. If `None`, the default value is determined by the toolset.

**`schema_generator`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`GenerateJsonSchema`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The JSON schema generator class to use for this tool. If `None`, the default value is determined by the toolset.

**`strict`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) for more info. If `None`, the default value is determined by the toolset.

**`sequential`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether the function requires a sequential/serial execution environment. Defaults to False. If `None`, the default value is determined by the toolset.

**`requires_approval`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) for more info. If `None`, the default value is determined by the toolset.

**`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. If `None`, the default value is determined by the toolset. If provided, it will be merged with the toolset's metadata.

**`timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Defaults to None (no timeout).

**`defer_loading`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether to hide this tool until it's discovered via tool search. See [Tool Search](/docs/ai/tools-toolsets/tools-advanced#tool-search) for more info. If `None`, the default value is determined by the toolset.

**`include_return_schema`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether to include the return schema in the tool definition sent to the model. If `None`, the default value is determined by the toolset.

##### instructions

```python
def instructions(func: SystemPromptFunc[AgentDepsT]) -> SystemPromptFunc[AgentDepsT]
```

Decorator to register an instructions function for this toolset.

The function can be sync or async, and can optionally take a [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as its first argument.

Example:

```python
from pydantic_ai import FunctionToolset, RunContext

toolset = FunctionToolset[int]()

@toolset.instructions
def my_instructions(ctx: RunContext[int]) -> str:
    return 'Always use the search tool when looking for information.'

@toolset.tool
def search(ctx: RunContext[int], query: str) -> str:
    return f'Results for: {query}'
```

###### Returns

`SystemPromptFunc`\[`AgentDepsT`\]

###### Parameters

**`func`** : `SystemPromptFunc`\[`AgentDepsT`\]

The instructions function to register.

##### add\_function

```python
def add_function(
    func: ToolFuncEither[AgentDepsT, ToolParams],
    takes_ctx: bool | None = None,
    name: str | None = None,
    description: str | None = None,
    retries: int | None = None,
    prepare: ToolPrepareFunc[AgentDepsT] | None = None,
    args_validator: ArgsValidatorFunc[AgentDepsT, ToolParams] | None = None,
    docstring_format: DocstringFormat | None = None,
    require_parameter_descriptions: bool | None = None,
    schema_generator: type[GenerateJsonSchema] | None = None,
    strict: bool | None = None,
    sequential: bool | None = None,
    requires_approval: bool | None = None,
    defer_loading: bool | None = None,
    metadata: dict[str, Any] | None = None,
    timeout: float | None = None,
    include_return_schema: bool | None = None,
) -> Tool[AgentDepsT]
```

Add a function as a tool to the toolset.

Can take a sync or async function.

The docstring is inspected to extract both the tool description and description of each parameter, [learn more](/docs/ai/tools-toolsets/tools#function-tools-and-schema).

###### Returns

[`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[`AgentDepsT`\]

###### Parameters

**`func`** : `ToolFuncEither`\[`AgentDepsT`, `ToolParams`\]

The tool function to register.

**`takes_ctx`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether the function takes a [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as its first argument. If `None`, this is inferred from the function signature.

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The name of the tool, defaults to the function name.

**`description`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The description of the tool, defaults to the function docstring.

**`retries`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1.

**`prepare`** : `ToolPrepareFunc`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolPrepareFunc).

**`args_validator`** : `ArgsValidatorFunc`\[`AgentDepsT`, `ToolParams`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

custom method to validate tool arguments after schema validation has passed, before execution. The validator receives the already-validated and type-converted parameters, with `RunContext` as the first argument. Should raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) on validation failure, return `None` on success. See [`ArgsValidatorFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ArgsValidatorFunc).

**`docstring_format`** : `DocstringFormat` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The format of the docstring, see [`DocstringFormat`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DocstringFormat). If `None`, the default value is determined by the toolset.

**`require_parameter_descriptions`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

If True, raise an error if a parameter description is missing. If `None`, the default value is determined by the toolset.

**`schema_generator`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`GenerateJsonSchema`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The JSON schema generator class to use for this tool. If `None`, the default value is determined by the toolset.

**`strict`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) for more info. If `None`, the default value is determined by the toolset.

**`sequential`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether the function requires a sequential/serial execution environment. Defaults to False. If `None`, the default value is determined by the toolset.

**`requires_approval`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) for more info. If `None`, the default value is determined by the toolset.

**`defer_loading`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether to hide this tool until it's discovered via tool search. See [Tool Search](/docs/ai/tools-toolsets/tools-advanced#tool-search) for more info. If `None`, the default value is determined by the toolset.

**`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. If `None`, the default value is determined by the toolset. If provided, it will be merged with the toolset's metadata.

**`timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Defaults to None (no timeout).

**`include_return_schema`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Whether to include the return schema in the tool definition sent to the model. If `None`, the default value is determined by the toolset.

##### add\_tool

```python
def add_tool(tool: Tool[AgentDepsT]) -> None
```

Add a tool to the toolset.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`tool`** : [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)\[`AgentDepsT`\]

The tool to add.

### IncludeReturnSchemasToolset

**Bases:** `PreparedToolset[AgentDepsT]`

A toolset that sets `include_return_schema=True` on all its tools.

See [toolset docs](/docs/ai/tools-toolsets/toolsets) for more information.

### DeferredLoadingToolset

**Bases:** `PreparedToolset[AgentDepsT]`

A toolset that marks tools for deferred loading, hiding them from the model until discovered via tool search.

See [toolset docs](/docs/ai/tools-toolsets/toolsets#deferred-loading) for more information.

#### Attributes

##### tool\_names

Optional set of tool names to mark for deferred loading. If `None`, all tools are marked for deferred loading.

**Type:** [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `tool_names`

### PrefixedToolset

**Bases:** `WrapperToolset[AgentDepsT]`

A toolset that prefixes the names of the tools it contains.

See [toolset docs](/docs/ai/tools-toolsets/toolsets#prefixing-tool-names) for more information.

### RenamedToolset

**Bases:** `WrapperToolset[AgentDepsT]`

A toolset that renames the tools it contains using a dictionary mapping new names to original names.

See [toolset docs](/docs/ai/tools-toolsets/toolsets#renaming-tools) for more information.

### SetMetadataToolset

**Bases:** `PreparedToolset[AgentDepsT]`

A toolset that merges metadata key-value pairs onto all its tools.

See [toolset docs](/docs/ai/tools-toolsets/toolsets) for more information.

### PreparedToolset

**Bases:** `WrapperToolset[AgentDepsT]`

A toolset that prepares the tools it contains using a prepare function that takes the agent context and the original tool definitions.

See [toolset docs](/docs/ai/tools-toolsets/toolsets#preparing-tool-definitions) for more information.

### WrapperToolset

**Bases:** `AbstractToolset[AgentDepsT]`

A toolset that wraps another toolset and delegates to it.

See [toolset docs](/docs/ai/tools-toolsets/toolsets#wrapping-a-toolset) for more information.

#### Methods

##### get\_instructions

`@async`

```python
def get_instructions(
    ctx: RunContext[AgentDepsT],
) -> str | InstructionPart | Sequence[str | InstructionPart] | None
```

Delegate instructions to the wrapped toolset.

This explicit delegation ensures type safety and proper propagation of the instructions from wrapped toolsets to the agent's system prompt.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`InstructionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.InstructionPart) | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`InstructionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.InstructionPart)\] | [`None`](https://docs.python.org/3/library/constants.html#None)

### ToolsetFunc

A sync/async function which takes a run context and returns a toolset.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[RunContext[AgentDepsT]], AbstractToolset[AgentDepsT] | None | Awaitable[AbstractToolset[AgentDepsT] | None]]`

### FastMCPToolset

**Bases:** `AbstractToolset[AgentDepsT]`

A FastMCP Toolset that uses the FastMCP Client to call tools from a local or remote MCP Server.

The Toolset can accept a FastMCP Client, a FastMCP Transport, or any other object which a FastMCP Transport can be created from.

See [https://gofastmcp.com/clients/transports](https://gofastmcp.com/clients/transports) for a full list of transports available.

#### Attributes

##### client

The FastMCP client to use.

**Type:** `Client`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

##### max\_retries

The maximum number of retries to attempt if a tool call fails.

If `None`, inherits the agent's default retry count at runtime.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `max_retries`

##### tool\_error\_behavior

The behavior to take when a tool error occurs.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['model\_retry', 'error'\] **Default:** `tool_error_behavior`

##### include\_instructions

Whether to include the server's instructions in the agent's instructions.

Defaults to `False` for backward compatibility.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `include_instructions`

##### include\_return\_schema

Whether to include return schemas in tool definitions sent to the model.

When `None` (default), defaults to `False` unless the [`IncludeToolReturnSchemas`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.IncludeToolReturnSchemas) capability is used.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `include_return_schema`

##### process\_tool\_call

Hook to customize tool calling and optionally pass extra metadata.

**Type:** `ProcessToolCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `process_tool_call`

##### instructions

Access the instructions sent by the FastMCP server during initialization.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

#### Methods

##### get\_instructions

`@async`

```python
def get_instructions(ctx: RunContext[AgentDepsT]) -> messages.InstructionPart | None
```

Return the FastMCP server's instructions for how to use its tools.

If [`include_instructions`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.fastmcp.FastMCPToolset.include_instructions) is `True`, returns the [`instructions`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.fastmcp.FastMCPToolset.instructions) sent by the FastMCP server during initialization. Otherwise, returns `None`.

Instructions from external servers are marked as dynamic since they may change between connections.

###### Returns

[`messages.InstructionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.InstructionPart) | [`None`](https://docs.python.org/3/library/constants.html#None) -- An `InstructionPart` with the server's instructions if `include_instructions` is enabled, otherwise `None`.

###### Parameters

**`ctx`** : [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)\[`AgentDepsT`\]

The run context for this agent run.

##### direct\_call\_tool

`@async`

```python
def direct_call_tool(
    name: str,
    args: dict[str, Any],
    metadata: dict[str, Any] | None = None,
) -> Any
```

Call a tool on the server.

###### Returns

[`Any`](https://docs.python.org/3/library/typing.html#typing.Any) -- The result of the tool call.

###### Parameters

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The name of the tool to call.

**`args`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

The arguments to pass to the tool.

**`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Request-level metadata (optional)

###### Raises

-   `ModelRetry` -- If the tool call fails.

---

# [pydantic_ai.usage](https://pydantic.dev/docs/ai/api/pydantic-ai/usage/)

# pydantic\_ai.usage

### UsageBase

#### Attributes

##### input\_tokens

Number of input/prompt tokens.

**Type:** [`Annotated`](https://docs.python.org/3/library/typing.html#typing.Annotated)\[[`int`](https://docs.python.org/3/library/functions.html#int), `Field`(`validation_alias`\=(`AliasChoices`(`input_tokens`, `request_tokens`)))\] **Default:** `0`

##### cache\_write\_tokens

Number of tokens written to the cache.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0`

##### cache\_read\_tokens

Number of tokens read from the cache.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0`

##### output\_tokens

Number of output/completion tokens.

**Type:** [`Annotated`](https://docs.python.org/3/library/typing.html#typing.Annotated)\[[`int`](https://docs.python.org/3/library/functions.html#int), `Field`(`validation_alias`\=(`AliasChoices`(`output_tokens`, `response_tokens`)))\] **Default:** `0`

##### input\_audio\_tokens

Number of audio input tokens.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0`

##### cache\_audio\_read\_tokens

Number of audio tokens read from the cache.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0`

##### output\_audio\_tokens

Number of audio output tokens.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0`

##### details

Any extra details returned by the model.

**Type:** [`Annotated`](https://docs.python.org/3/library/typing.html#typing.Annotated)\[[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`int`](https://docs.python.org/3/library/functions.html#int)\], `BeforeValidator`([`lambda`](https://docs.python.org/3/glossary.html#term-lambda) `d`: `d` [`or`](https://docs.python.org/3/reference/expressions.html#or) {})\] **Default:** `dataclasses.field(default_factory=(dict[str, int]))`

##### total\_tokens

Sum of `input_tokens + output_tokens`.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

#### Methods

##### \_\_copy\_\_

```python
def __copy__() -> UsageBase
```

Shallow copy that also copies mutable fields like `details`.

###### Returns

`UsageBase`

##### opentelemetry\_attributes

```python
def opentelemetry_attributes() -> dict[str, int]
```

Get the token usage values as OpenTelemetry attributes.

###### Returns

[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`int`](https://docs.python.org/3/library/functions.html#int)\]

##### has\_values

```python
def has_values() -> bool
```

Whether any values are set and non-zero.

###### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool)

### RequestUsage

**Bases:** `UsageBase`

LLM usage associated with a single request.

This is an implementation of `genai_prices.types.AbstractUsage` so it can be used to calculate the price of the request using [genai-prices](https://github.com/pydantic/genai-prices).

#### Methods

##### incr

```python
def incr(incr_usage: RequestUsage) -> None
```

Increment the usage in place.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`incr_usage`** : [`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage)

The usage to increment by.

##### \_\_add\_\_

```python
def __add__(other: RequestUsage) -> RequestUsage
```

Add two RequestUsages together.

This is provided so it's trivial to sum usage information from multiple parts of a response.

**WARNING:** this CANNOT be used to sum multiple requests without breaking some pricing calculations.

###### Returns

[`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage)

##### extract

`@classmethod`

```python
def extract(
    cls,
    data: Any,
    provider: str,
    provider_url: str,
    provider_fallback: str,
    api_flavor: str = 'default',
    details: dict[str, Any] | None = None,
) -> RequestUsage
```

Extract usage information from the response data using genai-prices.

###### Returns

[`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage)

###### Parameters

**`data`** : [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

The response data from the model API.

**`provider`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The actual provider ID

**`provider_url`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The provider base\_url

**`provider_fallback`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The fallback provider ID to use if the actual provider is not found in genai-prices. For example, an OpenAI model should set this to "openai" in case it has an obscure provider ID.

**`api_flavor`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `'default'`

The API flavor to use when extracting usage information, e.g. 'chat' or 'responses' for OpenAI.

**`details`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Becomes the `details` field on the returned `RequestUsage` for convenience.

### RunUsage

**Bases:** `UsageBase`

LLM usage associated with an agent run.

Responsibility for calculating request usage is on the model; Pydantic AI simply sums the usage information across requests.

#### Attributes

##### requests

Number of requests made to the LLM API.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0`

##### tool\_calls

Number of successful tool calls executed during the run.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0`

##### input\_tokens

Total number of input/prompt tokens.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0`

##### cache\_write\_tokens

Total number of tokens written to the cache.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0`

##### cache\_read\_tokens

Total number of tokens read from the cache.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0`

##### input\_audio\_tokens

Total number of audio input tokens.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0`

##### cache\_audio\_read\_tokens

Total number of audio tokens read from the cache.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0`

##### output\_tokens

Total number of output/completion tokens.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `0`

##### details

Any extra details returned by the model.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`int`](https://docs.python.org/3/library/functions.html#int)\] **Default:** `dataclasses.field(default_factory=(dict[str, int]))`

#### Methods

##### incr

```python
def incr(incr_usage: RunUsage | RequestUsage) -> None
```

Increment the usage in place.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`incr_usage`** : [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) | [`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage)

The usage to increment by.

##### \_\_add\_\_

```python
def __add__(other: RunUsage | RequestUsage) -> RunUsage
```

Add two RunUsages together.

This is provided so it's trivial to sum usage information from multiple runs.

###### Returns

[`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage)

### Usage

**Bases:** [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage)

Deprecated alias for `RunUsage`.

### UsageLimits

Limits on model usage.

The request count is tracked by pydantic\_ai, and the request limit is checked before each request to the model. Token counts are provided in responses from the model, and the token limits are checked after each response.

Each of the limits can be set to `None` to disable that limit.

#### Attributes

##### request\_limit

The maximum number of requests allowed to the model.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `request_limit`

##### tool\_calls\_limit

The maximum number of successful tool calls allowed to be executed.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `tool_calls_limit`

##### input\_tokens\_limit

The maximum number of input/prompt tokens allowed.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `input_tokens_limit if input_tokens_limit is not None else request_tokens_limit`

##### output\_tokens\_limit

The maximum number of output/response tokens allowed.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `output_tokens_limit if output_tokens_limit is not None else response_tokens_limit`

##### total\_tokens\_limit

The maximum number of tokens allowed in requests and responses combined.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `total_tokens_limit`

##### count\_tokens\_before\_request

If True, perform a token counting pass before sending the request to the model, to enforce `request_tokens_limit` ahead of time.

This may incur additional overhead (from calling the model's `count_tokens` API before making the actual request) and is disabled by default.

Supported by:

-   Anthropic
-   Google
-   Bedrock Converse

Support for OpenAI is in development: [https://github.com/pydantic/pydantic-ai/issues/3430](https://github.com/pydantic/pydantic-ai/issues/3430)

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `count_tokens_before_request`

#### Methods

##### has\_token\_limits

```python
def has_token_limits() -> bool
```

Returns `True` if this instance places any limits on token counts.

If this returns `False`, the `check_tokens` method will never raise an error.

This is useful because if we have token limits, we need to check them after receiving each streamed message. If there are no limits, we can skip that processing in the streaming response iterator.

###### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool)

##### check\_before\_request

```python
def check_before_request(usage: RunUsage) -> None
```

Raises a `UsageLimitExceeded` exception if the next request would exceed any of the limits.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

##### check\_tokens

```python
def check_tokens(usage: RunUsage) -> None
```

Raises a `UsageLimitExceeded` exception if the usage exceeds any of the token limits.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

##### check\_before\_tool\_call

```python
def check_before_tool_call(projected_usage: RunUsage) -> None
```

Raises a `UsageLimitExceeded` exception if the next tool call(s) would exceed the tool call limit.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

---

# [pydantic_evals.dataset](https://pydantic.dev/docs/ai/api/pydantic_evals/dataset/)

# pydantic\_evals.dataset

Dataset management for pydantic evals.

This module provides functionality for creating, loading, saving, and evaluating datasets of test cases. Each case must have inputs, and can optionally have a name, expected output, metadata, and case-specific evaluators.

Datasets can be loaded from and saved to YAML or JSON files, and can be evaluated against a task function to produce an evaluation report.

### Case

**Bases:** `Generic[InputsT, OutputT, MetadataT]`

A single row of a [`Dataset`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset).

Each case represents a single test scenario with inputs to test. A case may optionally specify a name, expected outputs to compare against, and arbitrary metadata.

Cases can also have their own specific evaluators which are run in addition to dataset-level evaluators.

Example:

```python
from pydantic_evals import Case

case = Case(
    name='Simple addition',
    inputs={'a': 1, 'b': 2},
    expected_output=3,
    metadata={'description': 'Tests basic addition'},
)
```

#### Attributes

##### name

Name of the case. This is used to identify the case in the report and can be used to filter cases.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `name`

##### inputs

Inputs to the task. This is the input to the task that will be evaluated.

**Type:** `InputsT` **Default:** `inputs`

##### metadata

Metadata to be used in the evaluation.

This can be used to provide additional information about the case to the evaluators.

**Type:** `MetadataT` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `metadata`

##### expected\_output

Expected output of the task. This is the expected output of the task that will be evaluated.

**Type:** `OutputT` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `expected_output`

##### evaluators

Evaluators to be used just on this case.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\] **Default:** `list(evaluators)`

#### Methods

##### \_\_init\_\_

```python
def __init__(
    name: str | None = None,
    inputs: InputsT,
    metadata: MetadataT | None = None,
    expected_output: OutputT | None = None,
    evaluators: tuple[Evaluator[InputsT, OutputT, MetadataT], ...] = (),
)
```

Initialize a new test case.

###### Parameters

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional name for the case. If not provided, a generic name will be assigned when added to a dataset.

**`inputs`** : `InputsT`

The inputs to the task being evaluated.

**`metadata`** : `MetadataT` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata for the case, which can be used by evaluators.

**`expected_output`** : `OutputT` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional expected output of the task, used for comparison in evaluators.

**`evaluators`** : [`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[`Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\], ...\] _Default:_ `()`

Tuple of evaluators specific to this case. These are in addition to any dataset-level evaluators.

### Dataset

**Bases:** `BaseModel`, `Generic[InputsT, OutputT, MetadataT]`

A dataset of test [cases](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case).

Datasets allow you to organize a collection of test cases and evaluate them against a task function. They can be loaded from and saved to YAML or JSON files, and can have dataset-level evaluators that apply to all cases.

Example:

```python
# Create a dataset with two test cases
from dataclasses import dataclass

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class ExactMatch(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return ctx.output == ctx.expected_output

dataset = Dataset(
    name='uppercase_tests',
    cases=[
        Case(name='test1', inputs={'text': 'Hello'}, expected_output='HELLO'),
        Case(name='test2', inputs={'text': 'World'}, expected_output='WORLD'),
    ],
    evaluators=[ExactMatch()],
)

# Evaluate the dataset against a task function
async def uppercase(inputs: dict) -> str:
    return inputs['text'].upper()

async def main():
    report = await dataset.evaluate(uppercase)
    report.print()
'''
   Evaluation Summary: uppercase
┏━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Case ID  ┃ Assertions ┃ Duration ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩
│ test1    │ ✔          │     10ms │
├──────────┼────────────┼──────────┤
│ test2    │ ✔          │     10ms │
├──────────┼────────────┼──────────┤
│ Averages │ 100.0% ✔   │     10ms │
└──────────┴────────────┴──────────┘
'''
```

#### Attributes

##### name

Name of the dataset. Required in future versions.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### cases

List of test cases in the dataset.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`Case`\[`InputsT`, `OutputT`, `MetadataT`\]\]

##### evaluators

List of evaluators to be used on all cases in the dataset.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\] **Default:** `[]`

##### report\_evaluators

Evaluators that operate on the full report to produce experiment-wide analyses.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`ReportEvaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\] **Default:** `[]`

#### Methods

##### \_\_init\_\_

```python
def __init__(
    name: str | None = None,
    cases: Sequence[Case[InputsT, OutputT, MetadataT]],
    evaluators: Sequence[Evaluator[InputsT, OutputT, MetadataT]] = (),
    report_evaluators: Sequence[ReportEvaluator[InputsT, OutputT, MetadataT]] = (),
)
```

Initialize a new dataset with test cases and optional evaluators.

###### Parameters

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Name for the dataset. Omitting this is deprecated and will raise an error in a future version.

**`cases`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`Case`\[`InputsT`, `OutputT`, `MetadataT`\]\]

Sequence of test cases to include in the dataset.

**`evaluators`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\] _Default:_ `()`

Optional sequence of evaluators to apply to all cases in the dataset.

**`report_evaluators`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`ReportEvaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\] _Default:_ `()`

Optional sequence of report evaluators that run on the full evaluation report.

##### evaluate

`@async`

```python
def evaluate(
    task: Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT],
    name: str | None = None,
    max_concurrency: int | None = None,
    progress: bool = True,
    retry_task: RetryConfig | None = None,
    retry_evaluators: RetryConfig | None = None,
    task_name: str | None = None,
    metadata: dict[str, Any] | None = None,
    repeat: int = 1,
    lifecycle: type[CaseLifecycle[InputsT, OutputT, MetadataT]] | None = None,
) -> EvaluationReport[InputsT, OutputT, MetadataT]
```

Evaluates the test cases in the dataset using the given task.

This method runs the task on each case in the dataset, applies evaluators, and collects results into a report. Cases are run concurrently, limited by `max_concurrency` if specified.

###### Returns

`EvaluationReport`\[`InputsT`, `OutputT`, `MetadataT`\] -- A report containing the results of the evaluation.

###### Parameters

**`task`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`InputsT`\], [`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[`OutputT`\]\] | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`InputsT`\], `OutputT`\]

The task to evaluate. This should be a callable that takes the inputs of the case and returns the output.

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The name of the experiment being run, this is used to identify the experiment in the report. If omitted, the task\_name will be used; if that is not specified, the name of the task function is used.

**`max_concurrency`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The maximum number of concurrent evaluations of the task to allow. If None, all cases will be evaluated concurrently.

**`progress`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to show a progress bar for the evaluation. Defaults to `True`.

**`retry_task`** : `RetryConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional retry configuration for the task execution.

**`retry_evaluators`** : `RetryConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional retry configuration for evaluator execution.

**`task_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional override to the name of the task being executed, otherwise the name of the task function will be used.

**`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional dict of experiment metadata.

**`repeat`** : [`int`](https://docs.python.org/3/library/functions.html#int) _Default:_ `1`

Number of times to run each case. When > 1, each case is run multiple times and results are grouped by the original case name for aggregation. Defaults to 1.

**`lifecycle`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`CaseLifecycle`\[`InputsT`, `OutputT`, `MetadataT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional lifecycle class for per-case setup, context preparation, and teardown hooks. A new instance is created for each case. See [`CaseLifecycle`](/docs/ai/api/pydantic_evals/lifecycle/#pydantic_evals.lifecycle.CaseLifecycle).

##### evaluate\_sync

```python
def evaluate_sync(
    task: Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT],
    name: str | None = None,
    max_concurrency: int | None = None,
    progress: bool = True,
    retry_task: RetryConfig | None = None,
    retry_evaluators: RetryConfig | None = None,
    task_name: str | None = None,
    metadata: dict[str, Any] | None = None,
    repeat: int = 1,
    lifecycle: type[CaseLifecycle[InputsT, OutputT, MetadataT]] | None = None,
) -> EvaluationReport[InputsT, OutputT, MetadataT]
```

Evaluates the test cases in the dataset using the given task.

This is a synchronous wrapper around [`evaluate`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluate) provided for convenience.

###### Returns

`EvaluationReport`\[`InputsT`, `OutputT`, `MetadataT`\] -- A report containing the results of the evaluation.

###### Parameters

**`task`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`InputsT`\], [`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[`OutputT`\]\] | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`InputsT`\], `OutputT`\]

The task to evaluate. This should be a callable that takes the inputs of the case and returns the output.

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The name of the experiment being run, this is used to identify the experiment in the report. If omitted, the task\_name will be used; if that is not specified, the name of the task function is used.

**`max_concurrency`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The maximum number of concurrent evaluations of the task to allow. If None, all cases will be evaluated concurrently.

**`progress`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to show a progress bar for the evaluation. Defaults to `True`.

**`retry_task`** : `RetryConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional retry configuration for the task execution.

**`retry_evaluators`** : `RetryConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional retry configuration for evaluator execution.

**`task_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional override to the name of the task being executed, otherwise the name of the task function will be used.

**`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional dict of experiment metadata.

**`repeat`** : [`int`](https://docs.python.org/3/library/functions.html#int) _Default:_ `1`

Number of times to run each case. When > 1, each case is run multiple times and results are grouped by the original case name for aggregation. Defaults to 1.

**`lifecycle`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`CaseLifecycle`\[`InputsT`, `OutputT`, `MetadataT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional lifecycle class for per-case setup, context preparation, and teardown hooks. A new instance is created for each case. See [`CaseLifecycle`](/docs/ai/api/pydantic_evals/lifecycle/#pydantic_evals.lifecycle.CaseLifecycle).

##### add\_case

```python
def add_case(
    name: str | None = None,
    inputs: InputsT,
    metadata: MetadataT | None = None,
    expected_output: OutputT | None = None,
    evaluators: tuple[Evaluator[InputsT, OutputT, MetadataT], ...] = (),
) -> None
```

Adds a case to the dataset.

This is a convenience method for creating a [`Case`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case) and adding it to the dataset.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional name for the case. If not provided, a generic name will be assigned.

**`inputs`** : `InputsT`

The inputs to the task being evaluated.

**`metadata`** : `MetadataT` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata for the case, which can be used by evaluators.

**`expected_output`** : `OutputT` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The expected output of the task, used for comparison in evaluators.

**`evaluators`** : [`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[`Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\], ...\] _Default:_ `()`

Tuple of evaluators specific to this case, in addition to dataset-level evaluators.

##### add\_evaluator

```python
def add_evaluator(
    evaluator: Evaluator[InputsT, OutputT, MetadataT],
    specific_case: str | None = None,
) -> None
```

Adds an evaluator to the dataset or a specific case.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`evaluator`** : `Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\]

The evaluator to add.

**`specific_case`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

If provided, the evaluator will only be added to the case with this name. If None, the evaluator will be added to all cases in the dataset.

###### Raises

-   `ValueError` -- If `specific_case` is provided but no case with that name exists in the dataset.

##### from\_file

`@classmethod`

```python
def from_file(
    cls,
    path: Path | str,
    fmt: Literal['yaml', 'json'] | None = None,
    custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (),
    custom_report_evaluator_types: Sequence[type[ReportEvaluator[InputsT, OutputT, MetadataT]]] = (),
) -> Self
```

Load a dataset from a file.

###### Returns

[`Self`](https://docs.python.org/3/library/typing.html#typing.Self) -- A new Dataset instance loaded from the file.

###### Parameters

**`path`** : `Path` | [`str`](https://docs.python.org/3/library/stdtypes.html#str)

Path to the file to load.

**`fmt`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['yaml', 'json'\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Format of the file. If None, the format will be inferred from the file extension. Must be either 'yaml' or 'json'.

**`custom_evaluator_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\]\] _Default:_ `()`

Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones.

**`custom_report_evaluator_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`ReportEvaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\]\] _Default:_ `()`

Custom report evaluator classes to use when deserializing the dataset. These are additional report evaluators beyond the default ones.

###### Raises

-   `ValidationError` -- If the file cannot be parsed as a valid dataset.
-   `ValueError` -- If the format cannot be inferred from the file extension.

##### from\_text

`@classmethod`

```python
def from_text(
    cls,
    contents: str,
    fmt: Literal['yaml', 'json'] = 'yaml',
    custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (),
    custom_report_evaluator_types: Sequence[type[ReportEvaluator[InputsT, OutputT, MetadataT]]] = (),
    default_name: str | None = None,
) -> Self
```

Load a dataset from a string.

###### Returns

[`Self`](https://docs.python.org/3/library/typing.html#typing.Self) -- A new Dataset instance parsed from the string.

###### Parameters

**`contents`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The string content to parse.

**`fmt`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['yaml', 'json'\] _Default:_ `'yaml'`

Format of the content. Must be either 'yaml' or 'json'.

**`custom_evaluator_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\]\] _Default:_ `()`

Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones.

**`custom_report_evaluator_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`ReportEvaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\]\] _Default:_ `()`

Custom report evaluator classes to use when deserializing the dataset. These are additional report evaluators beyond the default ones.

**`default_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Default name of the dataset, to be used if not specified in the serialized contents.

###### Raises

-   `ValidationError` -- If the content cannot be parsed as a valid dataset.

##### from\_dict

`@classmethod`

```python
def from_dict(
    cls,
    data: dict[str, Any],
    custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (),
    custom_report_evaluator_types: Sequence[type[ReportEvaluator[InputsT, OutputT, MetadataT]]] = (),
    default_name: str | None = None,
) -> Self
```

Load a dataset from a dictionary.

###### Returns

[`Self`](https://docs.python.org/3/library/typing.html#typing.Self) -- A new Dataset instance created from the dictionary.

###### Parameters

**`data`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

Dictionary representation of the dataset.

**`custom_evaluator_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\]\] _Default:_ `()`

Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones.

**`custom_report_evaluator_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`ReportEvaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\]\] _Default:_ `()`

Custom report evaluator classes to use when deserializing the dataset. These are additional report evaluators beyond the default ones.

**`default_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Default name of the dataset, to be used if not specified in the data.

###### Raises

-   `ValidationError` -- If the dictionary cannot be converted to a valid dataset.

##### to\_file

```python
def to_file(
    path: Path | str,
    fmt: Literal['yaml', 'json'] | None = None,
    schema_path: Path | str | None = DEFAULT_SCHEMA_PATH_TEMPLATE,
    custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (),
    custom_report_evaluator_types: Sequence[type[ReportEvaluator[InputsT, OutputT, MetadataT]]] = (),
)
```

Save the dataset to a file.

###### Parameters

**`path`** : `Path` | [`str`](https://docs.python.org/3/library/stdtypes.html#str)

Path to save the dataset to.

**`fmt`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['yaml', 'json'\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Format to use. If None, the format will be inferred from the file extension. Must be either 'yaml' or 'json'.

**`schema_path`** : `Path` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `DEFAULT_SCHEMA_PATH_TEMPLATE`

Path to save the JSON schema to. If None, no schema will be saved. Can be a string template with {stem} which will be replaced with the dataset filename stem.

**`custom_evaluator_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\]\] _Default:_ `()`

Custom evaluator classes to include in the schema.

**`custom_report_evaluator_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`ReportEvaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\]\] _Default:_ `()`

Custom report evaluator classes to include in the schema.

##### model\_json\_schema\_with\_evaluators

`@classmethod`

```python
def model_json_schema_with_evaluators(
    cls,
    custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (),
    custom_report_evaluator_types: Sequence[type[ReportEvaluator[InputsT, OutputT, MetadataT]]] = (),
) -> dict[str, Any]
```

Generate a JSON schema for this dataset type, including evaluator details.

This is useful for generating a schema that can be used to validate YAML-format dataset files.

###### Returns

[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- A dictionary representing the JSON schema.

###### Parameters

**`custom_evaluator_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\]\] _Default:_ `()`

Custom evaluator classes to include in the schema.

**`custom_report_evaluator_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`ReportEvaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\]\] _Default:_ `()`

Custom report evaluator classes to include in the schema.

### set\_eval\_attribute

```python
def set_eval_attribute(name: str, value: Any) -> None
```

Set an attribute on the current task run.

#### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

#### Parameters

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The name of the attribute.

**`value`** : [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

The value of the attribute.

### increment\_eval\_metric

```python
def increment_eval_metric(name: str, amount: int | float) -> None
```

Increment a metric on the current task run.

#### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

#### Parameters

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The name of the metric.

**`amount`** : [`int`](https://docs.python.org/3/library/functions.html#int) | [`float`](https://docs.python.org/3/library/functions.html#float)

The amount to increment by.

### InputsT

Generic type for the inputs to the task being evaluated.

**Default:** `TypeVar('InputsT', default=Any)`

### OutputT

Generic type for the expected output of the task being evaluated.

**Default:** `TypeVar('OutputT', default=Any)`

### MetadataT

Generic type for the metadata associated with the task being evaluated.

**Default:** `TypeVar('MetadataT', default=Any)`

### DEFAULT\_DATASET\_PATH

Default path for saving/loading datasets.

**Default:** `'./test_cases.yaml'`

### DEFAULT\_SCHEMA\_PATH\_TEMPLATE

Default template for schema file paths, where {stem} is replaced with the dataset filename stem.

**Default:** `'./{stem}_schema.json'`

---

# [pydantic_evals.evaluators](https://pydantic.dev/docs/ai/api/pydantic_evals/evaluators/)

# pydantic\_evals.evaluators

### ReportEvaluatorContext

**Bases:** `Generic[InputsT, OutputT, MetadataT]`

Context for report-level evaluation, containing the full experiment results.

#### Attributes

##### name

The experiment name.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### report

The full evaluation report.

**Type:** `EvaluationReport`\[`InputsT`, `OutputT`, `MetadataT`\]

##### experiment\_metadata

Experiment-level metadata.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None)

### Equals

**Bases:** `Evaluator[object, object, object]`

Check if the output exactly equals the provided value.

### EvaluatorContext

**Bases:** `Generic[InputsT, OutputT, MetadataT]`

Context for evaluating a task execution.

An instance of this class is the sole input to all Evaluators. It contains all the information needed to evaluate the task execution, including inputs, outputs, metadata, and telemetry data.

Evaluators use this context to access the task inputs, actual output, expected output, and other information when evaluating the result of the task execution.

Example:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class ExactMatch(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        # Use the context to access task inputs, outputs, and expected outputs
        return ctx.output == ctx.expected_output
```

#### Attributes

##### name

The name of the case.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### inputs

The inputs provided to the task for this case.

**Type:** `InputsT`

##### metadata

Metadata associated with the case, if provided. May be None if no metadata was specified.

**Type:** `MetadataT` | [`None`](https://docs.python.org/3/library/constants.html#None)

##### expected\_output

The expected output for the case, if provided. May be None if no expected output was specified.

**Type:** `OutputT` | [`None`](https://docs.python.org/3/library/constants.html#None)

##### output

The actual output produced by the task for this case.

**Type:** `OutputT`

##### duration

The duration of the task run for this case.

**Type:** [`float`](https://docs.python.org/3/library/functions.html#float)

##### attributes

Attributes associated with the task run for this case.

These can be set by calling `pydantic_evals.dataset.set_eval_attribute` in any code executed during the evaluation task.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

##### metrics

Metrics associated with the task run for this case.

These can be set by calling `pydantic_evals.dataset.increment_eval_metric` in any code executed during the evaluation task.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`int`](https://docs.python.org/3/library/functions.html#int) | [`float`](https://docs.python.org/3/library/functions.html#float)\]

##### span\_tree

Get the `SpanTree` for this task execution.

The span tree is a graph where each node corresponds to an OpenTelemetry span recorded during the task execution, including timing information and any custom spans created during execution.

**Type:** `SpanTree`

### EvaluationReason

The result of running an evaluator with an optional explanation.

Contains a scalar value and an optional "reason" explaining the value.

#### Constructor Parameters

**`value`** : `EvaluationScalar`

The scalar result of the evaluation (boolean, integer, float, or string).

**`reason`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

An optional explanation of the evaluation result.

### ReportEvaluator

**Bases:** `BaseEvaluator`, `Generic[InputsT, OutputT, MetadataT]`

Base class for experiment-wide evaluators that analyze full reports.

Unlike case-level Evaluators which assess individual task outputs, ReportEvaluators see all case results together and produce experiment-wide analyses like confusion matrices, precision-recall curves, or scalar statistics.

#### Methods

##### evaluate

`@abstractmethod`

```python
def evaluate(
    ctx: ReportEvaluatorContext[InputsT, OutputT, MetadataT],
) -> ReportAnalysis | list[ReportAnalysis] | Awaitable[ReportAnalysis | list[ReportAnalysis]]
```

Evaluate the full report and return experiment-wide analysis/analyses.

###### Returns

`ReportAnalysis` | [`list`](https://docs.python.org/3/glossary.html#term-list)\[`ReportAnalysis`\] | [`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[`ReportAnalysis` | [`list`](https://docs.python.org/3/glossary.html#term-list)\[`ReportAnalysis`\]\]

##### evaluate\_async

`@async`

```python
def evaluate_async(
    ctx: ReportEvaluatorContext[InputsT, OutputT, MetadataT],
) -> ReportAnalysis | list[ReportAnalysis]
```

Evaluate, handling both sync and async implementations.

###### Returns

`ReportAnalysis` | [`list`](https://docs.python.org/3/glossary.html#term-list)\[`ReportAnalysis`\]

### EqualsExpected

**Bases:** `Evaluator[object, object, object]`

Check if the output exactly equals the expected output.

### EvaluationResult

**Bases:** `Generic[EvaluationScalarT]`

The details of an individual evaluation result.

Contains the name, value, reason, and source evaluator for a single evaluation.

#### Constructor Parameters

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The name of the evaluation.

**`value`** : `EvaluationScalarT`

The scalar result of the evaluation.

**`reason`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

An optional explanation of the evaluation result.

**`source`** : `EvaluatorSpec`

The spec of the evaluator that produced this result.

**`evaluator_version`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional version tag for the evaluator that produced this result (e.g. `'v2'`). Sourced automatically from the `evaluator_version` class attribute on the `Evaluator` subclass. Lets online-evaluation dashboards filter out results from retired versions without deleting historical rows.

#### Methods

##### downcast

```python
def downcast(value_types: type[T] = ()) -> EvaluationResult[T] | None
```

Attempt to downcast this result to a more specific type.

###### Returns

`EvaluationResult`\[`T`\] | [`None`](https://docs.python.org/3/library/constants.html#None) -- A downcast version of this result if the value is an instance of one of the given types, `EvaluationResult`\[`T`\] | [`None`](https://docs.python.org/3/library/constants.html#None) -- otherwise None.

###### Parameters

**`*value_types`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`T`\] _Default:_ `()`

The types to check the value against.

### Contains

**Bases:** `Evaluator[object, object, object]`

Check if the output contains the expected output.

For strings, checks if expected\_output is a substring of output. For lists/tuples, checks if expected\_output is in output. For dicts, checks if all key-value pairs in expected\_output are in output. For model-like types (BaseModel, dataclasses), converts to a dict and checks key-value pairs.

Note: case\_sensitive only applies when both the value and output are strings.

### ConfusionMatrixEvaluator

**Bases:** `ReportEvaluator`

Computes a confusion matrix from case data.

### EvaluatorFailure

Represents a failure raised during the execution of an evaluator.

#### Attributes

##### evaluator\_version

Optional version tag for the evaluator that raised (e.g. `'v2'`). Sourced automatically from the `evaluator_version` class attribute on the `Evaluator` subclass.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### error\_type

Class name of the exception that caused the failure (e.g. `'ValueError'`). Populated automatically when `EvaluatorFailure` is constructed from a caught exception; surfaced as the `error.type` attribute on emitted OTel events.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

### Evaluator

**Bases:** `BaseEvaluator`, `Generic[InputsT, OutputT, MetadataT]`

Base class for all evaluators.

Evaluators can assess the performance of a task in a variety of ways, as a function of the EvaluatorContext.

Subclasses must implement the `evaluate` method. Note it can be defined with either `def` or `async def`.

Example:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class ExactMatch(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return ctx.output == ctx.expected_output
```

Optional class-level attributes:

-   `evaluation_name`: override the default name used in reports for this evaluator's output (only applies when `evaluate` returns a scalar or `EvaluationReason` -- mapping outputs always use their own keys).
-   `evaluator_version`: an optional version tag (e.g. `'v2'`) describing the evaluator's behavior. Propagated to online-evaluation sinks so dashboards can filter out results produced by retired versions without deleting rows. Applies to every result the evaluator emits. Bump whenever behavior changes in a way that invalidates prior scores.

Example:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class LLMJudge(Evaluator):
    evaluator_version = 'v2'  # bumped after prompt rewrite

    def evaluate(self, ctx: EvaluatorContext) -> bool: ...
```

#### Methods

##### name

`@classmethod`

`@deprecated`

```python
def name(cls) -> str
```

`name` has been renamed, use `get_serialization_name` instead.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### get\_default\_evaluation\_name

```python
def get_default_evaluation_name() -> str
```

Return the default name to use in reports for the output of this evaluator.

By default, if the evaluator has an attribute called `evaluation_name` of type string, that will be used. Otherwise, the serialization name of the evaluator (which is usually the class name) will be used.

This can be overridden to get a more descriptive name in evaluation reports, e.g. using instance information.

Note that evaluators that return a mapping of results will always use the keys of that mapping as the names of the associated evaluation results.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### evaluate

`@abstractmethod`

```python
def evaluate(
    ctx: EvaluatorContext[InputsT, OutputT, MetadataT],
) -> EvaluatorOutput | Awaitable[EvaluatorOutput]
```

Evaluate the task output in the given context.

This is the main evaluation method that subclasses must implement. It can be either synchronous or asynchronous, returning either an EvaluatorOutput directly or an Awaitable\[EvaluatorOutput\].

###### Returns

`EvaluatorOutput` | [`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[`EvaluatorOutput`\] -- The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping `EvaluatorOutput` | [`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[`EvaluatorOutput`\] -- of evaluation names to either of those. Can be returned either synchronously or as an `EvaluatorOutput` | [`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[`EvaluatorOutput`\] -- awaitable for asynchronous evaluation.

###### Parameters

**`ctx`** : `EvaluatorContext`\[`InputsT`, `OutputT`, `MetadataT`\]

The context containing the inputs, outputs, and metadata for evaluation.

##### evaluate\_sync

```python
def evaluate_sync(ctx: EvaluatorContext[InputsT, OutputT, MetadataT]) -> EvaluatorOutput
```

Run the evaluator synchronously, handling both sync and async implementations.

This method ensures synchronous execution by running any async evaluate implementation to completion using run\_until\_complete.

###### Returns

`EvaluatorOutput` -- The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping `EvaluatorOutput` -- of evaluation names to either of those.

###### Parameters

**`ctx`** : `EvaluatorContext`\[`InputsT`, `OutputT`, `MetadataT`\]

The context containing the inputs, outputs, and metadata for evaluation.

##### evaluate\_async

`@async`

```python
def evaluate_async(
    ctx: EvaluatorContext[InputsT, OutputT, MetadataT],
) -> EvaluatorOutput
```

Run the evaluator asynchronously, handling both sync and async implementations.

This method ensures asynchronous execution by properly awaiting any async evaluate implementation. For synchronous implementations, it returns the result directly.

###### Returns

`EvaluatorOutput` -- The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping `EvaluatorOutput` -- of evaluation names to either of those.

###### Parameters

**`ctx`** : `EvaluatorContext`\[`InputsT`, `OutputT`, `MetadataT`\]

The context containing the inputs, outputs, and metadata for evaluation.

### IsInstance

**Bases:** `Evaluator[object, object, object]`

Check if the output is an instance of a type with the given name.

### MaxDuration

**Bases:** `Evaluator[object, object, object]`

Check if the execution time is under the specified maximum.

### PrecisionRecallEvaluator

**Bases:** `ReportEvaluator`

Computes a precision-recall curve from case data.

Returns both a `PrecisionRecall` chart and a `ScalarResult` with the AUC value. The AUC is computed at full resolution (every unique score threshold) for accuracy, while the chart points are downsampled to `n_thresholds` for display.

### OutputConfig

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Configuration for the score and assertion outputs of the LLMJudge evaluator.

### LLMJudge

**Bases:** `Evaluator[object, object, object]`

Judge whether the output of a language model meets the criteria of a provided rubric.

If you do not specify a model, it uses the default model for judging. This starts as 'openai:gpt-5.2', but can be overridden by calling [`set_default_judge_model`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.llm_as_a_judge.set_default_judge_model).

### ROCAUCEvaluator

**Bases:** `ReportEvaluator`

Computes an ROC curve and AUC from case data.

Returns a `LinePlot` with the ROC curve (plus a dashed random-baseline diagonal) and a `ScalarResult` with the AUC value.

### HasMatchingSpan

**Bases:** `Evaluator[object, object, object]`

Check if the span tree contains a span that matches the specified query.

### KolmogorovSmirnovEvaluator

**Bases:** `ReportEvaluator`

Computes a Kolmogorov-Smirnov plot and statistic from case data.

Plots the empirical CDFs of the score distribution for positive and negative cases, and computes the KS statistic (maximum vertical distance between the two CDFs).

Returns a `LinePlot` with the two CDF curves and a `ScalarResult` with the KS statistic.

### EvaluatorSpec

The specification of an evaluator to be run.

This class is used to represent evaluators in a serializable format, supporting various short forms for convenience when defining evaluators in YAML or JSON dataset files.

In particular, each of the following forms is supported for specifying an evaluator with name `MyEvaluator`:

-   `'MyEvaluator'` - Just the (string) name of the Evaluator subclass is used if its `__init__` takes no arguments
-   `{'MyEvaluator': first_arg}` - A single argument is passed as the first positional argument to `MyEvaluator.__init__`
-   `{'MyEvaluator': {k1: v1, k2: v2}}` - Multiple kwargs are passed to `MyEvaluator.__init__`

**Default:** `NamedSpec`

### EvaluatorOutput

Type for the output of an evaluator, which can be a scalar, an EvaluationReason, or a mapping of names to either.

**Default:** `EvaluationScalar | EvaluationReason | Mapping[str, EvaluationScalar | EvaluationReason]`

### GradingOutput

**Bases:** `BaseModel`

The output of a grading operation.

### judge\_output

`@async`

```python
def judge_output(
    output: Any,
    rubric: str,
    model: models.Model | models.KnownModelName | str | None = None,
    model_settings: ModelSettings | None = None,
) -> GradingOutput
```

Judge the output of a model based on a rubric.

If the model is not specified, a default model is used. The default model starts as 'openai:gpt-5.2', but this can be changed using the `set_default_judge_model` function.

#### Returns

`GradingOutput`

### judge\_input\_output

`@async`

```python
def judge_input_output(
    inputs: Any,
    output: Any,
    rubric: str,
    model: models.Model | models.KnownModelName | str | None = None,
    model_settings: ModelSettings | None = None,
) -> GradingOutput
```

Judge the output of a model based on the inputs and a rubric.

If the model is not specified, a default model is used. The default model starts as 'openai:gpt-5.2', but this can be changed using the `set_default_judge_model` function.

#### Returns

`GradingOutput`

### judge\_input\_output\_expected

`@async`

```python
def judge_input_output_expected(
    inputs: Any,
    output: Any,
    expected_output: Any,
    rubric: str,
    model: models.Model | models.KnownModelName | str | None = None,
    model_settings: ModelSettings | None = None,
) -> GradingOutput
```

Judge the output of a model based on the inputs and a rubric.

If the model is not specified, a default model is used. The default model starts as 'openai:gpt-5.2', but this can be changed using the `set_default_judge_model` function.

#### Returns

`GradingOutput`

### judge\_output\_expected

`@async`

```python
def judge_output_expected(
    output: Any,
    expected_output: Any,
    rubric: str,
    model: models.Model | models.KnownModelName | str | None = None,
    model_settings: ModelSettings | None = None,
) -> GradingOutput
```

Judge the output of a model based on the expected output, output, and a rubric.

If the model is not specified, a default model is used. The default model starts as 'openai:gpt-5.2', but this can be changed using the `set_default_judge_model` function.

#### Returns

`GradingOutput`

### set\_default\_judge\_model

```python
def set_default_judge_model(model: models.Model | models.KnownModelName) -> None
```

Set the default model used for judging.

This model is used if `None` is passed to the `model` argument of `judge_output` and `judge_input_output`.

#### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

---

# [pydantic_evals.generation](https://pydantic.dev/docs/ai/api/pydantic_evals/generation/)

# pydantic\_evals.generation

Utilities for generating example datasets for pydantic\_evals.

This module provides functions for generating sample datasets for testing and examples, using LLMs to create realistic test data with proper structure.

### generate\_dataset

`@async`

```python
def generate_dataset(
    dataset_type: type[Dataset[InputsT, OutputT, MetadataT]],
    path: Path | str | None = None,
    custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (),
    model: models.Model | models.KnownModelName = 'openai:gpt-5.2',
    n_examples: int = 3,
    extra_instructions: str | None = None,
) -> Dataset[InputsT, OutputT, MetadataT]
```

Use an LLM to generate a dataset of test cases, each consisting of input, expected output, and metadata.

This function creates a properly structured dataset with the specified input, output, and metadata types. It uses an LLM to attempt to generate realistic test cases that conform to the types' schemas.

#### Returns

`Dataset`\[`InputsT`, `OutputT`, `MetadataT`\] -- A properly structured Dataset object with generated test cases.

#### Parameters

**`path`** : `Path` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional path to save the generated dataset. If provided, the dataset will be saved to this location.

**`dataset_type`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`Dataset`\[`InputsT`, `OutputT`, `MetadataT`\]\]

The type of dataset to generate, with the desired input, output, and metadata types.

**`custom_evaluator_types`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`Evaluator`\[`InputsT`, `OutputT`, `MetadataT`\]\]\] _Default:_ `()`

Optional sequence of custom evaluator classes to include in the schema.

**`model`** : [`models.Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) | [`models.KnownModelName`](/docs/ai/api/models/base/#pydantic_ai.models.KnownModelName) _Default:_ `'openai:gpt-5.2'`

The Pydantic AI model to use for generation. Defaults to 'openai:gpt-5.2'.

**`n_examples`** : [`int`](https://docs.python.org/3/library/functions.html#int) _Default:_ `3`

Number of examples to generate. Defaults to 3.

**`extra_instructions`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional instructions to provide to the LLM.

#### Raises

-   `ValidationError` -- If the LLM's response cannot be parsed as a valid dataset.

### InputsT

Generic type for the inputs to the task being evaluated.

**Default:** `TypeVar('InputsT', default=Any)`

### OutputT

Generic type for the expected output of the task being evaluated.

**Default:** `TypeVar('OutputT', default=Any)`

### MetadataT

Generic type for the metadata associated with the task being evaluated.

**Default:** `TypeVar('MetadataT', default=Any)`

---

# [pydantic_evals.lifecycle](https://pydantic.dev/docs/ai/api/pydantic_evals/lifecycle/)

# pydantic\_evals.lifecycle

Case lifecycle hooks for pydantic evals.

This module provides the [`CaseLifecycle`](/docs/ai/api/pydantic_evals/lifecycle/#pydantic_evals.lifecycle.CaseLifecycle) class, which allows defining setup, context preparation, and teardown hooks that run at different stages of case evaluation.

### CaseLifecycle

**Bases:** `Generic[InputsT, OutputT, MetadataT]`

Per-case lifecycle hooks for evaluation.

A new instance is created for each case during evaluation. Subclass and override any methods you need -- all methods are no-ops by default.

The evaluation flow for each case is:

1.  `setup()` -- called before task execution
2.  Task runs
3.  `prepare_context()` -- called after task, before evaluators; can enrich metrics/attributes
4.  Evaluators run
5.  `teardown()` -- called after evaluators complete; receives the full result (or `None` when interrupted)

Exceptions raised by `setup()` or `prepare_context()` are caught and recorded as a `ReportCaseFailure`; `teardown()` is still called afterward so you can clean up. Exceptions raised by `teardown()` propagate to the caller and may abort the evaluation. If your teardown may raise and you don't want it to crash the evaluation run, handle exceptions within your `teardown()` implementation itself.

#### Constructor Parameters

**`case`** : `Case`\[`InputsT`, `OutputT`, `MetadataT`\]

The case being evaluated. Available as `self.case` in all hooks.

#### Attributes

##### case

The case being evaluated.

**Type:** `Case`\[`InputsT`, `OutputT`, `MetadataT`\]

#### Methods

##### setup

`@async`

```python
def setup() -> None
```

Called before task execution.

Override to perform per-case resource setup (e.g., create a test database, start a service). The case metadata is available via `self.case.metadata`.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

##### prepare\_context

`@async`

```python
def prepare_context(
    ctx: EvaluatorContext[InputsT, OutputT, MetadataT],
) -> EvaluatorContext[InputsT, OutputT, MetadataT]
```

Called after the task completes, before evaluators run.

Override to enrich the evaluator context with additional metrics or attributes derived from the task output, span tree, or external state.

###### Returns

`EvaluatorContext`\[`InputsT`, `OutputT`, `MetadataT`\] -- The (possibly modified) evaluator context to pass to evaluators.

###### Parameters

**`ctx`** : `EvaluatorContext`\[`InputsT`, `OutputT`, `MetadataT`\]

The evaluator context produced by the task run.

##### teardown

`@async`

```python
def teardown(
    result: ReportCase[InputsT, OutputT, MetadataT] | ReportCaseFailure[InputsT, OutputT, MetadataT] | None,
) -> None
```

Called after evaluators complete.

Override to perform per-case resource cleanup. The result is provided so that teardown logic can vary based on success/failure (e.g., keep resources up for inspection on failure).

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`result`** : `ReportCase`\[`InputsT`, `OutputT`, `MetadataT`\] | `ReportCaseFailure`\[`InputsT`, `OutputT`, `MetadataT`\] | [`None`](https://docs.python.org/3/library/constants.html#None)

The evaluation result -- a `ReportCase` (success), `ReportCaseFailure`, or `None` if the run ended without a report object (e.g. cancellation).

---

# [pydantic_evals.online](https://pydantic.dev/docs/ai/api/pydantic_evals/online/)

# pydantic\_evals.online

Online evaluation -- attach evaluators to live functions for automatic background evaluation.

This module provides the infrastructure for running evaluators on production (or staging) traffic. The same `Evaluator` instances used with `Dataset.evaluate()` work here, the difference is in how they are wired up (decorator vs dataset) rather than what they are.

Example:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext
from pydantic_evals.online import evaluate


@dataclass
class IsNonEmpty(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return bool(ctx.output)


@evaluate(IsNonEmpty())
async def my_function(x: int) -> int:
    return x
```

### SinkPayload

Container passed to [`EvaluationSink.submit`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.EvaluationSink.submit).

Do not instantiate directly

`SinkPayload` is constructed internally by pydantic-evals. We reserve the right to add fields in any release -- if you build your own instances, a future version may break your code. Sink implementations should accept the payload as-is and read only the fields they need.

#### Attributes

##### results

Evaluation results from the evaluator run.

**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`EvaluationResult`\]

##### failures

Failures from the evaluator run if it raised.

**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`EvaluatorFailure`\]

##### context

The full evaluator context for the function call.

**Type:** `EvaluatorContext`

##### span\_reference

Reference to the OTel span for the function call, if available.

**Type:** `SpanReference` | [`None`](https://docs.python.org/3/library/constants.html#None)

##### target

Identifies the function/agent being evaluated, supplied by the `@evaluate` decorator (defaults resolved at decoration time).

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

### EvaluationSink

**Bases:** [`Protocol`](https://docs.python.org/3/library/typing.html#typing.Protocol)

Protocol for **additional** evaluation result destinations.

By default, online evaluation emits `gen_ai.evaluation.result` OTel events for every evaluator run -- no sink registration required. Sinks are the escape hatch for custom handling _in addition to_ OTel emission: in-memory test capture, fan-out to Slack/DB, non-OTel backends, alerting pipelines, etc. See [`OnlineEvalConfig.default_sink`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvalConfig.default_sink).

To disable the default OTel emission (e.g. in tests that only want to assert on a custom sink), set [`emit_otel_events=False`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvalConfig.emit_otel_events) on the config.

#### Methods

##### submit

`@async`

```python
def submit(payload: SinkPayload) -> None
```

Submit evaluation results to the sink.

The payload may include results from one or more evaluators that ran for a given function call -- when multiple evaluators share this sink, their results are batched into a single `submit()` call. Each result carries enough metadata (name, evaluator version, source) to be attributed downstream; the exact batching behavior is an implementation detail and may change.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`payload`** : `SinkPayload`

A [`SinkPayload`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.SinkPayload) bundling results, failures, context, span reference, and target. Sinks should read only the fields they need; new fields may be added in future releases.

### SamplingContext

Context available when deciding whether to sample an evaluator.

Contains the information available _before_ the decorated function runs -- the evaluator instance, function inputs, config metadata, and a per-call random seed. The function's output and duration are not yet available at sampling time.

#### Attributes

##### evaluator

The evaluator being sampled.

**Type:** `Evaluator`

##### inputs

The inputs to the decorated function.

**Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

##### metadata

Metadata from the [`OnlineEvalConfig`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvalConfig), if set.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### call\_seed

A uniform random value in \[0, 1) generated once per decorated function call.

Shared across all evaluators for the same call. In `'correlated'` sampling mode this is used automatically; in `'independent'` mode it is available for custom `sample_rate` callables that want to implement their own correlated logic.

**Type:** [`float`](https://docs.python.org/3/library/functions.html#float)

### CallbackSink

An `EvaluationSink` that delegates to a user-provided callable.

The callback receives the results, failures, and context. Other fields on the [`SinkPayload`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.SinkPayload) (such as `span_reference` and `target`) are not passed -- use a custom `EvaluationSink` implementation if you need them.

### SpanReference

Identifies a span that evaluation results should be associated with.

Used by sinks to associate evaluation results with the original function execution span.

#### Attributes

##### trace\_id

The trace ID of the span.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### span\_id

The span ID of the span.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

### OnlineEvaluator

Wraps an `Evaluator` with per-evaluator online configuration.

Different evaluators often need different settings -- a cheap heuristic should run on 100% of traffic while an expensive LLM judge might run on only 1%.

#### Attributes

##### evaluator

The evaluator to run.

To version an evaluator, set `evaluator_version` as a class attribute on the `Evaluator` subclass itself (see `Evaluator` docstring). The framework reads it via `getattr` at dispatch time and propagates it to sinks alongside each result.

**Type:** `Evaluator`

##### sample\_rate

Probability of running this evaluator (0.0-1.0), or a callable returning a float or bool.

When a callable, it receives a [`SamplingContext`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.SamplingContext) with the function inputs, config metadata, and evaluator name -- but not the output or duration (which aren't available yet at sampling time).

Defaults to `None`, which uses the config's `default_sample_rate` at each call. Set explicitly to override.

**Type:** [`float`](https://docs.python.org/3/library/functions.html#float) | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`SamplingContext`\], [`float`](https://docs.python.org/3/library/functions.html#float) | [`bool`](https://docs.python.org/3/library/functions.html#bool)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### max\_concurrency

Maximum number of concurrent evaluations for this evaluator.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int) **Default:** `10`

##### sink

Override additional sink(s) for this evaluator. If `None`, the config's `default_sink` is used.

Sinks are _additive_ to the default OTel event emission -- not replacements. See [`EvaluationSink`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.EvaluationSink).

**Type:** `EvaluationSink` | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`EvaluationSink` | `SinkCallback`\] | `SinkCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### on\_max\_concurrency

Called when an evaluation is dropped because `max_concurrency` was reached.

Receives the `EvaluatorContext` that would have been evaluated. Can be sync or async. If `None` (the default), dropped evaluations are silently ignored.

**Type:** `OnMaxConcurrencyCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### on\_sampling\_error

Called synchronously when a `sample_rate` callable raises an exception.

Receives the exception and the evaluator. Must be sync (not async), since sampling runs before the decorated function. If set, the evaluator is skipped. If `None`, uses the config's `on_sampling_error` default. If neither is set, the exception propagates to the caller.

**Type:** `OnSamplingErrorCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### on\_error

Called when an exception occurs in a sink or on\_max\_concurrency callback.

Receives the exception, evaluator context, evaluator instance, and a location string (see [`OnErrorLocation`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnErrorLocation)). Can be sync or async. `'sink'` covers both custom sink failures and the rarer default OTel event emission failures -- the value is intentionally broad. If `None`, uses the config's `on_error` default. If neither is set, exceptions are silently suppressed.

**Type:** `OnErrorCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

### EvaluatorContextSource

**Bases:** [`Protocol`](https://docs.python.org/3/library/typing.html#typing.Protocol)

Protocol for retrieving stored evaluator contexts.

Implementations reconstruct [`EvaluatorContext`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorContext) objects from stored traces (e.g., Logfire). The batch method allows fetching contexts for multiple spans in a single call.

#### Methods

##### fetch

`@async`

```python
def fetch(span: SpanReference) -> EvaluatorContext
```

Fetch an evaluator context for a single span.

###### Returns

`EvaluatorContext` -- The evaluator context for the span.

###### Parameters

**`span`** : `SpanReference`

Reference to the span to fetch context for.

##### fetch\_many

`@async`

```python
def fetch_many(spans: Sequence[SpanReference]) -> list[EvaluatorContext]
```

Fetch evaluator contexts for multiple spans in a single batch.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`EvaluatorContext`\] -- Evaluator contexts in the same order as the input spans.

###### Parameters

**`spans`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`SpanReference`\]

References to the spans to fetch context for.

### OnlineEvalConfig

Holds cross-evaluator defaults for online evaluation.

Create instances for different evaluation configurations, or use the global `DEFAULT_CONFIG` via the module-level `evaluate()` and `configure()` functions.

#### Attributes

##### default\_sink

Additional sink(s) to receive results, for evaluators that don't specify their own.

Sinks run _in addition to_ the default `gen_ai.evaluation.result` OTel event emission -- they are the escape hatch for custom destinations (in-memory test capture, fan-out to Slack/DB, non-OTel backends). To disable OTel emission itself, set [`emit_otel_events=False`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvalConfig.emit_otel_events).

**Type:** `EvaluationSink` | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`EvaluationSink` | `SinkCallback`\] | `SinkCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### default\_sample\_rate

Default sample rate for evaluators that don't specify their own.

**Type:** [`float`](https://docs.python.org/3/library/functions.html#float) | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`SamplingContext`\], [`float`](https://docs.python.org/3/library/functions.html#float) | [`bool`](https://docs.python.org/3/library/functions.html#bool)\] **Default:** `1.0`

##### emit\_otel\_events

Whether to emit `gen_ai.evaluation.result` OTel events for every evaluator run.

When `True` (the default), dispatch emits one OTel log event per `EvaluationResult` or `EvaluatorFailure`, following the [OTel GenAI evaluation semconv](https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-events/#event-gen_aievaluationresult). If no OTel SDK is configured in the process, emission is a cheap no-op.

Set to `False` to disable -- useful for tests that want to assert on a custom sink alone, or in environments where OTel emission is undesirable. Custom sinks registered via `default_sink` still run regardless of this flag. With `emit_otel_events=False` AND no sinks configured, dispatch short-circuits entirely (the evaluator never runs) since results would have nowhere to go.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `True`

##### include\_baggage

Whether to copy OTel baggage entries onto every emitted evaluation event.

When `True` (the default), each emitted `gen_ai.evaluation.result` event also carries the keys present in the current OTel baggage as attributes -- useful for propagating tenant/user/request identifiers from the calling context. Standard `gen_ai.*` and `error.type` attributes always win on conflict, so baggage cannot accidentally overwrite the semantic-convention attributes.

Set to `False` to skip the baggage snapshot per event.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `True`

##### sampling\_mode

Controls how per-evaluator sample rates interact for a single call.

-   `'independent'` (default): each evaluator decides independently.
-   `'correlated'`: a shared random seed is used so that lower-rate evaluators' calls are a subset of higher-rate ones, minimising total overhead.

See [`SamplingMode`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.SamplingMode) for details.

**Type:** `SamplingMode` **Default:** `'independent'`

##### enabled

Whether online evaluation is enabled for this config.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `True`

##### metadata

Optional metadata to include in evaluator contexts.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### on\_max\_concurrency

Default handler called when an evaluation is dropped because `max_concurrency` was reached.

Receives the `EvaluatorContext` that would have been evaluated. Can be sync or async. If `None` (the default), dropped evaluations are silently ignored. Per-evaluator `OnlineEvaluator.on_max_concurrency` overrides this default.

**Type:** `OnMaxConcurrencyCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### on\_sampling\_error

Default handler called synchronously when a `sample_rate` callable raises.

Receives the exception and the evaluator. Must be sync (not async). If set, the evaluator is skipped. If `None` (the default), the exception propagates to the caller. Per-evaluator `OnlineEvaluator.on_sampling_error` overrides this default.

**Type:** `OnSamplingErrorCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### on\_error

Default handler called when an exception occurs in a sink or on\_max\_concurrency callback.

Receives the exception, evaluator context, evaluator instance, and a location string (see [`OnErrorLocation`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnErrorLocation)). Can be sync or async. `'sink'` covers both custom sink failures and the rarer default OTel event emission failures -- the value is intentionally broad. If `None` (the default), exceptions are silently suppressed. Per-evaluator `OnlineEvaluator.on_error` overrides this default.

**Type:** `OnErrorCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

#### Methods

##### evaluate

```python
def evaluate(
    evaluators: Evaluator | OnlineEvaluator = (),
    target: str | None = None,
    msg_template: LiteralString | None = None,
    span_name: str | None = None,
    extract_args: bool | Iterable[str] = False,
    record_return: bool = False,
) -> Callable[[Callable[_P, _R]], Callable[_P, _R]]
```

Decorator to attach online evaluators to a function.

Each decorated call opens a dedicated span representing the function invocation -- evaluator events are parented to this span, and the span itself appears in the user's configured OTel/logfire traces.

Bare `Evaluator` instances are auto-wrapped in `OnlineEvaluator` at decoration time (so concurrency semaphores are shared across calls). Their `sample_rate` defaults to `None`, which resolves to the config's `default_sample_rate` at each call -- so changes to the config after decoration take effect.

To version an evaluator, set `evaluator_version` on the `Evaluator` subclass itself -- the framework reads it at dispatch time and records it on every [`EvaluationResult`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluationResult) and [`EvaluatorFailure`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorFailure) the evaluator emits:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext
from pydantic_evals.online import evaluate


@dataclass
class Tone(Evaluator):
    evaluator_version = 'v2'

    def evaluate(self, ctx: EvaluatorContext) -> str:
        return 'neutral'


@evaluate(Tone())
async def summarize(text: str) -> str:
    return text
```

###### Returns

[`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[`_P`, `_R`\]\], [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[`_P`, `_R`\]\] -- A decorator that wraps the function with online evaluation.

###### Parameters

**`*evaluators`** : `Evaluator` | `OnlineEvaluator` _Default:_ `()`

Evaluators to attach. Can be `Evaluator` or `OnlineEvaluator` instances.

**`target`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Name of the thing being evaluated. Written to sinks and emitted OTel events as `gen_ai.evaluation.target`. Defaults to the decorated function's `__name__` when omitted.

**`msg_template`** : [`LiteralString`](https://docs.python.org/3/library/typing.html#typing.LiteralString) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Template for the call span's message. Defaults to `"Calling {module}.{qualname}"` like `@logfire.instrument`. When logfire is installed, `{arg=}`\-style placeholders in the template are formatted against the function's arguments.

**`span_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override for the call span's name. Defaults to `msg_template`.

**`extract_args`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`Iterable`](https://docs.python.org/3/library/typing.html#typing.Iterable)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] _Default:_ `False`

Whether to record function arguments as span attributes. `False` (default) records nothing; `True` records all bound arguments; an iterable of names records only those arguments. Requires logfire to be installed so arguments are serialised with their JSON schema -- raises `RuntimeError` at decoration time otherwise.

**`record_return`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether to record the function's return value as a `return` span attribute. Requires logfire for the same reason as `extract_args`.

##### should\_evaluate

```python
def should_evaluate() -> bool
```

Whether evaluators with this config should run, based on the current settings and context.

###### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool)

### disable\_evaluation

```python
def disable_evaluation() -> Iterator[None]
```

Context manager to disable all online evaluation in the current context.

When active, decorated functions still execute normally but no evaluators are dispatched.

#### Returns

[`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[[`None`](https://docs.python.org/3/library/constants.html#None)\]

### run\_evaluators

`@async`

```python
def run_evaluators(
    evaluators: Sequence[Evaluator],
    context: EvaluatorContext,
) -> tuple[list[EvaluationResult], list[EvaluatorFailure]]
```

Run evaluators on a context and return results.

Useful for re-running evaluators from stored data.

#### Returns

[`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[`EvaluationResult`\], [`list`](https://docs.python.org/3/glossary.html#term-list)\[`EvaluatorFailure`\]\] -- A tuple of (results, failures).

#### Parameters

**`evaluators`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`Evaluator`\]

The evaluators to run.

**`context`** : `EvaluatorContext`

The evaluator context to evaluate against.

### evaluate

```python
def evaluate(
    evaluators: Evaluator | OnlineEvaluator = (),
    target: str | None = None,
    msg_template: LiteralString | None = None,
    span_name: str | None = None,
    extract_args: bool | Iterable[str] = False,
    record_return: bool = False,
) -> Callable[[Callable[_P, _R]], Callable[_P, _R]]
```

Decorator to attach online evaluators to a function using the global default config.

Equivalent to `DEFAULT_CONFIG.evaluate(...)`.

Example:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext
from pydantic_evals.online import evaluate


@dataclass
class IsNonEmpty(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return bool(ctx.output)


@evaluate(IsNonEmpty())
async def my_function(x: int) -> int:
    return x
```

#### Returns

[`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[`_P`, `_R`\]\], [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[`_P`, `_R`\]\] -- A decorator that wraps the function with online evaluation.

#### Parameters

**`*evaluators`** : `Evaluator` | `OnlineEvaluator` _Default:_ `()`

Evaluators to attach. Can be `Evaluator` or `OnlineEvaluator` instances.

**`target`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Name of the thing being evaluated. Written to sinks and emitted OTel events as `gen_ai.evaluation.target`. Defaults to the decorated function's `__name__` when omitted.

**`msg_template`** : [`LiteralString`](https://docs.python.org/3/library/typing.html#typing.LiteralString) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Template for the call span's message. Defaults to `"Calling {module}.{qualname}"` like `@logfire.instrument`.

**`span_name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Override for the call span's name. Defaults to `msg_template`.

**`extract_args`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`Iterable`](https://docs.python.org/3/library/typing.html#typing.Iterable)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] _Default:_ `False`

Whether to record function arguments as span attributes. `False` (default) records nothing; `True` records all bound arguments; an iterable of names records only those arguments. Requires logfire to be installed -- raises `RuntimeError` at decoration time otherwise.

**`record_return`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether to record the function's return value as a `return` span attribute. Requires logfire for the same reason as `extract_args`.

### configure

```python
def configure(
    default_sink: EvaluationSink | Sequence[EvaluationSink | SinkCallback] | SinkCallback | None | Unset = UNSET,
    default_sample_rate: float | Callable[[SamplingContext], float | bool] | Unset = UNSET,
    sampling_mode: SamplingMode | Unset = UNSET,
    enabled: bool | Unset = UNSET,
    metadata: dict[str, Any] | None | Unset = UNSET,
    on_max_concurrency: OnMaxConcurrencyCallback | None | Unset = UNSET,
    on_sampling_error: OnSamplingErrorCallback | None | Unset = UNSET,
    on_error: OnErrorCallback | None | Unset = UNSET,
    emit_otel_events: bool | Unset = UNSET,
    include_baggage: bool | Unset = UNSET,
) -> None
```

Configure the global default `OnlineEvalConfig`.

Only provided values are updated; unset arguments are ignored. Pass `None` explicitly to clear `default_sink`, `metadata`, `on_max_concurrency`, `on_sampling_error`, or `on_error`.

#### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

#### Parameters

**`default_sink`** : `EvaluationSink` | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`EvaluationSink` | `SinkCallback`\] | `SinkCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) | `Unset` _Default:_ `UNSET`

Default sink(s) for evaluators. Pass `None` to clear.

**`default_sample_rate`** : [`float`](https://docs.python.org/3/library/functions.html#float) | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`SamplingContext`\], [`float`](https://docs.python.org/3/library/functions.html#float) | [`bool`](https://docs.python.org/3/library/functions.html#bool)\] | `Unset` _Default:_ `UNSET`

Default sample rate for evaluators.

**`sampling_mode`** : `SamplingMode` | `Unset` _Default:_ `UNSET`

Sampling mode (`'independent'` or `'correlated'`).

**`enabled`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | `Unset` _Default:_ `UNSET`

Whether online evaluation is enabled.

**`metadata`** : [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) | `Unset` _Default:_ `UNSET`

Metadata to include in evaluator contexts. Pass `None` to clear.

**`on_max_concurrency`** : `OnMaxConcurrencyCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) | `Unset` _Default:_ `UNSET`

Default handler for dropped evaluations. Pass `None` to clear.

**`on_sampling_error`** : `OnSamplingErrorCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) | `Unset` _Default:_ `UNSET`

Default handler for sample\_rate exceptions. Pass `None` to clear.

**`on_error`** : `OnErrorCallback` | [`None`](https://docs.python.org/3/library/constants.html#None) | `Unset` _Default:_ `UNSET`

Default handler for pipeline exceptions. Pass `None` to clear.

**`emit_otel_events`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | `Unset` _Default:_ `UNSET`

Whether to emit `gen_ai.evaluation.result` OTel events.

**`include_baggage`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) | `Unset` _Default:_ `UNSET`

Whether to copy current OTel baggage onto every emitted event.

### wait\_for\_evaluations

`@async`

```python
def wait_for_evaluations(timeout: float = 30.0) -> None
```

Wait for all pending background evaluation tasks and threads to complete.

This is useful in tests to deterministically wait for background evaluators to finish instead of relying on timing-based sleeps.

For async decorated functions, evaluators run as tasks on the caller's event loop and are awaited directly. For sync decorated functions, evaluators run in background threads which are joined with the given timeout.

#### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

#### Parameters

**`timeout`** : [`float`](https://docs.python.org/3/library/functions.html#float) _Default:_ `30.0`

Maximum seconds to wait for each background thread. Defaults to 30.

### OnErrorLocation

The location within the online evaluation pipeline where an error occurred.

-   `'sink'` -- something went wrong delivering results downstream. This is most often an exception raised by a registered [`EvaluationSink.submit`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.EvaluationSink.submit), but it's also used as a catch-all for failures in the default OTel event emission path (which is rare in practice; the OTel SDK rarely raises during `emit()`).
-   `'on_max_concurrency'` -- the evaluator's `on_max_concurrency` callback itself raised while being notified about a dropped evaluation.

**Default:** `Literal['sink', 'on_max_concurrency']`

### SamplingMode

Controls how per-evaluator sample rates interact across evaluators for a single call.

-   `'independent'` (default): Each evaluator flips its own coin. With N evaluators each at rate _r_, the probability of _any_ evaluation overhead is `1 − (1−r)^N`.
-   `'correlated'`: A single random seed is generated per call and shared across evaluators. An evaluator runs when `call_seed < rate`, so lower-rate evaluators' calls are always a subset of higher-rate ones. The probability of _any_ overhead equals `max(rate_i)`.

**Default:** `Literal['independent', 'correlated']`

### OnMaxConcurrencyCallback

Callback invoked when an evaluation is dropped due to concurrency limits.

Receives the `EvaluatorContext` that would have been evaluated. Can be sync or async.

**Default:** `Callable[[EvaluatorContext], None | Awaitable[None]]`

### OnSamplingErrorCallback

Callback invoked when a `sample_rate` callable raises an exception.

Called synchronously before the decorated function runs. Receives the exception and the evaluator whose `sample_rate` failed. Must be sync (not async). If set, the evaluator is skipped. If not set, the exception propagates to the caller.

**Default:** `Callable[[Exception, Evaluator], None]`

### OnErrorCallback

Callback invoked when an exception occurs in the online evaluation pipeline.

Receives the exception, the evaluator context, the evaluator instance, and a location string indicating where the error occurred. Can be sync or async.

**Default:** `Callable[[Exception, EvaluatorContext, Evaluator, OnErrorLocation], None | Awaitable[None]]`

### SinkCallback

Type alias for bare callables accepted wherever an `EvaluationSink` is expected.

Auto-wrapped in `CallbackSink` when passed as a `sink` parameter.

**Default:** `Callable[[Sequence[EvaluationResult], Sequence[EvaluatorFailure], EvaluatorContext], None | Awaitable[None]]`

### DEFAULT\_CONFIG

The global default `OnlineEvalConfig` instance.

Module-level functions like `evaluate()` and `configure()` delegate to this instance.

**Default:** `OnlineEvalConfig()`

---

# [pydantic_evals.online_capability](https://pydantic.dev/docs/ai/api/pydantic_evals/online_capability/)

# pydantic\_evals.online\_capability

Online evaluation capability for pydantic-ai agents.

Provides an `OnlineEvaluation` capability that attaches evaluators to agent runs, dispatching them asynchronously in the background after each run completes.

### OnlineEvaluation

**Bases:** `AbstractCapability[AgentDepsT]`

Capability that runs online evaluators on agent run results.

Dispatches evaluators asynchronously in the background after each completed agent run. Non-blocking -- the agent run returns without waiting for evaluators to finish.

Note

[`OnlineEvaluation`](/docs/ai/api/pydantic_evals/online_capability/#pydantic_evals.online_capability.OnlineEvaluation) wraps `agent.run()`, `agent.run_stream()`, and [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter) when the run reaches a final result. For streaming runs, evaluators are dispatched only after the final result is available and the surrounding context manager exits.

Example:

```python
from dataclasses import dataclass

from pydantic_ai import Agent
from pydantic_evals.evaluators import Evaluator, EvaluatorContext
from pydantic_evals.online_capability import OnlineEvaluation


@dataclass
class OutputNotEmpty(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return bool(ctx.output)


agent = Agent(
    'openai:gpt-5.2',
    name='assistant',
    capabilities=[OnlineEvaluation(evaluators=[OutputNotEmpty()])],
)
```

#### Attributes

##### evaluators

Evaluators to run after each agent run.

**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`Evaluator` | `OnlineEvaluator`\]

##### config

Optional config override. Defaults to the global `DEFAULT_CONFIG`.

**Type:** `OnlineEvalConfig` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

---

# [pydantic_evals.otel](https://pydantic.dev/docs/ai/api/pydantic_evals/otel/)

# pydantic\_evals.otel

### SpanQuery

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

A serializable query for filtering SpanNodes based on various conditions.

All fields are optional and combined with AND logic by default.

#### Attributes

##### stop\_recursing\_when

If present, stop recursing through ancestors or descendants at nodes that match this condition.

**Type:** `SpanQuery`

### SpanNode

A node in the span tree; provides references to parents/children for easy traversal and queries.

#### Attributes

##### duration

Return the span's duration as a timedelta, or None if start/end not set.

**Type:** `timedelta`

##### descendants

Return all descendants of this node in DFS order.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`SpanNode`\]

##### ancestors

Return all ancestors of this node.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`SpanNode`\]

#### Methods

##### add\_child

```python
def add_child(child: SpanNode) -> None
```

Attach a child node to this node's list of children.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

##### find\_children

```python
def find_children(predicate: SpanQuery | SpanPredicate) -> list[SpanNode]
```

Return all immediate children that satisfy the given predicate.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`SpanNode`\]

##### first\_child

```python
def first_child(predicate: SpanQuery | SpanPredicate) -> SpanNode | None
```

Return the first immediate child that satisfies the given predicate, or None if none match.

###### Returns

`SpanNode` | [`None`](https://docs.python.org/3/library/constants.html#None)

##### any\_child

```python
def any_child(predicate: SpanQuery | SpanPredicate) -> bool
```

Returns True if there is at least one child that satisfies the predicate.

###### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool)

##### find\_descendants

```python
def find_descendants(
    predicate: SpanQuery | SpanPredicate,
    stop_recursing_when: SpanQuery | SpanPredicate | None = None,
) -> list[SpanNode]
```

Return all descendant nodes that satisfy the given predicate in DFS order.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`SpanNode`\]

##### first\_descendant

```python
def first_descendant(
    predicate: SpanQuery | SpanPredicate,
    stop_recursing_when: SpanQuery | SpanPredicate | None = None,
) -> SpanNode | None
```

DFS: Return the first descendant (in DFS order) that satisfies the given predicate, or `None` if none match.

###### Returns

`SpanNode` | [`None`](https://docs.python.org/3/library/constants.html#None)

##### any\_descendant

```python
def any_descendant(
    predicate: SpanQuery | SpanPredicate,
    stop_recursing_when: SpanQuery | SpanPredicate | None = None,
) -> bool
```

Returns `True` if there is at least one descendant that satisfies the predicate.

###### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool)

##### find\_ancestors

```python
def find_ancestors(
    predicate: SpanQuery | SpanPredicate,
    stop_recursing_when: SpanQuery | SpanPredicate | None = None,
) -> list[SpanNode]
```

Return all ancestors that satisfy the given predicate.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`SpanNode`\]

##### first\_ancestor

```python
def first_ancestor(
    predicate: SpanQuery | SpanPredicate,
    stop_recursing_when: SpanQuery | SpanPredicate | None = None,
) -> SpanNode | None
```

Return the closest ancestor that satisfies the given predicate, or `None` if none match.

###### Returns

`SpanNode` | [`None`](https://docs.python.org/3/library/constants.html#None)

##### any\_ancestor

```python
def any_ancestor(
    predicate: SpanQuery | SpanPredicate,
    stop_recursing_when: SpanQuery | SpanPredicate | None = None,
) -> bool
```

Returns True if any ancestor satisfies the predicate.

###### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool)

##### matches

```python
def matches(query: SpanQuery | SpanPredicate) -> bool
```

Check if the span node matches the query conditions or predicate.

###### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool)

##### repr\_xml

```python
def repr_xml(
    include_children: bool = True,
    include_trace_id: bool = False,
    include_span_id: bool = False,
    include_start_timestamp: bool = False,
    include_duration: bool = False,
) -> str
```

Return an XML-like string representation of the node.

Optionally includes children, trace\_id, span\_id, start\_timestamp, and duration.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str)

### SpanTree

A container that builds a hierarchy of SpanNode objects from a list of finished spans.

You can then search or iterate the tree to make your assertions (using DFS for traversal).

#### Methods

##### add\_spans

```python
def add_spans(spans: list[SpanNode]) -> None
```

Add a list of spans to the tree, rebuilding the tree structure.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

##### find

```python
def find(predicate: SpanQuery | SpanPredicate) -> list[SpanNode]
```

Find all nodes in the entire tree that match the predicate, scanning from each root in DFS order.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`SpanNode`\]

##### first

```python
def first(predicate: SpanQuery | SpanPredicate) -> SpanNode | None
```

Find the first node that matches a predicate, scanning from each root in DFS order. Returns `None` if not found.

###### Returns

`SpanNode` | [`None`](https://docs.python.org/3/library/constants.html#None)

##### any

```python
def any(predicate: SpanQuery | SpanPredicate) -> bool
```

Returns True if any node in the tree matches the predicate.

###### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool)

##### \_\_iter\_\_

```python
def __iter__() -> Iterator[SpanNode]
```

Return an iterator over all nodes in the tree.

###### Returns

[`Iterator`](https://docs.python.org/3/library/typing.html#typing.Iterator)\[`SpanNode`\]

##### repr\_xml

```python
def repr_xml(
    include_children: bool = True,
    include_trace_id: bool = False,
    include_span_id: bool = False,
    include_start_timestamp: bool = False,
    include_duration: bool = False,
) -> str
```

Return an XML-like string representation of the tree, optionally including children, trace\_id, span\_id, duration, and timestamps.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str)

---

# [pydantic_evals.reporting](https://pydantic.dev/docs/ai/api/pydantic_evals/reporting/)

# pydantic\_evals.reporting

### ConfusionMatrix

**Bases:** `BaseModel`

A confusion matrix comparing expected vs predicted labels across cases.

#### Attributes

##### class\_labels

Ordered list of class labels (used for both axes).

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### matrix

matrixexpected\_idx = count of cases.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`int`](https://docs.python.org/3/library/functions.html#int)\]\]

### PrecisionRecallPoint

**Bases:** `BaseModel`

A single point on a precision-recall curve.

### PrecisionRecallCurve

**Bases:** `BaseModel`

A single precision-recall curve.

#### Attributes

##### name

Name of this curve (e.g., experiment name or evaluator name).

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### points

Points on the curve, ordered by threshold.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`PrecisionRecallPoint`\]

##### auc

Area under the precision-recall curve.

**Type:** [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

### PrecisionRecall

**Bases:** `BaseModel`

Precision-recall curve data across cases.

#### Attributes

##### curves

One or more curves.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`PrecisionRecallCurve`\]

### ScalarResult

**Bases:** `BaseModel`

A single scalar statistic (e.g., F1 score, accuracy, BLEU).

#### Attributes

##### unit

Optional unit label (e.g., '%', 'ms').

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

### ReportCase

**Bases:** `Generic[InputsT, OutputT, MetadataT]`

A single case in an evaluation report.

#### Attributes

##### name

The name of the [case](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case).

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### inputs

The inputs to the task, from [`Case.inputs`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case.inputs).

**Type:** `InputsT`

##### metadata

Any metadata associated with the case, from [`Case.metadata`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case.metadata).

**Type:** `MetadataT` | [`None`](https://docs.python.org/3/library/constants.html#None)

##### expected\_output

The expected output of the task, from [`Case.expected_output`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case.expected_output).

**Type:** `OutputT` | [`None`](https://docs.python.org/3/library/constants.html#None)

##### output

The output of the task execution.

**Type:** `OutputT`

##### source\_case\_name

The original case name before run-indexing. Serves as the aggregation key for multi-run experiments. None when repeat == 1.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### trace\_id

The trace ID of the case span.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### span\_id

The span ID of the case span.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

### TableResult

**Bases:** `BaseModel`

A generic table of data (fallback for custom analyses).

#### Attributes

##### columns

Column headers.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### rows

Row data, one list per row.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`int`](https://docs.python.org/3/library/functions.html#int) | [`float`](https://docs.python.org/3/library/functions.html#float) | [`bool`](https://docs.python.org/3/library/functions.html#bool) | [`None`](https://docs.python.org/3/library/constants.html#None)\]\]

### LinePlot

**Bases:** `BaseModel`

A generic XY line plot with labeled axes, supporting multiple curves.

Use this for ROC curves, KS plots, calibration curves, or any custom line chart that doesn't fit the specific PrecisionRecall type.

#### Attributes

##### x\_label

Label for the x-axis.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### y\_label

Label for the y-axis.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### x\_range

Optional fixed range for x-axis (min, max).

**Type:** [`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[[`float`](https://docs.python.org/3/library/functions.html#float), [`float`](https://docs.python.org/3/library/functions.html#float)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### y\_range

Optional fixed range for y-axis (min, max).

**Type:** [`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple)\[[`float`](https://docs.python.org/3/library/functions.html#float), [`float`](https://docs.python.org/3/library/functions.html#float)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### curves

One or more curves to plot.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`LinePlotCurve`\]

### ReportCaseFailure

**Bases:** `Generic[InputsT, OutputT, MetadataT]`

A single case in an evaluation report that failed due to an error during task execution.

#### Attributes

##### name

The name of the [case](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case).

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### inputs

The inputs to the task, from [`Case.inputs`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case.inputs).

**Type:** `InputsT`

##### metadata

Any metadata associated with the case, from [`Case.metadata`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case.metadata).

**Type:** `MetadataT` | [`None`](https://docs.python.org/3/library/constants.html#None)

##### expected\_output

The expected output of the task, from [`Case.expected_output`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case.expected_output).

**Type:** `OutputT` | [`None`](https://docs.python.org/3/library/constants.html#None)

##### error\_message

The message of the exception that caused the failure.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### error\_stacktrace

The stacktrace of the exception that caused the failure.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### source\_case\_name

The original case name before run-indexing. Serves as the aggregation key for multi-run experiments. None when repeat == 1.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### trace\_id

The trace ID of the case span.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### span\_id

The span ID of the case span.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

### ReportCaseGroup

**Bases:** `Generic[InputsT, OutputT, MetadataT]`

Grouped results from running the same case multiple times.

This is a computed view, not stored data. Obtain via `EvaluationReport.case_groups()`.

#### Attributes

##### name

The original case name (shared across all runs).

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### inputs

The inputs (same for all runs).

**Type:** `InputsT`

##### metadata

The metadata (same for all runs).

**Type:** `MetadataT` | [`None`](https://docs.python.org/3/library/constants.html#None)

##### expected\_output

The expected output (same for all runs).

**Type:** `OutputT` | [`None`](https://docs.python.org/3/library/constants.html#None)

##### runs

Individual run results.

**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`ReportCase`\[`InputsT`, `OutputT`, `MetadataT`\]\]

##### failures

Runs that failed with exceptions.

**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`ReportCaseFailure`\[`InputsT`, `OutputT`, `MetadataT`\]\]

##### summary

Aggregated statistics across runs.

**Type:** `ReportCaseAggregate`

### ReportCaseAggregate

**Bases:** `BaseModel`

A synthetic case that summarizes a set of cases.

#### Methods

##### average

`@staticmethod`

```python
def average(cases: list[ReportCase]) -> ReportCaseAggregate
```

Produce a synthetic "summary" case by averaging quantitative attributes.

###### Returns

`ReportCaseAggregate`

##### average\_from\_aggregates

`@staticmethod`

```python
def average_from_aggregates(
    aggregates: list[ReportCaseAggregate],
) -> ReportCaseAggregate
```

Average across multiple aggregates (used for multi-run experiment summaries).

###### Returns

`ReportCaseAggregate`

### EvaluationReport

**Bases:** `Generic[InputsT, OutputT, MetadataT]`

A report of the results of evaluating a model on a set of cases.

#### Attributes

##### name

The name of the report.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### cases

The cases in the report.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`ReportCase`\[`InputsT`, `OutputT`, `MetadataT`\]\]

##### failures

The failures in the report. These are cases where task execution raised an exception.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`ReportCaseFailure`\[`InputsT`, `OutputT`, `MetadataT`\]\] **Default:** `field(default_factory=(list[ReportCaseFailure[InputsT, OutputT, MetadataT]]))`

##### analyses

Experiment-wide analyses produced by report evaluators.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`ReportAnalysis`\] **Default:** `field(default_factory=(list[ReportAnalysis]))`

##### report\_evaluator\_failures

Failures from report evaluators that raised exceptions.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`EvaluatorFailure`\] **Default:** `field(default_factory=(list[EvaluatorFailure]))`

##### experiment\_metadata

Metadata associated with the specific experiment represented by this report.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### trace\_id

The trace ID of the evaluation.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### span\_id

The span ID of the evaluation.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

#### Methods

##### case\_groups

```python
def case_groups() -> list[ReportCaseGroup[InputsT, OutputT, MetadataT]] | None
```

Group cases by source\_case\_name and compute per-group aggregates.

Returns None if no cases have source\_case\_name set (i.e., single-run experiment).

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`ReportCaseGroup`\[`InputsT`, `OutputT`, `MetadataT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### render

```python
def render(
    width: int | None = None,
    baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None,
    include_input: bool = False,
    include_metadata: bool = False,
    include_expected_output: bool = False,
    include_output: bool = False,
    include_durations: bool = True,
    include_total_duration: bool = False,
    include_removed_cases: bool = False,
    include_averages: bool = True,
    include_errors: bool = True,
    include_error_stacktrace: bool = False,
    include_evaluator_failures: bool = True,
    include_analyses: bool = True,
    input_config: RenderValueConfig | None = None,
    metadata_config: RenderValueConfig | None = None,
    output_config: RenderValueConfig | None = None,
    score_configs: dict[str, RenderNumberConfig] | None = None,
    label_configs: dict[str, RenderValueConfig] | None = None,
    metric_configs: dict[str, RenderNumberConfig] | None = None,
    duration_config: RenderNumberConfig | None = None,
    include_reasons: bool = False,
) -> str
```

Render this report to a nicely-formatted string, optionally comparing it to a baseline report.

If you want more control over the output, use `console_table` instead and pass it to `rich.Console.print`.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### print

```python
def print(
    width: int | None = None,
    baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None,
    console: Console | None = None,
    include_input: bool = False,
    include_metadata: bool = False,
    include_expected_output: bool = False,
    include_output: bool = False,
    include_durations: bool = True,
    include_total_duration: bool = False,
    include_removed_cases: bool = False,
    include_averages: bool = True,
    include_errors: bool = True,
    include_error_stacktrace: bool = False,
    include_evaluator_failures: bool = True,
    include_analyses: bool = True,
    input_config: RenderValueConfig | None = None,
    metadata_config: RenderValueConfig | None = None,
    output_config: RenderValueConfig | None = None,
    score_configs: dict[str, RenderNumberConfig] | None = None,
    label_configs: dict[str, RenderValueConfig] | None = None,
    metric_configs: dict[str, RenderNumberConfig] | None = None,
    duration_config: RenderNumberConfig | None = None,
    include_reasons: bool = False,
) -> None
```

Print this report to the console, optionally comparing it to a baseline report.

If you want more control over the output, use `console_table` instead and pass it to `rich.Console.print`.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

##### console\_table

```python
def console_table(
    baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None,
    include_input: bool = False,
    include_metadata: bool = False,
    include_expected_output: bool = False,
    include_output: bool = False,
    include_durations: bool = True,
    include_total_duration: bool = False,
    include_removed_cases: bool = False,
    include_averages: bool = True,
    include_evaluator_failures: bool = True,
    input_config: RenderValueConfig | None = None,
    metadata_config: RenderValueConfig | None = None,
    output_config: RenderValueConfig | None = None,
    score_configs: dict[str, RenderNumberConfig] | None = None,
    label_configs: dict[str, RenderValueConfig] | None = None,
    metric_configs: dict[str, RenderNumberConfig] | None = None,
    duration_config: RenderNumberConfig | None = None,
    include_reasons: bool = False,
    with_title: bool = True,
) -> Table
```

Return a table containing the data from this report.

If a baseline is provided, returns a diff between this report and the baseline report. Optionally include input and output details.

###### Returns

`Table`

##### failures\_table

```python
def failures_table(
    include_input: bool = False,
    include_metadata: bool = False,
    include_expected_output: bool = False,
    include_error_message: bool = True,
    include_error_stacktrace: bool = True,
    input_config: RenderValueConfig | None = None,
    metadata_config: RenderValueConfig | None = None,
) -> Table
```

Return a table containing the failures in this report.

###### Returns

`Table`

##### \_\_str\_\_

```python
def __str__() -> str
```

Return a string representation of the report.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str)

### RenderValueConfig

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

A configuration for rendering a values in an Evaluation report.

### RenderNumberConfig

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

A configuration for rendering a particular score or metric in an Evaluation report.

See the implementation of `_RenderNumber` for more clarity on how these parameters affect the rendering.

#### Attributes

##### value\_formatter

The logic to use for formatting values.

-   If not provided, format as ints if all values are ints, otherwise at least one decimal place and at least four significant figures.
-   You can also use a custom string format spec, e.g. '{:.3f}'
-   You can also use a custom function, e.g. lambda x: f'{x:.3f}'

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`float`](https://docs.python.org/3/library/functions.html#float) | [`int`](https://docs.python.org/3/library/functions.html#int)\], [`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### diff\_formatter

The logic to use for formatting details about the diff.

The strings produced by the value\_formatter will always be included in the reports, but the diff\_formatter is used to produce additional text about the difference between the old and new values, such as the absolute or relative difference.

-   If not provided, format as ints if all values are ints, otherwise at least one decimal place and at least four significant figures, and will include the percentage change.
-   You can also use a custom string format spec, e.g. '{:+.3f}'
-   You can also use a custom function, e.g. lambda x: f'{x:+.3f}'. If this function returns None, no extra diff text will be added.
-   You can also use None to never generate extra diff text.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`float`](https://docs.python.org/3/library/functions.html#float) | [`int`](https://docs.python.org/3/library/functions.html#int), [`float`](https://docs.python.org/3/library/functions.html#float) | [`int`](https://docs.python.org/3/library/functions.html#int)\], [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### diff\_atol

The absolute tolerance for considering a difference "significant".

A difference is "significant" if `abs(new - old) < self.diff_atol + self.diff_rtol * abs(old)`.

If a difference is not significant, it will not have the diff styles applied. Note that we still show both the rendered before and after values in the diff any time they differ, even if the difference is not significant. (If the rendered values are exactly the same, we only show the value once.)

If not provided, use 1e-6.

**Type:** [`float`](https://docs.python.org/3/library/functions.html#float)

##### diff\_rtol

The relative tolerance for considering a difference "significant".

See the description of `diff_atol` for more details about what makes a difference "significant".

If not provided, use 0.001 if all values are ints, otherwise 0.05.

**Type:** [`float`](https://docs.python.org/3/library/functions.html#float)

##### diff\_increase\_style

The style to apply to diffed values that have a significant increase.

See the description of `diff_atol` for more details about what makes a difference "significant".

If not provided, use green for scores and red for metrics. You can also use arbitrary `rich` styles, such as "bold red".

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### diff\_decrease\_style

The style to apply to diffed values that have significant decrease.

See the description of `diff_atol` for more details about what makes a difference "significant".

If not provided, use red for scores and green for metrics. You can also use arbitrary `rich` styles, such as "bold red".

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

### ReportCaseRenderer

#### Methods

##### build\_base\_table

```python
def build_base_table(title: str) -> Table
```

Build and return a Rich Table for the diff output.

###### Returns

`Table`

##### build\_failures\_table

```python
def build_failures_table(title: str) -> Table
```

Build and return a Rich Table for the failures output.

###### Returns

`Table`

##### build\_row

```python
def build_row(case: ReportCase) -> list[str]
```

Build a table row for a single case.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### build\_aggregate\_row

```python
def build_aggregate_row(aggregate: ReportCaseAggregate) -> list[str]
```

Build a table row for an aggregated case.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### build\_diff\_row

```python
def build_diff_row(new_case: ReportCase, baseline: ReportCase) -> list[str]
```

Build a table row for a given case ID.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### build\_diff\_aggregate\_row

```python
def build_diff_aggregate_row(
    new: ReportCaseAggregate,
    baseline: ReportCaseAggregate,
) -> list[str]
```

Build a table row for a given case ID.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### build\_failure\_row

```python
def build_failure_row(case: ReportCaseFailure) -> list[str]
```

Build a table row for a single case failure.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

### EvaluationRenderer

A class for rendering an EvalReport or the diff between two EvalReports.

#### Methods

##### build\_table

```python
def build_table(report: EvaluationReport, with_title: bool = True) -> Table
```

Build a table for the report.

###### Returns

`Table` -- A Rich Table object

###### Parameters

**`report`** : `EvaluationReport`

The evaluation report to render

**`with_title`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to include the title in the table (default True)

##### build\_diff\_table

```python
def build_diff_table(
    report: EvaluationReport,
    baseline: EvaluationReport,
    with_title: bool = True,
) -> Table
```

Build a diff table comparing report to baseline.

###### Returns

`Table` -- A Rich Table object

###### Parameters

**`report`** : `EvaluationReport`

The evaluation report to compare

**`baseline`** : `EvaluationReport`

The baseline report to compare against

**`with_title`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to include the title in the table (default True)

### ReportAnalysis

Discriminated union of all report-level analysis types.

**Default:** `Annotated[ConfusionMatrix | PrecisionRecall | ScalarResult | TableResult | LinePlot, Discriminator('type')]`

---

# [pydantic_graph.beta](https://pydantic.dev/docs/ai/api/pydantic_graph/beta/)

# pydantic\_graph.beta

The next version of the pydantic-graph framework with enhanced graph execution capabilities.

This module provides a parallel control flow graph execution framework with support for:

-   'Step' nodes for task execution
-   'Decision' nodes for conditional branching
-   'Fork' nodes for parallel execution coordination
-   'Join' nodes and 'Reducer's for re-joining parallel executions
-   Mermaid diagram generation for graph visualization

### TypeExpression

**Bases:** `Generic[T]`

A workaround for type checker limitations when using complex type expressions.

This class serves as a wrapper for types that cannot normally be used in positions requiring `type[T]`, such as `Any`, `Union[...]`, or `Literal[...]`. It provides a way to pass these complex type expressions to functions expecting concrete types.

### StepContext

**Bases:** `Generic[StateT, DepsT, InputT]`

Context information passed to step functions during graph execution.

The step context provides access to the current graph state, dependencies, and input data for a step.

#### Attributes

##### inputs

The input data for this step.

This must be a property to ensure correct variance behavior

**Type:** `InputT`

### StartNode

**Bases:** `Generic[OutputT]`

Entry point node for graph execution.

The StartNode represents the beginning of a graph execution flow.

#### Attributes

##### id

Fixed identifier for the start node.

**Default:** `NodeID('__start__')`

### EndNode

**Bases:** `Generic[InputT]`

Terminal node representing the completion of graph execution.

The EndNode marks the successful completion of a graph execution flow and can collect the final output data.

#### Attributes

##### id

Fixed identifier for the end node.

**Default:** `NodeID('__end__')`

### GraphBuilder

**Bases:** `Generic[StateT, DepsT, GraphInputT, GraphOutputT]`

A builder for constructing executable graph definitions.

GraphBuilder provides a fluent interface for defining nodes, edges, and routing in a graph workflow. It supports typed state, dependencies, and input/output validation.

#### Attributes

##### name

Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `name`

##### state\_type

The type of the graph state.

**Type:** `TypeOrTypeExpression`\[`StateT`\] **Default:** `state_type`

##### deps\_type

The type of the dependencies.

**Type:** `TypeOrTypeExpression`\[`DepsT`\] **Default:** `deps_type`

##### input\_type

The type of the graph input data.

**Type:** `TypeOrTypeExpression`\[`GraphInputT`\] **Default:** `input_type`

##### output\_type

The type of the graph output data.

**Type:** `TypeOrTypeExpression`\[`GraphOutputT`\] **Default:** `output_type`

##### auto\_instrument

Whether to automatically create instrumentation spans.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `auto_instrument`

##### start\_node

Get the start node for the graph.

**Type:** `StartNode`\[`GraphInputT`\]

##### end\_node

Get the end node for the graph.

**Type:** `EndNode`\[`GraphOutputT`\]

#### Methods

##### \_\_init\_\_

```python
def __init__(
    name: str | None = None,
    state_type: TypeOrTypeExpression[StateT] = NoneType,
    deps_type: TypeOrTypeExpression[DepsT] = NoneType,
    input_type: TypeOrTypeExpression[GraphInputT] = NoneType,
    output_type: TypeOrTypeExpression[GraphOutputT] = NoneType,
    auto_instrument: bool = True,
)
```

Initialize a graph builder.

###### Parameters

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method.

**`state_type`** : `TypeOrTypeExpression`\[`StateT`\] _Default:_ `NoneType`

The type of the graph state

**`deps_type`** : `TypeOrTypeExpression`\[`DepsT`\] _Default:_ `NoneType`

The type of the dependencies

**`input_type`** : `TypeOrTypeExpression`\[`GraphInputT`\] _Default:_ `NoneType`

The type of the graph input data

**`output_type`** : `TypeOrTypeExpression`\[`GraphOutputT`\] _Default:_ `NoneType`

The type of the graph output data

**`auto_instrument`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to automatically create instrumentation spans

##### step

```python
def step(
    node_id: str | None = None,
    label: str | None = None,
) -> Callable[[StepFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, OutputT]]
def step(
    call: StepFunction[StateT, DepsT, InputT, OutputT],
    node_id: str | None = None,
    label: str | None = None,
) -> Step[StateT, DepsT, InputT, OutputT]
```

Create a step from a step function.

This method can be used as a decorator or called directly to create a step node from an async function.

###### Returns

`Step`\[`StateT`, `DepsT`, `InputT`, `OutputT`\] | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`StepFunction`\[`StateT`, `DepsT`, `InputT`, `OutputT`\]\], `Step`\[`StateT`, `DepsT`, `InputT`, `OutputT`\]\] -- Either a Step instance or a decorator function

###### Parameters

**`call`** : `StepFunction`\[`StateT`, `DepsT`, `InputT`, `OutputT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The step function to wrap

**`node_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional ID for the node

**`label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional human-readable label

##### stream

```python
def stream(
    node_id: str | None = None,
    label: str | None = None,
) -> Callable[[StreamFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, AsyncIterable[OutputT]]]
def stream(
    call: StreamFunction[StateT, DepsT, InputT, OutputT],
    node_id: str | None = None,
    label: str | None = None,
) -> Step[StateT, DepsT, InputT, AsyncIterable[OutputT]]
def stream(
    call: StreamFunction[StateT, DepsT, InputT, OutputT] | None = None,
    node_id: str | None = None,
    label: str | None = None,
) -> Step[StateT, DepsT, InputT, AsyncIterable[OutputT]] | Callable[[StreamFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, AsyncIterable[OutputT]]]
```

Create a step from an async iterator (which functions like a "stream").

This method can be used as a decorator or called directly to create a step node from an async function.

###### Returns

`Step`\[`StateT`, `DepsT`, `InputT`, [`AsyncIterable`](https://docs.python.org/3/library/typing.html#typing.AsyncIterable)\[`OutputT`\]\] | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`StreamFunction`\[`StateT`, `DepsT`, `InputT`, `OutputT`\]\], `Step`\[`StateT`, `DepsT`, `InputT`, [`AsyncIterable`](https://docs.python.org/3/library/typing.html#typing.AsyncIterable)\[`OutputT`\]\]\] -- Either a Step instance or a decorator function

###### Parameters

**`call`** : `StreamFunction`\[`StateT`, `DepsT`, `InputT`, `OutputT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The step function to wrap

**`node_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional ID for the node

**`label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional human-readable label

##### add

```python
def add(edges: EdgePath[StateT, DepsT] = ()) -> None
```

Add one or more edge paths to the graph.

This method processes edge paths and automatically creates any necessary fork nodes for broadcasts and maps.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`*edges`** : `EdgePath`\[`StateT`, `DepsT`\] _Default:_ `()`

The edge paths to add to the graph

##### add\_edge

```python
def add_edge(
    source: Source[T],
    destination: Destination[T],
    label: str | None = None,
) -> None
```

Add a simple edge between two nodes.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`source`** : `Source`\[`T`\]

The source node

**`destination`** : `Destination`\[`T`\]

The destination node

**`label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional label for the edge

##### add\_mapping\_edge

```python
def add_mapping_edge(
    source: Source[Iterable[T]],
    map_to: Destination[T],
    pre_map_label: str | None = None,
    post_map_label: str | None = None,
    fork_id: ForkID | None = None,
    downstream_join_id: JoinID | None = None,
) -> None
```

Add an edge that maps iterable data across parallel paths.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`source`** : `Source`\[[`Iterable`](https://docs.python.org/3/library/typing.html#typing.Iterable)\[`T`\]\]

The source node that produces iterable data

**`map_to`** : `Destination`\[`T`\]

The destination node that receives individual items

**`pre_map_label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional label before the map operation

**`post_map_label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional label after the map operation

**`fork_id`** : `ForkID` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional ID for the fork node produced for this map operation

**`downstream_join_id`** : `JoinID` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional ID of a join node that will always be downstream of this map. Specifying this ensures correct handling if you try to map an empty iterable.

##### edge\_from

```python
def edge_from(
    sources: Source[SourceOutputT] = (),
) -> EdgePathBuilder[StateT, DepsT, SourceOutputT]
```

Create an edge path builder starting from the given source nodes.

###### Returns

`EdgePathBuilder`\[`StateT`, `DepsT`, `SourceOutputT`\] -- An EdgePathBuilder for constructing the complete edge path

###### Parameters

**`*sources`** : `Source`\[`SourceOutputT`\] _Default:_ `()`

The source nodes to start the edge path from

##### decision

```python
def decision(
    note: str | None = None,
    node_id: str | None = None,
) -> Decision[StateT, DepsT, Never]
```

Create a new decision node.

###### Returns

`Decision`\[`StateT`, `DepsT`, [`Never`](https://docs.python.org/3/library/typing.html#typing.Never)\] -- A new Decision node with no branches

###### Parameters

**`note`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional note to describe the decision logic

**`node_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional ID for the node produced for this decision logic

##### match

```python
def match(
    source: TypeOrTypeExpression[SourceT],
    matches: Callable[[Any], bool] | None = None,
) -> DecisionBranchBuilder[StateT, DepsT, SourceT, SourceT, Never]
```

Create a decision branch matcher.

###### Returns

`DecisionBranchBuilder`\[`StateT`, `DepsT`, `SourceT`, `SourceT`, [`Never`](https://docs.python.org/3/library/typing.html#typing.Never)\] -- A DecisionBranchBuilder for constructing the branch

###### Parameters

**`source`** : `TypeOrTypeExpression`\[`SourceT`\]

The type or type expression to match against

**`matches`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\], [`bool`](https://docs.python.org/3/library/functions.html#bool)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional custom matching function

##### match\_node

```python
def match_node(
    source: type[SourceNodeT],
    matches: Callable[[Any], bool] | None = None,
) -> DecisionBranch[SourceNodeT]
```

Create a decision branch for BaseNode subclasses.

This is similar to match() but specifically designed for matching against BaseNode types from the v1 system.

###### Returns

`DecisionBranch`\[`SourceNodeT`\] -- A DecisionBranch for the BaseNode type

###### Parameters

**`source`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`SourceNodeT`\]

The BaseNode subclass to match against

**`matches`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\], [`bool`](https://docs.python.org/3/library/functions.html#bool)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional custom matching function

##### node

```python
def node(
    node_type: type[BaseNode[StateT, DepsT, GraphOutputT]],
) -> EdgePath[StateT, DepsT]
```

Create an edge path from a BaseNode class.

This method integrates v1-style BaseNode classes into the v2 graph system by analyzing their type hints and creating appropriate edges.

###### Returns

`EdgePath`\[`StateT`, `DepsT`\] -- An EdgePath representing the node and its connections

###### Parameters

**`node_type`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`BaseNode`\[`StateT`, `DepsT`, `GraphOutputT`\]\]

The BaseNode subclass to integrate

###### Raises

-   `GraphSetupError` -- If the node type is missing required type hints

##### build

```python
def build(
    validate_graph_structure: bool = True,
) -> Graph[StateT, DepsT, GraphInputT, GraphOutputT]
```

Build the final executable graph from the accumulated nodes and edges.

This method performs validation, normalization, and analysis of the graph structure to create a complete, executable graph instance.

###### Returns

`Graph`\[`StateT`, `DepsT`, `GraphInputT`, `GraphOutputT`\] -- A complete Graph instance ready for execution

###### Parameters

**`validate_graph_structure`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

whether to perform validation of the graph structure See the docstring of \_validate\_graph\_structure below for more details.

###### Raises

-   `ValueError` -- If the graph structure is invalid (e.g., join without parent fork)

### Graph

**Bases:** `Generic[StateT, DepsT, InputT, OutputT]`

A complete graph definition ready for execution.

The Graph class represents a complete workflow graph with typed inputs, outputs, state, and dependencies. It contains all nodes, edges, and metadata needed for execution.

#### Attributes

##### name

Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### state\_type

The type of the graph state.

**Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[`StateT`\]

##### deps\_type

The type of the dependencies.

**Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[`DepsT`\]

##### input\_type

The type of the input data.

**Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[`InputT`\]

##### output\_type

The type of the output data.

**Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[`OutputT`\]

##### auto\_instrument

Whether to automatically create instrumentation spans.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### nodes

All nodes in the graph indexed by their ID.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`NodeID`, `AnyNode`\]

##### edges\_by\_source

Outgoing paths from each source node.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`NodeID`, [`list`](https://docs.python.org/3/glossary.html#term-list)\[`Path`\]\]

##### parent\_forks

Parent fork information for each join node.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`JoinID`, `ParentFork`\[`NodeID`\]\]

##### intermediate\_join\_nodes

For each join, the set of other joins that appear between it and its parent fork.

Used to determine which joins are "final" (have no other joins as intermediates) and which joins should preserve fork stacks when proceeding downstream.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`JoinID`, [`set`](https://docs.python.org/3/reference/expressions.html#set)\[`JoinID`\]\]

#### Methods

##### get\_parent\_fork

```python
def get_parent_fork(join_id: JoinID) -> ParentFork[NodeID]
```

Get the parent fork information for a join node.

###### Returns

`ParentFork`\[`NodeID`\] -- The parent fork information for the join

###### Parameters

**`join_id`** : `JoinID`

The ID of the join node

###### Raises

-   `RuntimeError` -- If the join ID is not found or has no parent fork

##### is\_final\_join

```python
def is_final_join(join_id: JoinID) -> bool
```

Check if a join is 'final' (has no downstream joins with the same parent fork).

A join is non-final if it appears as an intermediate node for another join with the same parent fork.

###### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool) -- True if the join is final, False if it's non-final

###### Parameters

**`join_id`** : `JoinID`

The ID of the join node

##### run

`@async`

```python
def run(
    state: StateT = None,
    deps: DepsT = None,
    inputs: InputT = None,
    span: AbstractContextManager[AbstractSpan] | None = None,
    infer_name: bool = True,
) -> OutputT
```

Execute the graph and return the final output.

This is the main entry point for graph execution. It runs the graph to completion and returns the final output value.

###### Returns

`OutputT` -- The final output from the graph execution

###### Parameters

**`state`** : `StateT` _Default:_ `None`

The graph state instance

**`deps`** : `DepsT` _Default:_ `None`

The dependencies instance

**`inputs`** : `InputT` _Default:_ `None`

The input data for the graph

**`span`** : `AbstractContextManager`\[`AbstractSpan`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional span for tracing/instrumentation

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to infer the graph name from the calling frame.

##### iter

`@async`

```python
def iter(
    state: StateT = None,
    deps: DepsT = None,
    inputs: InputT = None,
    span: AbstractContextManager[AbstractSpan] | None = None,
    infer_name: bool = True,
) -> AsyncIterator[GraphRun[StateT, DepsT, OutputT]]
```

Create an iterator for step-by-step graph execution.

This method allows for more fine-grained control over graph execution, enabling inspection of intermediate states and results.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`GraphRun`\[`StateT`, `DepsT`, `OutputT`\]\]

###### Parameters

**`state`** : `StateT` _Default:_ `None`

The graph state instance

**`deps`** : `DepsT` _Default:_ `None`

The dependencies instance

**`inputs`** : `InputT` _Default:_ `None`

The input data for the graph

**`span`** : `AbstractContextManager`\[`AbstractSpan`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional span for tracing/instrumentation

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to infer the graph name from the calling frame.

##### render

```python
def render(
    title: str | None = None,
    direction: StateDiagramDirection | None = None,
) -> str
```

Render the graph as a Mermaid diagram string.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str) -- A string containing the Mermaid diagram representation

###### Parameters

**`title`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional title for the diagram

**`direction`** : `StateDiagramDirection` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional direction for the diagram layout

##### \_\_str\_\_

```python
def __str__() -> str
```

Return a Mermaid diagram representation of the graph.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str) -- A string containing the Mermaid diagram of the graph

### StepNode

**Bases:** `BaseNode[StateT, DepsT, Any]`

A base node that represents a step with bound inputs.

StepNode bridges between the v1 and v2 graph execution systems by wrapping a [`Step`](/docs/ai/api/pydantic_graph/beta_step/#pydantic_graph.beta.step.Step) with bound inputs in a BaseNode interface. It is not meant to be run directly but rather used to indicate transitions to v2-style steps.

#### Attributes

##### step

The step to execute.

**Type:** `Step`\[`StateT`, `DepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

##### inputs

The inputs bound to this step.

**Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

#### Methods

##### run

`@async`

```python
def run(ctx: GraphRunContext[StateT, DepsT]) -> BaseNode[StateT, DepsT, Any] | End[Any]
```

Attempt to run the step node.

###### Returns

`BaseNode`\[`StateT`, `DepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `End`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of step execution

###### Parameters

**`ctx`** : `GraphRunContext`\[`StateT`, `DepsT`\]

The graph execution context

###### Raises

-   `NotImplementedError` -- Always raised as StepNode is not meant to be run directly

---

# [pydantic_graph.beta.decision](https://pydantic.dev/docs/ai/api/pydantic_graph/beta_decision/)

# pydantic\_graph.beta.decision

Decision node implementation for conditional branching in graph execution.

This module provides the Decision node type and related classes for implementing conditional branching logic in parallel control flow graphs. Decision nodes allow the graph to choose different execution paths based on runtime conditions.

### Decision

**Bases:** `Generic[StateT, DepsT, HandledT]`

Decision node for conditional branching in graph execution.

A Decision node evaluates conditions and routes execution to different branches based on the input data type or custom matching logic.

#### Attributes

##### id

Unique identifier for this decision node.

**Type:** `NodeID`

##### branches

List of branches that can be taken from this decision.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`DecisionBranch`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]

##### note

Optional documentation note for this decision.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

#### Methods

##### branch

```python
def branch(branch: DecisionBranch[T]) -> Decision[StateT, DepsT, HandledT | T]
```

Add a new branch to this decision.

###### Returns

`Decision`\[`StateT`, `DepsT`, `HandledT` | `T`\] -- A new Decision with the additional branch.

###### Parameters

**`branch`** : `DecisionBranch`\[`T`\]

The branch to add to this decision.

### DecisionBranch

**Bases:** `Generic[SourceT]`

Represents a single branch within a decision node.

Each branch defines the conditions under which it should be taken and the path to follow when those conditions are met.

Note: with the current design, it is actually _critical_ that this class is invariant in SourceT for the sake of type-checking that inputs to a Decision are actually handled. See the `# type: ignore` comment in `tests.graph.beta.test_graph_edge_cases.test_decision_no_matching_branch` for an example of how this works.

#### Attributes

##### source

The expected type of data for this branch.

This is necessary for exhaustiveness-checking when handling the inputs to a decision node.

**Type:** `TypeOrTypeExpression`\[`SourceT`\]

##### matches

An optional predicate function used to determine whether input data matches this branch.

If `None`, default logic is used which attempts to check the value for type-compatibility with the `source` type:

-   If `source` is `Any` or `object`, the branch will always match
-   If `source` is a `Literal` type, this branch will match if the value is one of the parametrizing literal values
-   If `source` is any other type, the value will be checked for matching using `isinstance`

Inputs are tested against each branch of a decision node in order, and the path of the first matching branch is used to handle the input value.

**Type:** [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\], [`bool`](https://docs.python.org/3/library/functions.html#bool)\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### path

The execution path to follow when an input value matches this branch of a decision node.

This can include transforming, mapping, and broadcasting the output before sending to the next node or nodes.

The path can also include position-aware labels which are used when generating mermaid diagrams.

**Type:** `Path`

##### destinations

The destination nodes that can be referenced by DestinationMarker in the path.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`AnyDestinationNode`\]

### DecisionBranchBuilder

**Bases:** `Generic[StateT, DepsT, OutputT, SourceT, HandledT]`

Builder for constructing decision branches with fluent API.

This builder provides methods to configure branches with destinations, forks, and transformations in a type-safe manner.

Instances of this class should be created using [`GraphBuilder.match`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder), not created directly.

#### Methods

##### to

```python
def to(
    destination: DestinationNode[StateT, DepsT, OutputT] | type[BaseNode[StateT, DepsT, Any]],
    extra_destinations: DestinationNode[StateT, DepsT, OutputT] | type[BaseNode[StateT, DepsT, Any]] = (),
    fork_id: str | None = None,
) -> DecisionBranch[SourceT]
```

Set the destination(s) for this branch.

###### Returns

`DecisionBranch`\[`SourceT`\] -- A completed DecisionBranch with the specified destinations.

###### Parameters

**`destination`** : `DestinationNode`\[`StateT`, `DepsT`, `OutputT`\] | [`type`](https://docs.python.org/3/glossary.html#term-type)\[`BaseNode`\[`StateT`, `DepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]

The primary destination node.

**`*extra_destinations`** : `DestinationNode`\[`StateT`, `DepsT`, `OutputT`\] | [`type`](https://docs.python.org/3/glossary.html#term-type)\[`BaseNode`\[`StateT`, `DepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] _Default:_ `()`

Additional destination nodes.

**`fork_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional node ID to use for the resulting broadcast fork if multiple destinations are provided.

##### broadcast

```python
def broadcast(
    get_forks: Callable[[Self], Sequence[DecisionBranch[SourceT]]],
    fork_id: str | None = None,
) -> DecisionBranch[SourceT]
```

Broadcast this decision branch into multiple destinations.

###### Returns

`DecisionBranch`\[`SourceT`\] -- A completed DecisionBranch with the specified destinations.

###### Parameters

**`get_forks`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`Self`](https://docs.python.org/3/library/typing.html#typing.Self)\], [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`DecisionBranch`\[`SourceT`\]\]\]

The callback that will return a sequence of decision branches to broadcast to.

**`fork_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional node ID to use for the resulting broadcast fork.

##### transform

```python
def transform(
    func: TransformFunction[StateT, DepsT, OutputT, NewOutputT],
) -> DecisionBranchBuilder[StateT, DepsT, NewOutputT, SourceT, HandledT]
```

Apply a transformation to the branch's output.

###### Returns

`DecisionBranchBuilder`\[`StateT`, `DepsT`, `NewOutputT`, `SourceT`, `HandledT`\] -- A new DecisionBranchBuilder where the provided transform is applied prior to generating the final output.

###### Parameters

**`func`** : `TransformFunction`\[`StateT`, `DepsT`, `OutputT`, `NewOutputT`\]

Transformation function to apply.

##### map

```python
def map(
    fork_id: str | None = None,
    downstream_join_id: str | None = None,
) -> DecisionBranchBuilder[StateT, DepsT, T, SourceT, HandledT]
```

Spread the branch's output.

To do this, the current output must be iterable, and any subsequent steps in the path being built for this branch will be applied to each item of the current output in parallel.

###### Returns

`DecisionBranchBuilder`\[`StateT`, `DepsT`, `T`, `SourceT`, `HandledT`\] -- A new DecisionBranchBuilder where mapping is performed prior to generating the final output.

###### Parameters

**`fork_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional ID for the fork, defaults to a generated value

**`downstream_join_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional ID of a downstream join node which is involved when mapping empty iterables

##### label

```python
def label(
    label: str,
) -> DecisionBranchBuilder[StateT, DepsT, OutputT, SourceT, HandledT]
```

Apply a label to the branch at the current point in the path being built.

These labels are only used in generated mermaid diagrams.

###### Returns

`DecisionBranchBuilder`\[`StateT`, `DepsT`, `OutputT`, `SourceT`, `HandledT`\] -- A new DecisionBranchBuilder where the label has been applied at the end of the current path being built.

###### Parameters

**`label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The label to apply.

### StateT

Type variable for graph state.

**Default:** `TypeVar('StateT', infer_variance=True)`

### DepsT

Type variable for graph dependencies.

**Default:** `TypeVar('DepsT', infer_variance=True)`

### HandledT

Type variable used to track types handled by the branches of a Decision.

**Default:** `TypeVar('HandledT', infer_variance=True)`

### T

Generic type variable.

**Default:** `TypeVar('T', infer_variance=True)`

### SourceT

Type variable for source data for a DecisionBranch.

**Default:** `TypeVar('SourceT', infer_variance=True)`

### OutputT

Type variable for the output data of a node.

**Default:** `TypeVar('OutputT', infer_variance=True)`

### NewOutputT

Type variable for transformed output.

**Default:** `TypeVar('NewOutputT', infer_variance=True)`

---

# [pydantic_graph.beta.graph](https://pydantic.dev/docs/ai/api/pydantic_graph/beta_graph/)

# pydantic\_graph.beta.graph

Core graph execution engine for the next version of the pydantic-graph library.

This module provides the main `Graph` class and `GraphRun` execution engine that handles the orchestration of nodes, edges, and parallel execution paths in the graph-based workflow system.

### EndMarker

**Bases:** `Generic[OutputT]`

A marker indicating the end of graph execution with a final value.

EndMarker is used internally to signal that the graph has completed execution and carries the final output value.

### ErrorMarker

A marker indicating that a graph node raised an exception.

Yielded by the graph iterator instead of raising immediately, allowing the caller to recover by sending new tasks via `GraphRun.next()` or `GraphRun.override_next()`. If the caller does not override, the error is re-raised on the next iteration.

#### Attributes

##### error

The exception raised by the node.

**Type:** [`BaseException`](https://docs.python.org/3/library/exceptions.html#BaseException)

### JoinItem

An item representing data flowing into a join operation.

JoinItem carries input data from a parallel execution path to a join node, along with metadata about which execution 'fork' it originated from.

#### Attributes

##### join\_id

The ID of the join node this item is targeting.

**Type:** `JoinID`

##### inputs

The input data for the join operation.

**Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

##### fork\_stack

The stack of ForkStackItems that led to producing this join item.

**Type:** `ForkStack`

### Graph

**Bases:** `Generic[StateT, DepsT, InputT, OutputT]`

A complete graph definition ready for execution.

The Graph class represents a complete workflow graph with typed inputs, outputs, state, and dependencies. It contains all nodes, edges, and metadata needed for execution.

#### Attributes

##### name

Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### state\_type

The type of the graph state.

**Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[`StateT`\]

##### deps\_type

The type of the dependencies.

**Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[`DepsT`\]

##### input\_type

The type of the input data.

**Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[`InputT`\]

##### output\_type

The type of the output data.

**Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[`OutputT`\]

##### auto\_instrument

Whether to automatically create instrumentation spans.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### nodes

All nodes in the graph indexed by their ID.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`NodeID`, `AnyNode`\]

##### edges\_by\_source

Outgoing paths from each source node.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`NodeID`, [`list`](https://docs.python.org/3/glossary.html#term-list)\[`Path`\]\]

##### parent\_forks

Parent fork information for each join node.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`JoinID`, `ParentFork`\[`NodeID`\]\]

##### intermediate\_join\_nodes

For each join, the set of other joins that appear between it and its parent fork.

Used to determine which joins are "final" (have no other joins as intermediates) and which joins should preserve fork stacks when proceeding downstream.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`JoinID`, [`set`](https://docs.python.org/3/reference/expressions.html#set)\[`JoinID`\]\]

#### Methods

##### get\_parent\_fork

```python
def get_parent_fork(join_id: JoinID) -> ParentFork[NodeID]
```

Get the parent fork information for a join node.

###### Returns

`ParentFork`\[`NodeID`\] -- The parent fork information for the join

###### Parameters

**`join_id`** : `JoinID`

The ID of the join node

###### Raises

-   `RuntimeError` -- If the join ID is not found or has no parent fork

##### is\_final\_join

```python
def is_final_join(join_id: JoinID) -> bool
```

Check if a join is 'final' (has no downstream joins with the same parent fork).

A join is non-final if it appears as an intermediate node for another join with the same parent fork.

###### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool) -- True if the join is final, False if it's non-final

###### Parameters

**`join_id`** : `JoinID`

The ID of the join node

##### run

`@async`

```python
def run(
    state: StateT = None,
    deps: DepsT = None,
    inputs: InputT = None,
    span: AbstractContextManager[AbstractSpan] | None = None,
    infer_name: bool = True,
) -> OutputT
```

Execute the graph and return the final output.

This is the main entry point for graph execution. It runs the graph to completion and returns the final output value.

###### Returns

`OutputT` -- The final output from the graph execution

###### Parameters

**`state`** : `StateT` _Default:_ `None`

The graph state instance

**`deps`** : `DepsT` _Default:_ `None`

The dependencies instance

**`inputs`** : `InputT` _Default:_ `None`

The input data for the graph

**`span`** : `AbstractContextManager`\[`AbstractSpan`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional span for tracing/instrumentation

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to infer the graph name from the calling frame.

##### iter

`@async`

```python
def iter(
    state: StateT = None,
    deps: DepsT = None,
    inputs: InputT = None,
    span: AbstractContextManager[AbstractSpan] | None = None,
    infer_name: bool = True,
) -> AsyncIterator[GraphRun[StateT, DepsT, OutputT]]
```

Create an iterator for step-by-step graph execution.

This method allows for more fine-grained control over graph execution, enabling inspection of intermediate states and results.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`GraphRun`\[`StateT`, `DepsT`, `OutputT`\]\]

###### Parameters

**`state`** : `StateT` _Default:_ `None`

The graph state instance

**`deps`** : `DepsT` _Default:_ `None`

The dependencies instance

**`inputs`** : `InputT` _Default:_ `None`

The input data for the graph

**`span`** : `AbstractContextManager`\[`AbstractSpan`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional span for tracing/instrumentation

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to infer the graph name from the calling frame.

##### render

```python
def render(
    title: str | None = None,
    direction: StateDiagramDirection | None = None,
) -> str
```

Render the graph as a Mermaid diagram string.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str) -- A string containing the Mermaid diagram representation

###### Parameters

**`title`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional title for the diagram

**`direction`** : `StateDiagramDirection` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional direction for the diagram layout

##### \_\_str\_\_

```python
def __str__() -> str
```

Return a Mermaid diagram representation of the graph.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str) -- A string containing the Mermaid diagram of the graph

### GraphTaskRequest

A request to run a task representing the execution of a node in the graph.

GraphTaskRequest encapsulates all the information needed to execute a specific node, including its inputs and the fork context it's executing within.

#### Attributes

##### node\_id

The ID of the node to execute.

**Type:** `NodeID`

##### inputs

The input data for the node.

**Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

##### fork\_stack

Stack of forks that have been entered.

Used by the GraphRun to decide when to proceed through joins.

**Type:** `ForkStack` **Default:** `field(repr=False)`

### GraphTask

**Bases:** `GraphTaskRequest`

A task representing the execution of a node in the graph.

GraphTask encapsulates all the information needed to execute a specific node, including its inputs and the fork context it's executing within, and has a unique ID to identify the task within the graph run.

#### Attributes

##### task\_id

Unique identifier for this task.

**Type:** `TaskID` **Default:** `field(repr=False)`

### GraphRun

**Bases:** `Generic[StateT, DepsT, OutputT]`

A single execution instance of a graph.

GraphRun manages the execution state for a single run of a graph, including task scheduling, fork/join coordination, and result tracking.

#### Attributes

##### graph

The graph being executed.

**Default:** `graph`

##### state

The graph state instance.

**Default:** `state`

##### deps

The dependencies instance.

**Default:** `deps`

##### inputs

The initial input data.

**Default:** `inputs`

##### next\_task

Get the next task(s) to be executed.

**Type:** `EndMarker`\[`OutputT`\] | `ErrorMarker` | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`GraphTask`\]

##### output

Get the final output if the graph has completed.

**Type:** `OutputT` | [`None`](https://docs.python.org/3/library/constants.html#None)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    graph: Graph[StateT, DepsT, InputT, OutputT],
    state: StateT,
    deps: DepsT,
    inputs: InputT,
    traceparent: str | None,
)
```

Initialize a graph run.

###### Parameters

**`graph`** : `Graph`\[`StateT`, `DepsT`, `InputT`, `OutputT`\]

The graph to execute

**`state`** : `StateT`

The graph state instance

**`deps`** : `DepsT`

The dependencies instance

**`inputs`** : `InputT`

The input data for the graph

**`traceparent`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

Optional trace parent for instrumentation

##### \_\_aiter\_\_

```python
def __aiter__() -> AsyncIterator[EndMarker[OutputT] | Sequence[GraphTask]]
```

Return self as an async iterator.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EndMarker`\[`OutputT`\] | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`GraphTask`\]\] -- Self for async iteration

##### \_\_anext\_\_

`@async`

```python
def __anext__() -> EndMarker[OutputT] | Sequence[GraphTask]
```

Get the next item in the async iteration.

###### Returns

`EndMarker`\[`OutputT`\] | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`GraphTask`\] -- The next execution result from the graph

###### Raises

-   `Exception` -- If a node raised an error and the caller has not recovered via `override_next()`.

##### next

`@async`

```python
def next(
    value: EndMarker[OutputT] | Sequence[GraphTaskRequest] | None = None,
) -> EndMarker[OutputT] | Sequence[GraphTask]
```

Advance the graph execution by one step.

This method allows for sending a value to the iterator, which is useful for resuming iteration or overriding intermediate results.

###### Returns

`EndMarker`\[`OutputT`\] | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`GraphTask`\] -- The next execution result: either an EndMarker, or sequence of GraphTasks

###### Parameters

**`value`** : `EndMarker`\[`OutputT`\] | [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`GraphTaskRequest`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional value to send to the iterator

##### override\_next

```python
def override_next(value: Sequence[GraphTaskRequest] | EndMarker[OutputT]) -> None
```

Override the next pending step, allowing the graph to continue after an `End` or error.

This is used by hook systems (like `after_node_run` or `on_node_run_error`) to redirect the graph to a new node when the current step produced an `End` result or raised an error, or to signal early completion by passing an `EndMarker`.

Must only be called between iterations (not while an iteration is in flight).

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`value`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`GraphTaskRequest`\] | `EndMarker`\[`OutputT`\]

New task requests to execute next, or an `EndMarker` to signal completion.

### StateT

Type variable for graph state.

**Default:** `TypeVar('StateT', infer_variance=True)`

### DepsT

Type variable for graph dependencies.

**Default:** `TypeVar('DepsT', infer_variance=True)`

### InputT

Type variable for graph inputs.

**Default:** `TypeVar('InputT', infer_variance=True)`

### OutputT

Type variable for graph outputs.

**Default:** `TypeVar('OutputT', infer_variance=True)`

---

# [pydantic_graph.beta.graph_builder](https://pydantic.dev/docs/ai/api/pydantic_graph/beta_graph_builder/)

# pydantic\_graph.beta.graph\_builder

Graph builder for constructing executable graph definitions.

This module provides the GraphBuilder class and related utilities for constructing typed, executable graph definitions with steps, joins, decisions, and edge routing.

### GraphBuilder

**Bases:** `Generic[StateT, DepsT, GraphInputT, GraphOutputT]`

A builder for constructing executable graph definitions.

GraphBuilder provides a fluent interface for defining nodes, edges, and routing in a graph workflow. It supports typed state, dependencies, and input/output validation.

#### Attributes

##### name

Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `name`

##### state\_type

The type of the graph state.

**Type:** `TypeOrTypeExpression`\[`StateT`\] **Default:** `state_type`

##### deps\_type

The type of the dependencies.

**Type:** `TypeOrTypeExpression`\[`DepsT`\] **Default:** `deps_type`

##### input\_type

The type of the graph input data.

**Type:** `TypeOrTypeExpression`\[`GraphInputT`\] **Default:** `input_type`

##### output\_type

The type of the graph output data.

**Type:** `TypeOrTypeExpression`\[`GraphOutputT`\] **Default:** `output_type`

##### auto\_instrument

Whether to automatically create instrumentation spans.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `auto_instrument`

##### start\_node

Get the start node for the graph.

**Type:** `StartNode`\[`GraphInputT`\]

##### end\_node

Get the end node for the graph.

**Type:** `EndNode`\[`GraphOutputT`\]

#### Methods

##### \_\_init\_\_

```python
def __init__(
    name: str | None = None,
    state_type: TypeOrTypeExpression[StateT] = NoneType,
    deps_type: TypeOrTypeExpression[DepsT] = NoneType,
    input_type: TypeOrTypeExpression[GraphInputT] = NoneType,
    output_type: TypeOrTypeExpression[GraphOutputT] = NoneType,
    auto_instrument: bool = True,
)
```

Initialize a graph builder.

###### Parameters

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method.

**`state_type`** : `TypeOrTypeExpression`\[`StateT`\] _Default:_ `NoneType`

The type of the graph state

**`deps_type`** : `TypeOrTypeExpression`\[`DepsT`\] _Default:_ `NoneType`

The type of the dependencies

**`input_type`** : `TypeOrTypeExpression`\[`GraphInputT`\] _Default:_ `NoneType`

The type of the graph input data

**`output_type`** : `TypeOrTypeExpression`\[`GraphOutputT`\] _Default:_ `NoneType`

The type of the graph output data

**`auto_instrument`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to automatically create instrumentation spans

##### step

```python
def step(
    node_id: str | None = None,
    label: str | None = None,
) -> Callable[[StepFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, OutputT]]
def step(
    call: StepFunction[StateT, DepsT, InputT, OutputT],
    node_id: str | None = None,
    label: str | None = None,
) -> Step[StateT, DepsT, InputT, OutputT]
```

Create a step from a step function.

This method can be used as a decorator or called directly to create a step node from an async function.

###### Returns

`Step`\[`StateT`, `DepsT`, `InputT`, `OutputT`\] | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`StepFunction`\[`StateT`, `DepsT`, `InputT`, `OutputT`\]\], `Step`\[`StateT`, `DepsT`, `InputT`, `OutputT`\]\] -- Either a Step instance or a decorator function

###### Parameters

**`call`** : `StepFunction`\[`StateT`, `DepsT`, `InputT`, `OutputT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The step function to wrap

**`node_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional ID for the node

**`label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional human-readable label

##### stream

```python
def stream(
    node_id: str | None = None,
    label: str | None = None,
) -> Callable[[StreamFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, AsyncIterable[OutputT]]]
def stream(
    call: StreamFunction[StateT, DepsT, InputT, OutputT],
    node_id: str | None = None,
    label: str | None = None,
) -> Step[StateT, DepsT, InputT, AsyncIterable[OutputT]]
def stream(
    call: StreamFunction[StateT, DepsT, InputT, OutputT] | None = None,
    node_id: str | None = None,
    label: str | None = None,
) -> Step[StateT, DepsT, InputT, AsyncIterable[OutputT]] | Callable[[StreamFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, AsyncIterable[OutputT]]]
```

Create a step from an async iterator (which functions like a "stream").

This method can be used as a decorator or called directly to create a step node from an async function.

###### Returns

`Step`\[`StateT`, `DepsT`, `InputT`, [`AsyncIterable`](https://docs.python.org/3/library/typing.html#typing.AsyncIterable)\[`OutputT`\]\] | [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[`StreamFunction`\[`StateT`, `DepsT`, `InputT`, `OutputT`\]\], `Step`\[`StateT`, `DepsT`, `InputT`, [`AsyncIterable`](https://docs.python.org/3/library/typing.html#typing.AsyncIterable)\[`OutputT`\]\]\] -- Either a Step instance or a decorator function

###### Parameters

**`call`** : `StreamFunction`\[`StateT`, `DepsT`, `InputT`, `OutputT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The step function to wrap

**`node_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional ID for the node

**`label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional human-readable label

##### add

```python
def add(edges: EdgePath[StateT, DepsT] = ()) -> None
```

Add one or more edge paths to the graph.

This method processes edge paths and automatically creates any necessary fork nodes for broadcasts and maps.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`*edges`** : `EdgePath`\[`StateT`, `DepsT`\] _Default:_ `()`

The edge paths to add to the graph

##### add\_edge

```python
def add_edge(
    source: Source[T],
    destination: Destination[T],
    label: str | None = None,
) -> None
```

Add a simple edge between two nodes.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`source`** : `Source`\[`T`\]

The source node

**`destination`** : `Destination`\[`T`\]

The destination node

**`label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional label for the edge

##### add\_mapping\_edge

```python
def add_mapping_edge(
    source: Source[Iterable[T]],
    map_to: Destination[T],
    pre_map_label: str | None = None,
    post_map_label: str | None = None,
    fork_id: ForkID | None = None,
    downstream_join_id: JoinID | None = None,
) -> None
```

Add an edge that maps iterable data across parallel paths.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`source`** : `Source`\[[`Iterable`](https://docs.python.org/3/library/typing.html#typing.Iterable)\[`T`\]\]

The source node that produces iterable data

**`map_to`** : `Destination`\[`T`\]

The destination node that receives individual items

**`pre_map_label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional label before the map operation

**`post_map_label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional label after the map operation

**`fork_id`** : `ForkID` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional ID for the fork node produced for this map operation

**`downstream_join_id`** : `JoinID` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional ID of a join node that will always be downstream of this map. Specifying this ensures correct handling if you try to map an empty iterable.

##### edge\_from

```python
def edge_from(
    sources: Source[SourceOutputT] = (),
) -> EdgePathBuilder[StateT, DepsT, SourceOutputT]
```

Create an edge path builder starting from the given source nodes.

###### Returns

`EdgePathBuilder`\[`StateT`, `DepsT`, `SourceOutputT`\] -- An EdgePathBuilder for constructing the complete edge path

###### Parameters

**`*sources`** : `Source`\[`SourceOutputT`\] _Default:_ `()`

The source nodes to start the edge path from

##### decision

```python
def decision(
    note: str | None = None,
    node_id: str | None = None,
) -> Decision[StateT, DepsT, Never]
```

Create a new decision node.

###### Returns

`Decision`\[`StateT`, `DepsT`, [`Never`](https://docs.python.org/3/library/typing.html#typing.Never)\] -- A new Decision node with no branches

###### Parameters

**`note`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional note to describe the decision logic

**`node_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional ID for the node produced for this decision logic

##### match

```python
def match(
    source: TypeOrTypeExpression[SourceT],
    matches: Callable[[Any], bool] | None = None,
) -> DecisionBranchBuilder[StateT, DepsT, SourceT, SourceT, Never]
```

Create a decision branch matcher.

###### Returns

`DecisionBranchBuilder`\[`StateT`, `DepsT`, `SourceT`, `SourceT`, [`Never`](https://docs.python.org/3/library/typing.html#typing.Never)\] -- A DecisionBranchBuilder for constructing the branch

###### Parameters

**`source`** : `TypeOrTypeExpression`\[`SourceT`\]

The type or type expression to match against

**`matches`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\], [`bool`](https://docs.python.org/3/library/functions.html#bool)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional custom matching function

##### match\_node

```python
def match_node(
    source: type[SourceNodeT],
    matches: Callable[[Any], bool] | None = None,
) -> DecisionBranch[SourceNodeT]
```

Create a decision branch for BaseNode subclasses.

This is similar to match() but specifically designed for matching against BaseNode types from the v1 system.

###### Returns

`DecisionBranch`\[`SourceNodeT`\] -- A DecisionBranch for the BaseNode type

###### Parameters

**`source`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`SourceNodeT`\]

The BaseNode subclass to match against

**`matches`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\], [`bool`](https://docs.python.org/3/library/functions.html#bool)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional custom matching function

##### node

```python
def node(
    node_type: type[BaseNode[StateT, DepsT, GraphOutputT]],
) -> EdgePath[StateT, DepsT]
```

Create an edge path from a BaseNode class.

This method integrates v1-style BaseNode classes into the v2 graph system by analyzing their type hints and creating appropriate edges.

###### Returns

`EdgePath`\[`StateT`, `DepsT`\] -- An EdgePath representing the node and its connections

###### Parameters

**`node_type`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`BaseNode`\[`StateT`, `DepsT`, `GraphOutputT`\]\]

The BaseNode subclass to integrate

###### Raises

-   `GraphSetupError` -- If the node type is missing required type hints

##### build

```python
def build(
    validate_graph_structure: bool = True,
) -> Graph[StateT, DepsT, GraphInputT, GraphOutputT]
```

Build the final executable graph from the accumulated nodes and edges.

This method performs validation, normalization, and analysis of the graph structure to create a complete, executable graph instance.

###### Returns

`Graph`\[`StateT`, `DepsT`, `GraphInputT`, `GraphOutputT`\] -- A complete Graph instance ready for execution

###### Parameters

**`validate_graph_structure`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

whether to perform validation of the graph structure See the docstring of \_validate\_graph\_structure below for more details.

###### Raises

-   `ValueError` -- If the graph structure is invalid (e.g., join without parent fork)

---

# [pydantic_graph.beta.join](https://pydantic.dev/docs/ai/api/pydantic_graph/beta_join/)

# pydantic\_graph.beta.join

Join operations and reducers for graph execution.

This module provides the core components for joining parallel execution paths in a graph, including various reducer types that aggregate data from multiple sources into a single output.

### JoinState

The state of a join during graph execution associated to a particular fork run.

### ReducerContext

**Bases:** `Generic[StateT, DepsT]`

Context information passed to reducer functions during graph execution.

The reducer context provides access to the current graph state and dependencies.

#### Attributes

##### state

The state of the graph run.

**Type:** `StateT`

##### deps

The deps for the graph run.

**Type:** `DepsT`

#### Methods

##### cancel\_sibling\_tasks

```python
def cancel_sibling_tasks()
```

Cancel all sibling tasks created from the same fork.

You can call this if you want your join to have early-stopping behavior.

### SupportsSum

**Bases:** [`Protocol`](https://docs.python.org/3/library/typing.html#typing.Protocol)

A protocol for a type that supports adding to itself.

### ReduceFirstValue

**Bases:** `Generic[T]`

A reducer that returns the first value it encounters, and cancels all other tasks.

#### Methods

##### \_\_call\_\_

```python
def __call__(ctx: ReducerContext[object, object], current: T, inputs: T) -> T
```

The reducer function.

###### Returns

`T`

### Join

**Bases:** `Generic[StateT, DepsT, InputT, OutputT]`

A join operation that synchronizes and aggregates parallel execution paths.

A join defines how to combine outputs from multiple parallel execution paths using a [`ReducerFunction`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.ReducerFunction). It specifies which fork it joins (if any) and manages the initialization of reducers.

#### Methods

##### as\_node

```python
def as_node(inputs: None = None) -> JoinNode[StateT, DepsT]
def as_node(inputs: InputT) -> JoinNode[StateT, DepsT]
```

Create a step node with bound inputs.

###### Returns

`JoinNode`\[`StateT`, `DepsT`\] -- A [`StepNode`](/docs/ai/api/pydantic_graph/beta_step/#pydantic_graph.beta.step.StepNode) with this step and the bound inputs

###### Parameters

**`inputs`** : `InputT` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The input data to bind to this step, or None

### JoinNode

**Bases:** `BaseNode[StateT, DepsT, Any]`

A base node that represents a join item with bound inputs.

JoinNode bridges between the v1 and v2 graph execution systems by wrapping a [`Join`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.Join) with bound inputs in a BaseNode interface. It is not meant to be run directly but rather used to indicate transitions to v2-style steps.

#### Attributes

##### join

The step to execute.

**Type:** `Join`\[`StateT`, `DepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

##### inputs

The inputs bound to this step.

**Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

#### Methods

##### run

`@async`

```python
def run(ctx: GraphRunContext[StateT, DepsT]) -> BaseNode[StateT, DepsT, Any] | End[Any]
```

Attempt to run the join node.

###### Returns

`BaseNode`\[`StateT`, `DepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `End`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of step execution

###### Parameters

**`ctx`** : `GraphRunContext`\[`StateT`, `DepsT`\]

The graph execution context

###### Raises

-   `NotImplementedError` -- Always raised as StepNode is not meant to be run directly

### reduce\_null

```python
def reduce_null(current: None, inputs: Any) -> None
```

A reducer that discards all input data and returns None.

#### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

### reduce\_list\_append

```python
def reduce_list_append(current: list[T], inputs: T) -> list[T]
```

A reducer that appends to a list.

#### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`T`\]

### reduce\_list\_extend

```python
def reduce_list_extend(current: list[T], inputs: Iterable[T]) -> list[T]
```

A reducer that extends a list.

#### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`T`\]

### reduce\_dict\_update

```python
def reduce_dict_update(current: dict[K, V], inputs: Mapping[K, V]) -> dict[K, V]
```

A reducer that updates a dict.

#### Returns

[`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[`K`, `V`\]

### reduce\_sum

```python
def reduce_sum(current: NumericT, inputs: NumericT) -> NumericT
```

A reducer that sums numbers.

#### Returns

`NumericT`

### ReducerFunction

A function used for reducing inputs to a join node.

**Default:** `TypeAliasType('ReducerFunction', ContextReducerFunction[StateT, DepsT, InputT, OutputT] | PlainReducerFunction[InputT, OutputT], type_params=(StateT, DepsT, InputT, OutputT))`

---

# [pydantic_graph.beta.node](https://pydantic.dev/docs/ai/api/pydantic_graph/beta_node/)

# pydantic\_graph.beta.node

Core node types for graph construction and execution.

This module defines the fundamental node types used to build execution graphs, including start/end nodes and fork nodes for parallel execution.

### StartNode

**Bases:** `Generic[OutputT]`

Entry point node for graph execution.

The StartNode represents the beginning of a graph execution flow.

#### Attributes

##### id

Fixed identifier for the start node.

**Default:** `NodeID('__start__')`

### EndNode

**Bases:** `Generic[InputT]`

Terminal node representing the completion of graph execution.

The EndNode marks the successful completion of a graph execution flow and can collect the final output data.

#### Attributes

##### id

Fixed identifier for the end node.

**Default:** `NodeID('__end__')`

### Fork

**Bases:** `Generic[InputT, OutputT]`

Fork node that creates parallel execution branches.

A Fork node splits the execution flow into multiple parallel branches, enabling concurrent execution of downstream nodes. It can either map a sequence across multiple branches or duplicate data to each branch.

#### Attributes

##### id

Unique identifier for this fork node.

**Type:** `ForkID`

##### is\_map

Determines fork behavior.

If True, InputT must be Sequence\[OutputT\] and each element is sent to a separate branch. If False, InputT must be OutputT and the same data is sent to all branches.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### downstream\_join\_id

Optional identifier of a downstream join node that should be jumped to if mapping an empty iterable.

**Type:** `JoinID` | [`None`](https://docs.python.org/3/library/constants.html#None)

### StateT

Type variable for graph state.

**Default:** `TypeVar('StateT', infer_variance=True)`

### OutputT

Type variable for node output data.

**Default:** `TypeVar('OutputT', infer_variance=True)`

### InputT

Type variable for node input data.

**Default:** `TypeVar('InputT', infer_variance=True)`

---

# [pydantic_graph.beta.step](https://pydantic.dev/docs/ai/api/pydantic_graph/beta_step/)

# pydantic\_graph.beta.step

Step-based graph execution components.

This module provides the core abstractions for step-based graph execution, including step contexts, step functions, and step nodes that bridge between the v1 and v2 graph execution systems.

### StepContext

**Bases:** `Generic[StateT, DepsT, InputT]`

Context information passed to step functions during graph execution.

The step context provides access to the current graph state, dependencies, and input data for a step.

#### Attributes

##### inputs

The input data for this step.

This must be a property to ensure correct variance behavior

**Type:** `InputT`

### StepFunction

**Bases:** `Protocol[StateT, DepsT, InputT, OutputT]`

Protocol for step functions that can be executed in the graph.

Step functions are async callables that receive a step context and return a result.

#### Methods

##### \_\_call\_\_

```python
def __call__(ctx: StepContext[StateT, DepsT, InputT]) -> Awaitable[OutputT]
```

Execute the step function with the given context.

###### Returns

[`Awaitable`](https://docs.python.org/3/library/typing.html#typing.Awaitable)\[`OutputT`\] -- An awaitable that resolves to the step's output

###### Parameters

**`ctx`** : `StepContext`\[`StateT`, `DepsT`, `InputT`\]

The step context containing state, dependencies, and inputs

### StreamFunction

**Bases:** `Protocol[StateT, DepsT, InputT, OutputT]`

Protocol for stream functions that can be executed in the graph.

Stream functions are async callables that receive a step context and return an async iterator.

#### Methods

##### \_\_call\_\_

```python
def __call__(ctx: StepContext[StateT, DepsT, InputT]) -> AsyncIterator[OutputT]
```

Execute the stream function with the given context.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`OutputT`\] -- An async iterator yielding the streamed output

###### Parameters

**`ctx`** : `StepContext`\[`StateT`, `DepsT`, `InputT`\]

The step context containing state, dependencies, and inputs

### Step

**Bases:** `Generic[StateT, DepsT, InputT, OutputT]`

A step in the graph execution that wraps a step function.

Steps represent individual units of execution in the graph, encapsulating a step function along with metadata like ID and label.

#### Attributes

##### id

Unique identifier for this step.

**Type:** `NodeID` **Default:** `id`

##### label

Optional human-readable label for this step.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `label`

##### call

The step function to execute. This needs to be a property for proper variance inference.

**Type:** `StepFunction`\[`StateT`, `DepsT`, `InputT`, `OutputT`\]

#### Methods

##### as\_node

```python
def as_node(inputs: None = None) -> StepNode[StateT, DepsT]
def as_node(inputs: InputT) -> StepNode[StateT, DepsT]
```

Create a step node with bound inputs.

###### Returns

`StepNode`\[`StateT`, `DepsT`\] -- A [`StepNode`](/docs/ai/api/pydantic_graph/beta_step/#pydantic_graph.beta.step.StepNode) with this step and the bound inputs

###### Parameters

**`inputs`** : `InputT` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The input data to bind to this step, or None

### StepNode

**Bases:** `BaseNode[StateT, DepsT, Any]`

A base node that represents a step with bound inputs.

StepNode bridges between the v1 and v2 graph execution systems by wrapping a [`Step`](/docs/ai/api/pydantic_graph/beta_step/#pydantic_graph.beta.step.Step) with bound inputs in a BaseNode interface. It is not meant to be run directly but rather used to indicate transitions to v2-style steps.

#### Attributes

##### step

The step to execute.

**Type:** `Step`\[`StateT`, `DepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

##### inputs

The inputs bound to this step.

**Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

#### Methods

##### run

`@async`

```python
def run(ctx: GraphRunContext[StateT, DepsT]) -> BaseNode[StateT, DepsT, Any] | End[Any]
```

Attempt to run the step node.

###### Returns

`BaseNode`\[`StateT`, `DepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `End`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] -- The result of step execution

###### Parameters

**`ctx`** : `GraphRunContext`\[`StateT`, `DepsT`\]

The graph execution context

###### Raises

-   `NotImplementedError` -- Always raised as StepNode is not meant to be run directly

### NodeStep

**Bases:** `Step[StateT, DepsT, Any, BaseNode[StateT, DepsT, Any] | End[Any]]`

A step that wraps a BaseNode type for execution.

NodeStep allows v1-style BaseNode classes to be used as steps in the v2 graph execution system. It validates that the input is of the expected node type and runs it with the appropriate graph context.

#### Attributes

##### node\_type

The BaseNode type this step executes.

**Type:** [`type`](https://docs.python.org/3/glossary.html#term-type)\[`BaseNode`\[`StateT`, `DepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] **Default:** `get_origin(node_type) or node_type`

#### Methods

##### \_\_init\_\_

```python
def __init__(
    node_type: type[BaseNode[StateT, DepsT, Any]],
    id: NodeID | None = None,
    label: str | None = None,
)
```

Initialize a node step.

###### Parameters

**`node_type`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`BaseNode`\[`StateT`, `DepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\]

The BaseNode class this step will execute

**`id`** : `NodeID` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional unique identifier, defaults to the node's get\_node\_id()

**`label`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional human-readable label for this step

### AnyStepFunction

Type alias for a step function with any type parameters.

**Default:** `StepFunction[Any, Any, Any, Any]`

---

# [pydantic_graph.exceptions](https://pydantic.dev/docs/ai/api/pydantic_graph/exceptions/)

# pydantic\_graph.exceptions

### GraphSetupError

**Bases:** [`TypeError`](https://docs.python.org/3/library/exceptions.html#TypeError)

Error caused by an incorrectly configured graph.

#### Attributes

##### message

Description of the mistake.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `message`

### GraphBuildingError

**Bases:** [`ValueError`](https://docs.python.org/3/library/exceptions.html#ValueError)

An error raised during graph-building.

#### Attributes

##### message

The error message.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `message`

### GraphValidationError

**Bases:** [`ValueError`](https://docs.python.org/3/library/exceptions.html#ValueError)

An error raised during graph validation.

#### Attributes

##### message

The error message.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `message`

### GraphRuntimeError

**Bases:** [`RuntimeError`](https://docs.python.org/3/library/exceptions.html#RuntimeError)

Error caused by an issue during graph execution.

#### Attributes

##### message

The error message.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `message`

### GraphNodeStatusError

**Bases:** `GraphRuntimeError`

Error caused by trying to run a node that already has status `'running'`, `'success'`, or `'error'`.

#### Methods

##### check

`@classmethod`

```python
def check(cls, status: SnapshotStatus) -> None
```

Check if the status is valid.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

---

# [pydantic_graph](https://pydantic.dev/docs/ai/api/pydantic_graph/graph/)

# pydantic\_graph

### Graph

**Bases:** `Generic[StateT, DepsT, RunEndT]`

Definition of a graph.

In `pydantic-graph`, a graph is a collection of nodes that can be run in sequence. The nodes define their outgoing edges -- e.g. which nodes may be run next, and thereby the structure of the graph.

Here's a very simple example of a graph which increments a number by 1, but makes sure the number is never 42 at the end.

never\_42.py

```py
from __future__ import annotations

from dataclasses import dataclass

from pydantic_graph import BaseNode, End, Graph, GraphRunContext

@dataclass
class MyState:
    number: int

@dataclass
class Increment(BaseNode[MyState]):
    async def run(self, ctx: GraphRunContext) -> Check42:
        ctx.state.number += 1
        return Check42()

@dataclass
class Check42(BaseNode[MyState, None, int]):
    async def run(self, ctx: GraphRunContext) -> Increment | End[int]:
        if ctx.state.number == 42:
            return Increment()
        else:
            return End(ctx.state.number)

never_42_graph = Graph(nodes=(Increment, Check42))
```

_(This example is complete, it can be run "as is")_

See [`run`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.run) For an example of running graph, and [`mermaid_code`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.mermaid_code) for an example of generating a mermaid diagram from the graph.

#### Methods

##### \_\_init\_\_

```python
def __init__(
    nodes: Sequence[type[BaseNode[StateT, DepsT, RunEndT]]],
    name: str | None = None,
    state_type: type[StateT] | _utils.Unset = _utils.UNSET,
    run_end_type: type[RunEndT] | _utils.Unset = _utils.UNSET,
    auto_instrument: bool = True,
)
```

Create a graph from a sequence of nodes.

###### Parameters

**`nodes`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`BaseNode`\[`StateT`, `DepsT`, `RunEndT`\]\]\]

The nodes which make up the graph, nodes need to be unique and all be generic in the same state type.

**`name`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method.

**`state_type`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`StateT`\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The type of the state for the graph, this can generally be inferred from `nodes`.

**`run_end_type`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`RunEndT`\] | `_utils.Unset` _Default:_ `_utils.UNSET`

The type of the result of running the graph, this can generally be inferred from `nodes`.

**`auto_instrument`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to create a span for the graph run and the execution of each node's run method.

##### run

`@async`

```python
def run(
    start_node: BaseNode[StateT, DepsT, RunEndT],
    state: StateT = None,
    deps: DepsT = None,
    persistence: BaseStatePersistence[StateT, RunEndT] | None = None,
    infer_name: bool = True,
) -> GraphRunResult[StateT, RunEndT]
```

Run the graph from a starting node until it ends.

Here's an example of running the graph from [above](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph):

run\_never\_42.py

```py
from never_42 import Increment, MyState, never_42_graph

async def main():
    state = MyState(1)
    await never_42_graph.run(Increment(), state=state)
    print(state)
    #> MyState(number=2)

    state = MyState(41)
    await never_42_graph.run(Increment(), state=state)
    print(state)
    #> MyState(number=43)
```

###### Returns

`GraphRunResult`\[`StateT`, `RunEndT`\] -- A `GraphRunResult` containing information about the run, including its final result.

###### Parameters

**`start_node`** : `BaseNode`\[`StateT`, `DepsT`, `RunEndT`\]

the first node to run, since the graph definition doesn't define the entry point in the graph, you need to provide the starting node.

**`state`** : `StateT` _Default:_ `None`

The initial state of the graph.

**`deps`** : `DepsT` _Default:_ `None`

The dependencies of the graph.

**`persistence`** : `BaseStatePersistence`\[`StateT`, `RunEndT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

State persistence interface, defaults to [`SimpleStatePersistence`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.in_mem.SimpleStatePersistence) if `None`.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to infer the graph name from the calling frame.

##### run\_sync

```python
def run_sync(
    start_node: BaseNode[StateT, DepsT, RunEndT],
    state: StateT = None,
    deps: DepsT = None,
    persistence: BaseStatePersistence[StateT, RunEndT] | None = None,
    infer_name: bool = True,
) -> GraphRunResult[StateT, RunEndT]
```

Synchronously run the graph.

This is a convenience method that wraps [`self.run`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.run) with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop.

###### Returns

`GraphRunResult`\[`StateT`, `RunEndT`\] -- The result type from ending the run and the history of the run.

###### Parameters

**`start_node`** : `BaseNode`\[`StateT`, `DepsT`, `RunEndT`\]

the first node to run, since the graph definition doesn't define the entry point in the graph, you need to provide the starting node.

**`state`** : `StateT` _Default:_ `None`

The initial state of the graph.

**`deps`** : `DepsT` _Default:_ `None`

The dependencies of the graph.

**`persistence`** : `BaseStatePersistence`\[`StateT`, `RunEndT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

State persistence interface, defaults to [`SimpleStatePersistence`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.in_mem.SimpleStatePersistence) if `None`.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to infer the graph name from the calling frame.

##### iter

`@async`

```python
def iter(
    start_node: BaseNode[StateT, DepsT, RunEndT],
    state: StateT = None,
    deps: DepsT = None,
    persistence: BaseStatePersistence[StateT, RunEndT] | None = None,
    span: AbstractContextManager[AbstractSpan] | None = None,
    infer_name: bool = True,
) -> AsyncIterator[GraphRun[StateT, DepsT, RunEndT]]
```

A contextmanager which can be used to iterate over the graph's nodes as they are executed.

This method returns a `GraphRun` object which can be used to async-iterate over the nodes of this `Graph` as they are executed. This is the API to use if you want to record or interact with the nodes as the graph execution unfolds.

The `GraphRun` can also be used to manually drive the graph execution by calling [`GraphRun.next`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.GraphRun.next).

The `GraphRun` provides access to the full run history, state, deps, and the final result of the run once it has completed.

For more details, see the API documentation of [`GraphRun`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.GraphRun).

Returns: A GraphRun that can be async iterated over to drive the graph to completion.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`GraphRun`\[`StateT`, `DepsT`, `RunEndT`\]\]

###### Parameters

**`start_node`** : `BaseNode`\[`StateT`, `DepsT`, `RunEndT`\]

the first node to run. Since the graph definition doesn't define the entry point in the graph, you need to provide the starting node.

**`state`** : `StateT` _Default:_ `None`

The initial state of the graph.

**`deps`** : `DepsT` _Default:_ `None`

The dependencies of the graph.

**`persistence`** : `BaseStatePersistence`\[`StateT`, `RunEndT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

State persistence interface, defaults to [`SimpleStatePersistence`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.in_mem.SimpleStatePersistence) if `None`.

**`span`** : `AbstractContextManager`\[`AbstractSpan`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The span to use for the graph run. If not provided, a new span will be created.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to infer the graph name from the calling frame.

##### iter\_from\_persistence

`@async`

```python
def iter_from_persistence(
    persistence: BaseStatePersistence[StateT, RunEndT],
    deps: DepsT = None,
    span: AbstractContextManager[AbstractSpan] | None = None,
    infer_name: bool = True,
) -> AsyncIterator[GraphRun[StateT, DepsT, RunEndT]]
```

A contextmanager to iterate over the graph's nodes as they are executed, created from a persistence object.

This method has similar functionality to [`iter`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.iter), but instead of passing the node to run, it will restore the node and state from state persistence.

Returns: A GraphRun that can be async iterated over to drive the graph to completion.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`GraphRun`\[`StateT`, `DepsT`, `RunEndT`\]\]

###### Parameters

**`persistence`** : `BaseStatePersistence`\[`StateT`, `RunEndT`\]

The state persistence interface to use.

**`deps`** : `DepsT` _Default:_ `None`

The dependencies of the graph.

**`span`** : `AbstractContextManager`\[`AbstractSpan`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The span to use for the graph run. If not provided, a new span will be created.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to infer the graph name from the calling frame.

##### initialize

`@async`

```python
def initialize(
    node: BaseNode[StateT, DepsT, RunEndT],
    persistence: BaseStatePersistence[StateT, RunEndT],
    state: StateT = None,
    infer_name: bool = True,
) -> None
```

Initialize a new graph run in persistence without running it.

This is useful if you want to set up a graph run to be run later, e.g. via [`iter_from_persistence`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.iter_from_persistence).

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`node`** : `BaseNode`\[`StateT`, `DepsT`, `RunEndT`\]

The node to run first.

**`persistence`** : `BaseStatePersistence`\[`StateT`, `RunEndT`\]

State persistence interface.

**`state`** : `StateT` _Default:_ `None`

The start state of the graph.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to infer the graph name from the calling frame.

##### mermaid\_code

```python
def mermaid_code(
    start_node: Sequence[mermaid.NodeIdent] | mermaid.NodeIdent | None = None,
    title: str | None | typing_extensions.Literal[False] = None,
    edge_labels: bool = True,
    notes: bool = True,
    highlighted_nodes: Sequence[mermaid.NodeIdent] | mermaid.NodeIdent | None = None,
    highlight_css: str = mermaid.DEFAULT_HIGHLIGHT_CSS,
    infer_name: bool = True,
    direction: mermaid.StateDiagramDirection | None = None,
) -> str
```

Generate a diagram representing the graph as [mermaid](https://mermaid.js.org/) diagram.

This method calls [`pydantic_graph.mermaid.generate_code`](/docs/ai/api/pydantic_graph/mermaid/#pydantic_graph.mermaid.generate_code).

Here's an example of generating a diagram for the graph from [above](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph):

mermaid\_never\_42.py

```py
from never_42 import Increment, never_42_graph

print(never_42_graph.mermaid_code(start_node=Increment))
'''
---
title: never_42_graph
---
stateDiagram-v2
  [*] --> Increment
  Increment --> Check42
  Check42 --> Increment
  Check42 --> [*]
'''
```

The rendered diagram will look like this:

```mermaid
---
title: never_42_graph
---
stateDiagram-v2
  [*] --> Increment
  Increment --> Check42
  Check42 --> Increment
  Check42 --> [*]
```

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str) -- The mermaid code for the graph, which can then be rendered as a diagram.

###### Parameters

**`start_node`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`mermaid.NodeIdent`\] | `mermaid.NodeIdent` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The node or nodes which can start the graph.

**`title`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) | [`typing_extensions.Literal`](https://typing-extensions.readthedocs.io/en/latest/index.html#typing_extensions.Literal)\[[`False`](https://docs.python.org/3/library/constants.html#False)\] _Default:_ `None`

The title of the diagram, use `False` to not include a title.

**`edge_labels`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to include edge labels.

**`notes`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to include notes on each node.

**`highlighted_nodes`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`mermaid.NodeIdent`\] | `mermaid.NodeIdent` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional node or nodes to highlight.

**`highlight_css`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `mermaid.DEFAULT_HIGHLIGHT_CSS`

The CSS to use for highlighting nodes.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to infer the graph name from the calling frame.

**`direction`** : `mermaid.StateDiagramDirection` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The direction of flow.

##### mermaid\_image

```python
def mermaid_image(
    infer_name: bool = True,
    kwargs: typing_extensions.Unpack[mermaid.MermaidConfig] = {},
) -> bytes
```

Generate a diagram representing the graph as an image.

The format and diagram can be customized using `kwargs`, see [`pydantic_graph.mermaid.MermaidConfig`](/docs/ai/api/pydantic_graph/mermaid/#pydantic_graph.mermaid.MermaidConfig).

Uses external service

This method makes a request to [mermaid.ink](https://mermaid.ink) to render the image, `mermaid.ink` is a free service not affiliated with Pydantic.

###### Returns

[`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- The image bytes.

###### Parameters

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to infer the graph name from the calling frame.

**`**kwargs`** : [`typing_extensions.Unpack`](https://typing-extensions.readthedocs.io/en/latest/index.html#typing_extensions.Unpack)\[`mermaid.MermaidConfig`\] _Default:_ `{}`

Additional arguments to pass to `mermaid.request_image`.

##### mermaid\_save

```python
def mermaid_save(
    path: Path | str,
    infer_name: bool = True,
    kwargs: typing_extensions.Unpack[mermaid.MermaidConfig] = {},
) -> None
```

Generate a diagram representing the graph and save it as an image.

The format and diagram can be customized using `kwargs`, see [`pydantic_graph.mermaid.MermaidConfig`](/docs/ai/api/pydantic_graph/mermaid/#pydantic_graph.mermaid.MermaidConfig).

Uses external service

This method makes a request to [mermaid.ink](https://mermaid.ink) to render the image, `mermaid.ink` is a free service not affiliated with Pydantic.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`path`** : `Path` | [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The path to save the image to.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to infer the graph name from the calling frame.

**`**kwargs`** : [`typing_extensions.Unpack`](https://typing-extensions.readthedocs.io/en/latest/index.html#typing_extensions.Unpack)\[`mermaid.MermaidConfig`\] _Default:_ `{}`

Additional arguments to pass to `mermaid.save_image`.

##### get\_nodes

```python
def get_nodes() -> Sequence[type[BaseNode[StateT, DepsT, RunEndT]]]
```

Get the nodes in the graph.

###### Returns

[`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`type`](https://docs.python.org/3/glossary.html#term-type)\[`BaseNode`\[`StateT`, `DepsT`, `RunEndT`\]\]\]

### GraphRun

**Bases:** `Generic[StateT, DepsT, RunEndT]`

A stateful, async-iterable run of a [`Graph`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph).

You typically get a `GraphRun` instance from calling `async with [my_graph.iter(...)](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.iter) as graph_run:`. That gives you the ability to iterate through nodes as they run, either by `async for` iteration or by repeatedly calling `.next(...)`.

Here's an example of iterating over the graph from [above](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph):

iter\_never\_42.py

```py
from copy import deepcopy
from never_42 import Increment, MyState, never_42_graph

async def main():
    state = MyState(1)
    async with never_42_graph.iter(Increment(), state=state) as graph_run:
        node_states = [(graph_run.next_node, deepcopy(graph_run.state))]
        async for node in graph_run:
            node_states.append((node, deepcopy(graph_run.state)))
        print(node_states)
        '''
        [
            (Increment(), MyState(number=1)),
            (Increment(), MyState(number=1)),
            (Check42(), MyState(number=2)),
            (End(data=2), MyState(number=2)),
        ]
        '''

    state = MyState(41)
    async with never_42_graph.iter(Increment(), state=state) as graph_run:
        node_states = [(graph_run.next_node, deepcopy(graph_run.state))]
        async for node in graph_run:
            node_states.append((node, deepcopy(graph_run.state)))
        print(node_states)
        '''
        [
            (Increment(), MyState(number=41)),
            (Increment(), MyState(number=41)),
            (Check42(), MyState(number=42)),
            (Increment(), MyState(number=42)),
            (Check42(), MyState(number=43)),
            (End(data=43), MyState(number=43)),
        ]
        '''
```

See the [`GraphRun.next` documentation](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.GraphRun.next) for an example of how to manually drive the graph run.

#### Attributes

##### next\_node

The next node that will be run in the graph.

This is the next node that will be used during async iteration, or if a node is not passed to `self.next(...)`.

**Type:** `BaseNode`\[`StateT`, `DepsT`, `RunEndT`\] | `End`\[`RunEndT`\]

##### result

The final result of the graph run if the run is completed, otherwise `None`.

**Type:** `GraphRunResult`\[`StateT`, `RunEndT`\] | [`None`](https://docs.python.org/3/library/constants.html#None)

#### Methods

##### \_\_init\_\_

```python
def __init__(
    graph: Graph[StateT, DepsT, RunEndT],
    start_node: BaseNode[StateT, DepsT, RunEndT],
    persistence: BaseStatePersistence[StateT, RunEndT],
    state: StateT,
    deps: DepsT,
    traceparent: str | None,
    snapshot_id: str | None = None,
)
```

Create a new run for a given graph, starting at the specified node.

Typically, you'll use [`Graph.iter`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.iter) rather than calling this directly.

###### Parameters

**`graph`** : `Graph`\[`StateT`, `DepsT`, `RunEndT`\]

The [`Graph`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph) to run.

**`start_node`** : `BaseNode`\[`StateT`, `DepsT`, `RunEndT`\]

The node where execution will begin.

**`persistence`** : `BaseStatePersistence`\[`StateT`, `RunEndT`\]

State persistence interface.

**`state`** : `StateT`

A shared state object or primitive (like a counter, dataclass, etc.) that is available to all nodes via `ctx.state`.

**`deps`** : `DepsT`

Optional dependencies that each node can access via `ctx.deps`, e.g. database connections, configuration, or logging clients.

**`traceparent`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

The traceparent for the span used for the graph run.

**`snapshot_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The ID of the snapshot the node came from.

##### next

`@async`

```python
def next(
    node: BaseNode[StateT, DepsT, RunEndT] | None = None,
) -> BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]
```

Manually drive the graph run by passing in the node you want to run next.

This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The graph run should stop when you return an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) node.

Here's an example of using `next` to drive the graph from [above](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph):

next\_never\_42.py

```py
from copy import deepcopy
from pydantic_graph import End
from never_42 import Increment, MyState, never_42_graph

async def main():
    state = MyState(48)
    async with never_42_graph.iter(Increment(), state=state) as graph_run:
        next_node = graph_run.next_node  # start with the first node
        node_states = [(next_node, deepcopy(graph_run.state))]

        while not isinstance(next_node, End):
            if graph_run.state.number == 50:
                graph_run.state.number = 42
            next_node = await graph_run.next(next_node)
            node_states.append((next_node, deepcopy(graph_run.state)))

        print(node_states)
        '''
        [
            (Increment(), MyState(number=48)),
            (Check42(), MyState(number=49)),
            (End(data=49), MyState(number=49)),
        ]
        '''
```

###### Returns

`BaseNode`\[`StateT`, `DepsT`, `RunEndT`\] | `End`\[`RunEndT`\] -- The next node returned by the graph logic, or an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) node if `BaseNode`\[`StateT`, `DepsT`, `RunEndT`\] | `End`\[`RunEndT`\] -- the run has completed.

###### Parameters

**`node`** : `BaseNode`\[`StateT`, `DepsT`, `RunEndT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The node to run next in the graph. If not specified, uses `self.next_node`, which is initialized to the `start_node` of the run and updated each time a new node is returned.

##### \_\_anext\_\_

`@async`

```python
def __anext__() -> BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]
```

Use the last returned node as the input to `Graph.next`.

###### Returns

`BaseNode`\[`StateT`, `DepsT`, `RunEndT`\] | `End`\[`RunEndT`\]

### GraphRunResult

**Bases:** `Generic[StateT, RunEndT]`

The final result of running a graph.

---

# [pydantic_graph.mermaid](https://pydantic.dev/docs/ai/api/pydantic_graph/mermaid/)

# pydantic\_graph.mermaid

### MermaidConfig

**Bases:** [`TypedDict`](https://docs.python.org/3/library/typing.html#typing.TypedDict)

Parameters to configure mermaid chart generation.

#### Attributes

##### start\_node

Identifiers of nodes that start the graph.

**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`NodeIdent`\] | `NodeIdent`

##### highlighted\_nodes

Identifiers of nodes to highlight.

**Type:** [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`NodeIdent`\] | `NodeIdent`

##### highlight\_css

CSS to use for highlighting nodes.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### title

The title of the diagram.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### edge\_labels

Whether to include edge labels in the diagram.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### notes

Whether to include notes on nodes in the diagram, defaults to true.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### image\_type

The image type to generate. If unspecified, the default behavior is `'jpeg'`.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['jpeg', 'png', 'webp', 'svg', 'pdf'\]

##### pdf\_fit

When using image\_type='pdf', whether to fit the diagram to the PDF page.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### pdf\_landscape

When using image\_type='pdf', whether to use landscape orientation for the PDF.

This has no effect if using `pdf_fit`.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### pdf\_paper

When using image\_type='pdf', the paper size of the PDF.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['letter', 'legal', 'tabloid', 'ledger', 'a0', 'a1', 'a2', 'a3', 'a4', 'a5', 'a6'\]

##### background\_color

The background color of the diagram.

If None, the default transparent background is used. The color value is interpreted as a hexadecimal color code by default (and should not have a leading '#'), but you can also use named colors by prefixing the value with `'!'`. For example, valid choices include `background_color='!white'` or `background_color='FF0000'`.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### theme

The theme of the diagram. Defaults to 'default'.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['default', 'neutral', 'dark', 'forest'\]

##### width

The width of the diagram.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

##### height

The height of the diagram.

**Type:** [`int`](https://docs.python.org/3/library/functions.html#int)

##### scale

The scale of the diagram.

The scale must be a number between 1 and 3, and you can only set a scale if one or both of width and height are set.

**Type:** [`Annotated`](https://docs.python.org/3/library/typing.html#typing.Annotated)\[[`float`](https://docs.python.org/3/library/functions.html#float), `Ge`(1), `Le`(3)\]

##### httpx\_client

An HTTPX client to use for requests, mostly for testing purposes.

**Type:** `httpx.Client`

##### direction

The direction of the state diagram.

**Type:** `StateDiagramDirection`

### generate\_code

```python
def generate_code(
    graph: Graph[Any, Any, Any],
    start_node: Sequence[NodeIdent] | NodeIdent | None = None,
    highlighted_nodes: Sequence[NodeIdent] | NodeIdent | None = None,
    highlight_css: str = DEFAULT_HIGHLIGHT_CSS,
    title: str | None = None,
    edge_labels: bool = True,
    notes: bool = True,
    direction: StateDiagramDirection | None,
) -> str
```

Generate [Mermaid state diagram](https://mermaid.js.org/syntax/stateDiagram.html) code for a graph.

#### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str) -- The Mermaid code for the graph.

#### Parameters

**`graph`** : `Graph`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

The graph to generate the image for.

**`start_node`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`NodeIdent`\] | `NodeIdent` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Identifiers of nodes that start the graph.

**`highlighted_nodes`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`NodeIdent`\] | `NodeIdent` | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Identifiers of nodes to highlight.

**`highlight_css`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `DEFAULT_HIGHLIGHT_CSS`

CSS to use for highlighting nodes.

**`title`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

The title of the diagram.

**`edge_labels`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to include edge labels in the diagram.

**`notes`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to include notes in the diagram.

**`direction`** : `StateDiagramDirection` | [`None`](https://docs.python.org/3/library/constants.html#None)

The direction of flow.

### request\_image

```python
def request_image(
    graph: Graph[Any, Any, Any],
    kwargs: Unpack[MermaidConfig] = {},
) -> bytes
```

Generate an image of a Mermaid diagram using [mermaid.ink](https://mermaid.ink).

#### Returns

[`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) -- The image data.

#### Parameters

**`graph`** : `Graph`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

The graph to generate the image for.

**`**kwargs`** : [`Unpack`](https://docs.python.org/3/library/typing.html#typing.Unpack)\[`MermaidConfig`\] _Default:_ `{}`

Additional parameters to configure mermaid chart generation.

### save\_image

```python
def save_image(
    path: Path | str,
    graph: Graph[Any, Any, Any],
    kwargs: Unpack[MermaidConfig] = {},
) -> None
```

Generate an image of a Mermaid diagram using [mermaid.ink](https://mermaid.ink) and save it to a local file.

#### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

#### Parameters

**`path`** : `Path` | [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The path to save the image to.

**`graph`** : `Graph`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]

The graph to generate the image for.

**`**kwargs`** : [`Unpack`](https://docs.python.org/3/library/typing.html#typing.Unpack)\[`MermaidConfig`\] _Default:_ `{}`

Additional parameters to configure mermaid chart generation.

### DEFAULT\_HIGHLIGHT\_CSS

The default CSS to use for highlighting nodes.

**Default:** `'fill:#fdff32'`

### StateDiagramDirection

Used to specify the direction of the state diagram generated by mermaid.

-   `'TB'`: Top to bottom, this is the default for mermaid charts.
-   `'LR'`: Left to right
-   `'RL'`: Right to left
-   `'BT'`: Bottom to top

**Default:** `Literal['TB', 'LR', 'RL', 'BT']`

### NodeIdent

A type alias for a node identifier.

This can be:

-   A node instance (instance of a subclass of [`BaseNode`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode)).
-   A node class (subclass of [`BaseNode`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode)).
-   A string representing the node ID.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `'type[BaseNode[Any, Any, Any]] | BaseNode[Any, Any, Any] | str'`

---

# [pydantic_graph.nodes](https://pydantic.dev/docs/ai/api/pydantic_graph/nodes/)

# pydantic\_graph.nodes

### GraphRunContext

**Bases:** `Generic[StateT, DepsT]`

Context for a graph.

#### Attributes

##### state

The state of the graph.

**Type:** `StateT`

##### deps

Dependencies for the graph.

**Type:** `DepsT`

### BaseNode

**Bases:** `ABC`, `Generic[StateT, DepsT, NodeRunEndT]`

Base class for a node.

#### Attributes

##### docstring\_notes

Set to `True` to generate mermaid diagram notes from the class's docstring.

While this can add valuable information to the diagram, it can make diagrams harder to view, hence it is disabled by default. You can also customise notes overriding the [`get_note`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode.get_note) method.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

#### Methods

##### run

`@abstractmethod`

`@async`

```python
def run(
    ctx: GraphRunContext[StateT, DepsT],
) -> BaseNode[StateT, DepsT, Any] | End[NodeRunEndT]
```

Run the node.

This is an abstract method that must be implemented by subclasses.

Return types used at runtime

The return type of this method are read by `pydantic_graph` at runtime and used to define which nodes can be called next in the graph. This is displayed in [mermaid diagrams](/docs/ai/api/pydantic_graph/mermaid) and enforced when running the graph.

###### Returns

`BaseNode`\[`StateT`, `DepsT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | `End`\[`NodeRunEndT`\] -- The next node to run or [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) to signal the end of the graph.

###### Parameters

**`ctx`** : `GraphRunContext`\[`StateT`, `DepsT`\]

The graph context.

##### get\_node\_id

`@cached`

`@classmethod`

```python
def get_node_id(cls) -> str
```

Get the ID of the node.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### get\_note

`@classmethod`

```python
def get_note(cls) -> str | None
```

Get a note about the node to render on mermaid charts.

By default, this returns a note only if [`docstring_notes`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode.docstring_notes) is `True`. You can override this method to customise the node notes.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### get\_node\_def

`@classmethod`

```python
def get_node_def(
    cls,
    local_ns: dict[str, Any] | None,
) -> NodeDef[StateT, DepsT, NodeRunEndT]
```

Get the node definition.

###### Returns

`NodeDef`\[`StateT`, `DepsT`, `NodeRunEndT`\]

##### deep\_copy

```python
def deep_copy() -> Self
```

Returns a deep copy of the node.

###### Returns

[`Self`](https://docs.python.org/3/library/typing.html#typing.Self)

### End

**Bases:** `Generic[RunEndT]`

Type to return from a node to signal the end of the graph.

#### Attributes

##### data

Data to return from the graph.

**Type:** `RunEndT`

#### Methods

##### deep\_copy\_data

```python
def deep_copy_data() -> End[RunEndT]
```

Returns a deep copy of the end of the run.

###### Returns

`End`\[`RunEndT`\]

### Edge

Annotation to apply a label to an edge in a graph.

#### Attributes

##### label

Label for the edge.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

### StateT

Type variable for the state in a graph.

**Default:** `TypeVar('StateT', default=None)`

### DepsT

Type variable for the dependencies of a graph and node.

**Default:** `TypeVar('DepsT', default=None, contravariant=True)`

### RunEndT

Covariant type variable for the return type of a graph [`run`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.run).

**Default:** `TypeVar('RunEndT', covariant=True, default=None)`

### NodeRunEndT

Covariant type variable for the return type of a node [`run`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode.run).

**Default:** `TypeVar('NodeRunEndT', covariant=True, default=Never)`

---

# [pydantic_graph.persistence](https://pydantic.dev/docs/ai/api/pydantic_graph/persistence/)

# pydantic\_graph.persistence

### NodeSnapshot

**Bases:** `Generic[StateT, RunEndT]`

History step describing the execution of a node in a graph.

#### Attributes

##### state

The state of the graph before the node is run.

**Type:** `StateT`

##### node

The node to run next.

**Type:** [`Annotated`](https://docs.python.org/3/library/typing.html#typing.Annotated)\[`BaseNode`\[`StateT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any), `RunEndT`\], `_utils.CustomNodeSchema`()\]

##### start\_ts

The timestamp when the node started running, `None` until the run starts.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### duration

The duration of the node run in seconds, if the node has been run.

**Type:** [`float`](https://docs.python.org/3/library/functions.html#float) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### status

The status of the snapshot.

**Type:** `SnapshotStatus` **Default:** `'created'`

##### kind

The kind of history step, can be used as a discriminator when deserializing history.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['node'\] **Default:** `'node'`

##### id

Unique ID of the snapshot.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `UNSET_SNAPSHOT_ID`

### EndSnapshot

**Bases:** `Generic[StateT, RunEndT]`

History step describing the end of a graph run.

#### Attributes

##### state

The state of the graph at the end of the run.

**Type:** `StateT`

##### result

The result of the graph run.

**Type:** `End`\[`RunEndT`\]

##### ts

The timestamp when the graph run ended.

**Type:** [`datetime`](https://docs.python.org/3/library/datetime.html#module-datetime) **Default:** `field(default_factory=(_utils.now_utc))`

##### kind

The kind of history step, can be used as a discriminator when deserializing history.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['end'\] **Default:** `'end'`

##### id

Unique ID of the snapshot.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `UNSET_SNAPSHOT_ID`

##### node

Shim to get the [`result`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.EndSnapshot.result).

Useful to allow `[snapshot.node for snapshot in persistence.history]`.

**Type:** `End`\[`RunEndT`\]

### BaseStatePersistence

**Bases:** `ABC`, `Generic[StateT, RunEndT]`

Abstract base class for storing the state of a graph run.

Each instance of a `BaseStatePersistence` subclass should be used for a single graph run.

#### Methods

##### snapshot\_node

`@abstractmethod`

`@async`

```python
def snapshot_node(state: StateT, next_node: BaseNode[StateT, Any, RunEndT]) -> None
```

Snapshot the state of a graph, when the next step is to run a node.

This method should add a [`NodeSnapshot`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.NodeSnapshot) to persistence.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`state`** : `StateT`

The state of the graph.

**`next_node`** : `BaseNode`\[`StateT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any), `RunEndT`\]

The next node to run.

##### snapshot\_node\_if\_new

`@abstractmethod`

`@async`

```python
def snapshot_node_if_new(
    snapshot_id: str,
    state: StateT,
    next_node: BaseNode[StateT, Any, RunEndT],
) -> None
```

Snapshot the state of a graph if the snapshot ID doesn't already exist in persistence.

This method will generally call [`snapshot_node`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.BaseStatePersistence.snapshot_node) but should do so in an atomic way.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`snapshot_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The ID of the snapshot to check.

**`state`** : `StateT`

The state of the graph.

**`next_node`** : `BaseNode`\[`StateT`, [`Any`](https://docs.python.org/3/library/typing.html#typing.Any), `RunEndT`\]

The next node to run.

##### snapshot\_end

`@abstractmethod`

`@async`

```python
def snapshot_end(state: StateT, end: End[RunEndT]) -> None
```

Snapshot the state of a graph when the graph has ended.

This method should add an [`EndSnapshot`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.EndSnapshot) to persistence.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`state`** : `StateT`

The state of the graph.

**`end`** : `End`\[`RunEndT`\]

data from the end of the run.

##### record\_run

`@abstractmethod`

```python
def record_run(snapshot_id: str) -> AbstractAsyncContextManager[None]
```

Record the run of the node, or error if the node is already running.

In particular this should set:

-   [`NodeSnapshot.status`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.NodeSnapshot.status) to `'running'` and [`NodeSnapshot.start_ts`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.NodeSnapshot.start_ts) when the run starts.
-   [`NodeSnapshot.status`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.NodeSnapshot.status) to `'success'` or `'error'` and [`NodeSnapshot.duration`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.NodeSnapshot.duration) when the run finishes.

###### Returns

`AbstractAsyncContextManager`\[[`None`](https://docs.python.org/3/library/constants.html#None)\] -- An async context manager that records the run of the node.

###### Parameters

**`snapshot_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str)

The ID of the snapshot to record.

###### Raises

-   `GraphNodeRunningError` -- if the node status is not `'created'` or `'pending'`.
-   `LookupError` -- if the snapshot ID is not found in persistence.

##### load\_next

`@abstractmethod`

`@async`

```python
def load_next() -> NodeSnapshot[StateT, RunEndT] | None
```

Retrieve a node snapshot with status `'created'` and set its status to `'pending'`.

This is used by [`Graph.iter_from_persistence`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.iter_from_persistence) to get the next node to run.

Returns: The snapshot, or `None` if no snapshot with status `'created'` exists.

###### Returns

`NodeSnapshot`\[`StateT`, `RunEndT`\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### load\_all

`@abstractmethod`

`@async`

```python
def load_all() -> list[Snapshot[StateT, RunEndT]]
```

Load the entire history of snapshots.

`load_all` is not used by pydantic-graph itself, instead it's provided to make it convenient to get all [snapshots](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.Snapshot) from persistence.

Returns: The list of snapshots.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`Snapshot`\[`StateT`, `RunEndT`\]\]

##### set\_graph\_types

```python
def set_graph_types(graph: Graph[StateT, Any, RunEndT]) -> None
```

Set the types of the state and run end from a graph.

You generally won't need to customise this method, instead implement [`set_types`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.BaseStatePersistence.set_types) and [`should_set_types`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.BaseStatePersistence.should_set_types).

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

##### should\_set\_types

```python
def should_set_types() -> bool
```

Whether types need to be set.

Implementations should override this method to return `True` when types have not been set if they are needed.

###### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool)

##### set\_types

```python
def set_types(state_type: type[StateT], run_end_type: type[RunEndT]) -> None
```

Set the types of the state and run end.

This can be used to create [type adapters](https://docs.pydantic.dev/latest/api/pydantic/type_adapter/#pydantic.type_adapter.TypeAdapter) for serializing and deserializing snapshots, e.g. with [`build_snapshot_list_type_adapter`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.build_snapshot_list_type_adapter).

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`state_type`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`StateT`\]

The state type.

**`run_end_type`** : [`type`](https://docs.python.org/3/glossary.html#term-type)\[`RunEndT`\]

The run end type.

### build\_snapshot\_list\_type\_adapter

```python
def build_snapshot_list_type_adapter(
    state_t: type[StateT],
    run_end_t: type[RunEndT],
) -> pydantic.TypeAdapter[list[Snapshot[StateT, RunEndT]]]
```

Build a type adapter for a list of snapshots.

This method should be called from within [`set_types`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.BaseStatePersistence.set_types) where context variables will be set such that Pydantic can create a schema for [`NodeSnapshot.node`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.NodeSnapshot.node).

#### Returns

[`pydantic.TypeAdapter`](https://docs.pydantic.dev/latest/api/pydantic/type_adapter/#pydantic.type_adapter.TypeAdapter)\[[`list`](https://docs.python.org/3/glossary.html#term-list)\[`Snapshot`\[`StateT`, `RunEndT`\]\]\]

### SnapshotStatus

The status of a snapshot.

-   `'created'`: The snapshot has been created but not yet run.
-   `'pending'`: The snapshot has been retrieved with [`load_next`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.BaseStatePersistence.load_next) but not yet run.
-   `'running'`: The snapshot is currently running.
-   `'success'`: The snapshot has been run successfully.
-   `'error'`: The snapshot has been run but an error occurred.

**Default:** `Literal['created', 'pending', 'running', 'success', 'error']`

### Snapshot

A step in the history of a graph run.

[`Graph.run`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.run) returns a list of these steps describing the execution of the graph, together with the run return value.

**Default:** `NodeSnapshot[StateT, RunEndT] | EndSnapshot[StateT, RunEndT]`

In memory state persistence.

This module provides simple in memory state persistence for graphs.

### SimpleStatePersistence

**Bases:** `BaseStatePersistence[StateT, RunEndT]`

Simple in memory state persistence that just hold the latest snapshot.

If no state persistence implementation is provided when running a graph, this is used by default.

#### Attributes

##### last\_snapshot

The last snapshot.

**Type:** `Snapshot`\[`StateT`, `RunEndT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

### FullStatePersistence

**Bases:** `BaseStatePersistence[StateT, RunEndT]`

In memory state persistence that hold a list of snapshots.

#### Attributes

##### deep\_copy

Whether to deep copy the state and nodes when storing them.

Defaults to `True` so even if nodes or state are modified after the snapshot is taken, the persistence history will record the value at the time of the snapshot.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `True`

##### history

List of snapshots taken during the graph run.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`Snapshot`\[`StateT`, `RunEndT`\]\] **Default:** `field(default_factory=(list[Snapshot[StateT, RunEndT]]))`

#### Methods

##### dump\_json

```python
def dump_json(indent: int | None = None) -> bytes
```

Dump the history to JSON bytes.

###### Returns

[`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes)

##### load\_json

```python
def load_json(json_data: str | bytes | bytearray) -> None
```

Load the history from JSON.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

### FileStatePersistence

**Bases:** `BaseStatePersistence[StateT, RunEndT]`

File based state persistence that hold graph run state in a JSON file.

#### Attributes

##### json\_file

Path to the JSON file where the snapshots are stored.

You should use a different file for each graph run, but a single file should be reused for multiple steps of the same run.

For example if you have a run ID of the form `run_123abc`, you might create a `FileStatePersistence` thus:

```py
from pathlib import Path

from pydantic_graph import FullStatePersistence

run_id = 'run_123abc'
persistence = FullStatePersistence(Path('runs') / f'{run_id}.json')
```

**Type:** `Path`

#### Methods

##### should\_set\_types

```python
def should_set_types() -> bool
```

Whether types need to be set.

###### Returns

[`bool`](https://docs.python.org/3/library/functions.html#bool)

---

# [pydantic_ai.ui.ag_ui](https://pydantic.dev/docs/ai/api/ui/ag_ui/)

# pydantic\_ai.ui.ag\_ui

AG-UI protocol integration for Pydantic AI agents.

### AGUIEventStream

**Bases:** `UIEventStream[RunAgentInput, BaseEvent, AgentDepsT, OutputDataT]`

UI event stream transformer for the Agent-User Interaction (AG-UI) protocol.

#### Methods

##### handle\_event

`@async`

```python
def handle_event(event: NativeEvent) -> AsyncIterator[BaseEvent]
```

Override to set timestamps on all AG-UI events.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`BaseEvent`\]

### AGUIAdapter

**Bases:** `UIAdapter[RunAgentInput, Message, BaseEvent, AgentDepsT, OutputDataT]`

UI adapter for the Agent-User Interaction (AG-UI) protocol.

#### Attributes

##### ag\_ui\_version

AG-UI protocol version controlling behavior thresholds.

Accepts any version string (e.g. `'0.1.13'`). Defaults to the version detected from the installed `ag-ui-protocol` package.

Known thresholds:

-   `< 0.1.13`: emits `THINKING_*` events during streaming, drops `ThinkingPart` from `dump_messages` output.
-   `>= 0.1.13`: emits `REASONING_*` events with encrypted metadata during streaming, and includes `ThinkingPart` as `ReasoningMessage` in `dump_messages` output for full round-trip fidelity of thinking signatures and provider metadata.
-   `>= 0.1.15`: emits typed multimodal input content (`ImageInputContent`, `AudioInputContent`, `VideoInputContent`, `DocumentInputContent`) instead of generic `BinaryInputContent`.

`load_messages` always accepts `ReasoningMessage` and multimodal content types regardless of this setting.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `DEFAULT_AG_UI_VERSION`

##### preserve\_file\_data

Whether to preserve agent-generated files and uploaded files in AG-UI message conversion.

When `True`, agent-generated files and uploaded files are stored as [activity messages](https://docs.ag-ui.com/concepts/activities) during `dump_messages` and restored during `load_messages`, enabling full round-trip fidelity. When `False` (default), they are silently dropped.

If your AG-UI frontend uses activities, be aware that `pydantic_ai_*` activity types are reserved for internal round-trip use and should be ignored by frontend activity handlers.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool) **Default:** `False`

##### messages

Pydantic AI messages from the AG-UI run input.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\]

##### toolset

Toolset representing frontend tools from the AG-UI run input.

**Type:** [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### state

Frontend state from the AG-UI run input.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### conversation\_id

Conversation ID from the AG-UI `RunAgentInput.threadId`.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

#### Methods

##### build\_run\_input

`@classmethod`

```python
def build_run_input(cls, body: bytes) -> RunAgentInput
```

Build an AG-UI run input object from the request body.

###### Returns

`RunAgentInput`

##### build\_event\_stream

```python
def build_event_stream(

) -> UIEventStream[RunAgentInput, BaseEvent, AgentDepsT, OutputDataT]
```

Build an AG-UI event stream transformer.

###### Returns

`UIEventStream`\[`RunAgentInput`, `BaseEvent`, `AgentDepsT`, `OutputDataT`\]

##### from\_request

`@async`

`@classmethod`

```python
def from_request(
    cls,
    request: Request,
    agent: AbstractAgent[AgentDepsT, OutputDataT],
    ag_ui_version: str = DEFAULT_AG_UI_VERSION,
    preserve_file_data: bool = False,
    manage_system_prompt: Literal['server', 'client'] = 'server',
    allowed_file_url_schemes: frozenset[str] = frozenset({'http', 'https'}),
    kwargs: Any = {},
) -> AGUIAdapter[AgentDepsT, OutputDataT]
```

Extends [`from_request`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.from_request) with AG-UI-specific parameters.

###### Returns

`AGUIAdapter`\[`AgentDepsT`, `OutputDataT`\]

##### load\_messages

`@classmethod`

```python
def load_messages(
    cls,
    messages: Sequence[Message],
    preserve_file_data: bool = False,
) -> list[ModelMessage]
```

Transform AG-UI messages into Pydantic AI messages.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\]

##### dump\_messages

`@classmethod`

```python
def dump_messages(
    cls,
    messages: Sequence[ModelMessage],
    ag_ui_version: str = DEFAULT_AG_UI_VERSION,
    preserve_file_data: bool = False,
) -> list[Message]
```

Transform Pydantic AI messages into AG-UI messages.

Note: The round-trip `dump_messages` -> `load_messages` is not fully lossless:

-   `TextPart.id`, `.provider_name`, `.provider_details` are lost.
-   `ToolCallPart.id`, `.provider_name`, `.provider_details` are lost.
-   `BuiltinToolCallPart.id`, `.provider_details` are lost (only `.provider_name` survives via the prefixed tool call ID).
-   `BuiltinToolReturnPart.provider_details` is lost.
-   `RetryPromptPart` becomes `ToolReturnPart` (or `UserPromptPart`) on reload.
-   `CachePoint` and `UploadedFile` content items are dropped (unless `preserve_file_data=True`).
-   `ThinkingPart` is dropped when `ag_ui_version='0.1.10'`.
-   `FilePart` is silently dropped unless `preserve_file_data=True`.
-   `UploadedFile` in a multi-item `UserPromptPart` is split into a separate activity message when `preserve_file_data=True`, which reloads as a separate `UserPromptPart`.
-   Part ordering within a `ModelResponse` may change when text follows tool calls.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`Message`\] -- A list of AG-UI Message objects.

###### Parameters

**`messages`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\]

A sequence of ModelMessage objects to convert.

**`ag_ui_version`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `DEFAULT_AG_UI_VERSION`

AG-UI protocol version controlling `ThinkingPart` emission.

**`preserve_file_data`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether to include `FilePart` and `UploadedFile` as `ActivityMessage`.

### DEFAULT\_AG\_UI\_VERSION

The default AG-UI version, auto-detected from the installed `ag-ui-protocol` package.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `detect_ag_ui_version()`

AG-UI protocol integration for Pydantic AI agents.

### AGUIApp

**Bases:** `Generic[AgentDepsT, OutputDataT]`, `Starlette`

ASGI application for running Pydantic AI agents with AG-UI protocol support.

#### Methods

##### \_\_init\_\_

```python
def __init__(
    agent: AbstractAgent[AgentDepsT, OutputDataT],
    ag_ui_version: str = DEFAULT_AG_UI_VERSION,
    preserve_file_data: bool = False,
    output_type: OutputSpec[Any] | None = None,
    message_history: Sequence[ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: Model | KnownModelName | str | None = None,
    deps: AgentDepsT = None,
    model_settings: ModelSettings | None = None,
    usage_limits: UsageLimits | None = None,
    usage: RunUsage | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AbstractBuiltinTool] | None = None,
    on_complete: OnCompleteFunc[Any] | None = None,
    debug: bool = False,
    routes: Sequence[BaseRoute] | None = None,
    middleware: Sequence[Middleware] | None = None,
    exception_handlers: Mapping[Any, ExceptionHandler] | None = None,
    on_startup: Sequence[Callable[[], Any]] | None = None,
    on_shutdown: Sequence[Callable[[], Any]] | None = None,
    lifespan: Lifespan[Self] | None = None,
) -> None
```

An ASGI application that handles every request by running the agent and streaming the response.

Note that the `deps` will be the same for each request, with the exception of the frontend state that's injected into the `state` field of a `deps` object that implements the [`StateHandler`](/docs/ai/api/ui/base/#pydantic_ai.ui.StateHandler) protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use `AGUIAdapter.run_stream()` or `AGUIAdapter.dispatch_request()` instead.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

###### Parameters

**`agent`** : `AbstractAgent`\[`AgentDepsT`, `OutputDataT`\]

The agent to run.

**`ag_ui_version`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) _Default:_ `DEFAULT_AG_UI_VERSION`

AG-UI protocol version controlling thinking/reasoning event format.

**`preserve_file_data`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether to preserve agent-generated files and uploaded files in AG-UI message conversion. See [`AGUIAdapter.preserve_file_data`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.AGUIAdapter.preserve_file_data).

**`output_type`** : `OutputSpec`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : `Model` | `KnownModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : [`UsageLimits`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.UsageLimits) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`AbstractBuiltinTool`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools for this run.

**`on_complete`** : `OnCompleteFunc`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can access `all_messages()` and other result data.

**`debug`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Boolean indicating if debug tracebacks should be returned on errors.

**`routes`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`BaseRoute`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A list of routes to serve incoming HTTP and WebSocket requests.

**`middleware`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`Middleware`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. `ServerErrorMiddleware` is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. `ExceptionMiddleware` is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints.

**`exception_handlers`** : [`Mapping`](https://docs.python.org/3/library/typing.html#typing.Mapping)\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any), `ExceptionHandler`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form `handler(request, exc) -> response` and may be either standard functions, or async functions.

**`on_startup`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[\], [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions.

**`on_shutdown`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[\], [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions.

**`lifespan`** : `Lifespan`\[[`Self`](https://docs.python.org/3/library/typing.html#typing.Self)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the `on_startup` and `on_shutdown` handlers. Use one or the other, not both.

---

# [pydantic_ai.ui](https://pydantic.dev/docs/ai/api/ui/base/)

# pydantic\_ai.ui

### MessagesBuilder

Helper class to build Pydantic AI messages from request/response parts.

#### Methods

##### add

```python
def add(part: ModelRequestPart | ModelResponsePart) -> None
```

Add a new part, creating a new request or response message if necessary.

###### Returns

[`None`](https://docs.python.org/3/library/constants.html#None)

### StateHandler

**Bases:** [`Protocol`](https://docs.python.org/3/library/typing.html#typing.Protocol)

Protocol for state handlers in agent runs. Requires the class to be a dataclass with a `state` field.

#### Attributes

##### state

Get the current state of the agent run.

**Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)

### UIEventStream

**Bases:** `ABC`, `Generic[RunInputT, EventT, AgentDepsT, OutputDataT]`

Base class for UI event stream transformers.

This class is responsible for transforming Pydantic AI events into protocol-specific events.

#### Attributes

##### accept

The `Accept` header value of the request, used to determine how to encode the protocol-specific events for the streaming response.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### message\_id

The message ID to use for the next event.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) **Default:** `field(default_factory=(lambda: str(uuid4())))`

##### response\_headers

Response headers to return to the frontend.

**Type:** [`Mapping`](https://docs.python.org/3/library/typing.html#typing.Mapping)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### content\_type

Get the content type for the event stream, compatible with the `Accept` header value.

By default, this returns the Server-Sent Events content type (`text/event-stream`). If a subclass supports other types as well, it should consider `self.accept` in [`encode_event()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.encode_event) and return the resulting content type.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

#### Methods

##### new\_message\_id

```python
def new_message_id() -> str
```

Generate and store a new message ID.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### encode\_event

`@abstractmethod`

```python
def encode_event(event: EventT) -> str
```

Encode a protocol-specific event as a string.

###### Returns

[`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### encode\_stream

`@async`

```python
def encode_stream(stream: AsyncIterator[EventT]) -> AsyncIterator[str]
```

Encode a stream of protocol-specific events as strings according to the `Accept` header value.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

##### streaming\_response

```python
def streaming_response(stream: AsyncIterator[EventT]) -> StreamingResponse
```

Generate a streaming response from a stream of protocol-specific events.

###### Returns

`StreamingResponse`

##### transform\_stream

`@async`

```python
def transform_stream(
    stream: AsyncIterator[NativeEvent],
    on_complete: OnCompleteFunc[EventT] | None = None,
) -> AsyncIterator[EventT]
```

Transform a stream of Pydantic AI events into protocol-specific events.

This method dispatches to specific hooks and `handle_*` methods that subclasses can override:

-   [`before_stream()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.before_stream)
-   [`after_stream()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.after_stream)
-   [`on_error()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.on_error)
-   [`before_request()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.before_request)
-   [`after_request()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.after_request)
-   [`before_response()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.before_response)
-   [`after_response()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.after_response)
-   [`handle_event()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_event)

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`stream`** : [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`NativeEvent`\]

The stream of Pydantic AI events to transform.

**`on_complete`** : `OnCompleteFunc`\[`EventT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can optionally yield additional protocol-specific events.

##### handle\_event

`@async`

```python
def handle_event(event: NativeEvent) -> AsyncIterator[EventT]
```

Transform a Pydantic AI event into one or more protocol-specific events.

This method dispatches to specific `handle_*` methods based on event type:

-   [`PartStartEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.PartStartEvent) -> [`handle_part_start()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_part_start)
-   [`PartDeltaEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.PartDeltaEvent) -> `handle_part_delta`
-   [`PartEndEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.PartEndEvent) -> `handle_part_end`
-   [`FinalResultEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FinalResultEvent) -> `handle_final_result`
-   [`FunctionToolCallEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FunctionToolCallEvent) -> `handle_function_tool_call`
-   [`FunctionToolResultEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FunctionToolResultEvent) -> `handle_function_tool_result`
-   [`OutputToolCallEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.OutputToolCallEvent) -> `handle_output_tool_call`
-   [`OutputToolResultEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.OutputToolResultEvent) -> `handle_output_tool_result`
-   [`AgentRunResultEvent`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResultEvent) -> `handle_run_result`

Subclasses are encouraged to override the individual `handle_*` methods rather than this one. If you need specific behavior for all events, make sure you call the super method.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

##### handle\_part\_start

`@async`

```python
def handle_part_start(event: PartStartEvent) -> AsyncIterator[EventT]
```

Handle a `PartStartEvent`.

This method dispatches to specific `handle_*` methods based on part type:

-   [`TextPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.TextPart) -> [`handle_text_start()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_text_start)
-   [`ThinkingPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPart) -> [`handle_thinking_start()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_thinking_start)
-   [`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart) -> [`handle_tool_call_start()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_tool_call_start)
-   [`BuiltinToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolCallPart) -> [`handle_builtin_tool_call_start()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_builtin_tool_call_start)
-   [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart) -> [`handle_builtin_tool_return()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_builtin_tool_return)
-   [`FilePart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FilePart) -> [`handle_file()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_file)
-   [`CompactionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.CompactionPart) -> [`handle_compaction()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_compaction)

Subclasses are encouraged to override the individual `handle_*` methods rather than this one. If you need specific behavior for all part start events, make sure you call the super method.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`event`** : [`PartStartEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.PartStartEvent)

The part start event.

##### handle\_part\_delta

`@async`

```python
def handle_part_delta(event: PartDeltaEvent) -> AsyncIterator[EventT]
```

Handle a PartDeltaEvent.

This method dispatches to specific `handle_*_delta` methods based on part delta type:

-   [`TextPartDelta`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.TextPartDelta) -> [`handle_text_delta()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_text_delta)
-   [`ThinkingPartDelta`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPartDelta) -> [`handle_thinking_delta()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_thinking_delta)
-   [`ToolCallPartDelta`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPartDelta) -> [`handle_tool_call_delta()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_tool_call_delta)

Subclasses are encouraged to override the individual `handle_*_delta` methods rather than this one. If you need specific behavior for all part delta events, make sure you call the super method.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`event`** : [`PartDeltaEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.PartDeltaEvent)

The PartDeltaEvent.

##### handle\_part\_end

`@async`

```python
def handle_part_end(event: PartEndEvent) -> AsyncIterator[EventT]
```

Handle a `PartEndEvent`.

This method dispatches to specific `handle_*_end` methods based on part type:

-   [`TextPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.TextPart) -> [`handle_text_end()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_text_end)
-   [`ThinkingPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPart) -> [`handle_thinking_end()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_thinking_end)
-   [`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart) -> [`handle_tool_call_end()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_tool_call_end)
-   [`BuiltinToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolCallPart) -> [`handle_builtin_tool_call_end()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream.handle_builtin_tool_call_end)

Subclasses are encouraged to override the individual `handle_*_end` methods rather than this one. If you need specific behavior for all part end events, make sure you call the super method.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`event`** : [`PartEndEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.PartEndEvent)

The part end event.

##### before\_stream

`@async`

```python
def before_stream() -> AsyncIterator[EventT]
```

Yield events before agent streaming starts.

This hook is called before any agent events are processed. Override this to inject custom events at the start of the stream.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

##### after\_stream

`@async`

```python
def after_stream() -> AsyncIterator[EventT]
```

Yield events after agent streaming completes.

This hook is called after all agent events have been processed. Override this to inject custom events at the end of the stream.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

##### on\_error

`@async`

```python
def on_error(error: Exception) -> AsyncIterator[EventT]
```

Handle errors that occur during streaming.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`error`** : [`Exception`](https://docs.python.org/3/library/exceptions.html#Exception)

The error that occurred during streaming.

##### before\_request

`@async`

```python
def before_request() -> AsyncIterator[EventT]
```

Yield events before a model request is processed.

Override this to inject custom events at the start of the request.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

##### after\_request

`@async`

```python
def after_request() -> AsyncIterator[EventT]
```

Yield events after a model request is processed.

Override this to inject custom events at the end of the request.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

##### before\_response

`@async`

```python
def before_response() -> AsyncIterator[EventT]
```

Yield events before a model response is processed.

Override this to inject custom events at the start of the response.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

##### after\_response

`@async`

```python
def after_response() -> AsyncIterator[EventT]
```

Yield events after a model response is processed.

Override this to inject custom events at the end of the response.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

##### handle\_text\_start

`@async`

```python
def handle_text_start(
    part: TextPart,
    follows_text: bool = False,
) -> AsyncIterator[EventT]
```

Handle the start of a `TextPart`.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`part`** : [`TextPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.TextPart)

The text part.

**`follows_text`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether the part is directly preceded by another text part. In this case, you may want to yield a "text-delta" event instead of a "text-start" event.

##### handle\_text\_delta

`@async`

```python
def handle_text_delta(delta: TextPartDelta) -> AsyncIterator[EventT]
```

Handle a `TextPartDelta`.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`delta`** : [`TextPartDelta`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.TextPartDelta)

The text part delta.

##### handle\_text\_end

`@async`

```python
def handle_text_end(
    part: TextPart,
    followed_by_text: bool = False,
) -> AsyncIterator[EventT]
```

Handle the end of a `TextPart`.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`part`** : [`TextPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.TextPart)

The text part.

**`followed_by_text`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether the part is directly followed by another text part. In this case, you may not want to yield a "text-end" event yet.

##### handle\_thinking\_start

`@async`

```python
def handle_thinking_start(
    part: ThinkingPart,
    follows_thinking: bool = False,
) -> AsyncIterator[EventT]
```

Handle the start of a `ThinkingPart`.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`part`** : [`ThinkingPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPart)

The thinking part.

**`follows_thinking`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether the part is directly preceded by another thinking part. In this case, you may want to yield a "thinking-delta" event instead of a "thinking-start" event.

##### handle\_thinking\_delta

`@async`

```python
def handle_thinking_delta(delta: ThinkingPartDelta) -> AsyncIterator[EventT]
```

Handle a `ThinkingPartDelta`.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`delta`** : [`ThinkingPartDelta`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPartDelta)

The thinking part delta.

##### handle\_thinking\_end

`@async`

```python
def handle_thinking_end(
    part: ThinkingPart,
    followed_by_thinking: bool = False,
) -> AsyncIterator[EventT]
```

Handle the end of a `ThinkingPart`.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`part`** : [`ThinkingPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ThinkingPart)

The thinking part.

**`followed_by_thinking`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `False`

Whether the part is directly followed by another thinking part. In this case, you may not want to yield a "thinking-end" event yet.

##### handle\_tool\_call\_start

`@async`

```python
def handle_tool_call_start(part: ToolCallPart) -> AsyncIterator[EventT]
```

Handle the start of a `ToolCallPart`.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`part`** : [`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart)

The tool call part.

##### handle\_tool\_call\_delta

`@async`

```python
def handle_tool_call_delta(delta: ToolCallPartDelta) -> AsyncIterator[EventT]
```

Handle a `ToolCallPartDelta`.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`delta`** : [`ToolCallPartDelta`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPartDelta)

The tool call part delta.

##### handle\_tool\_call\_end

`@async`

```python
def handle_tool_call_end(part: ToolCallPart) -> AsyncIterator[EventT]
```

Handle the end of a `ToolCallPart`.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`part`** : [`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart)

The tool call part.

##### handle\_builtin\_tool\_call\_start

`@async`

```python
def handle_builtin_tool_call_start(part: BuiltinToolCallPart) -> AsyncIterator[EventT]
```

Handle a `BuiltinToolCallPart` at start.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`part`** : [`BuiltinToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolCallPart)

The builtin tool call part.

##### handle\_builtin\_tool\_call\_end

`@async`

```python
def handle_builtin_tool_call_end(part: BuiltinToolCallPart) -> AsyncIterator[EventT]
```

Handle the end of a `BuiltinToolCallPart`.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`part`** : [`BuiltinToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolCallPart)

The builtin tool call part.

##### handle\_builtin\_tool\_return

`@async`

```python
def handle_builtin_tool_return(part: BuiltinToolReturnPart) -> AsyncIterator[EventT]
```

Handle a `BuiltinToolReturnPart`.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`part`** : [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart)

The builtin tool return part.

##### handle\_file

`@async`

```python
def handle_file(part: FilePart) -> AsyncIterator[EventT]
```

Handle a `FilePart`.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`part`** : [`FilePart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FilePart)

The file part.

##### handle\_compaction

`@async`

```python
def handle_compaction(part: CompactionPart) -> AsyncIterator[EventT]
```

Handle a `CompactionPart`.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`part`** : [`CompactionPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.CompactionPart)

The compaction part.

##### handle\_final\_result

`@async`

```python
def handle_final_result(event: FinalResultEvent) -> AsyncIterator[EventT]
```

Handle a `FinalResultEvent`.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`event`** : [`FinalResultEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FinalResultEvent)

The final result event.

##### handle\_function\_tool\_call

`@async`

```python
def handle_function_tool_call(event: FunctionToolCallEvent) -> AsyncIterator[EventT]
```

Handle a `FunctionToolCallEvent`.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`event`** : [`FunctionToolCallEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FunctionToolCallEvent)

The function tool call event.

##### handle\_function\_tool\_result

`@async`

```python
def handle_function_tool_result(event: FunctionToolResultEvent) -> AsyncIterator[EventT]
```

Handle a `FunctionToolResultEvent`.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`event`** : [`FunctionToolResultEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FunctionToolResultEvent)

The function tool result event.

##### handle\_output\_tool\_call

`@async`

```python
def handle_output_tool_call(event: OutputToolCallEvent) -> AsyncIterator[EventT]
```

Handle an `OutputToolCallEvent` (the model's "submit final answer" call).

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`event`** : [`OutputToolCallEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.OutputToolCallEvent)

The output tool call event.

##### handle\_output\_tool\_result

`@async`

```python
def handle_output_tool_result(event: OutputToolResultEvent) -> AsyncIterator[EventT]
```

Handle an `OutputToolResultEvent` (the result of an output tool call).

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`event`** : [`OutputToolResultEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.OutputToolResultEvent)

The output tool result event.

##### handle\_run\_result

`@async`

```python
def handle_run_result(event: AgentRunResultEvent) -> AsyncIterator[EventT]
```

Handle an `AgentRunResultEvent`.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`event`** : [`AgentRunResultEvent`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResultEvent)

The agent run result event.

### StateDeps

**Bases:** `Generic[StateT]`

Dependency type that holds state.

This class is used to manage the state of an agent run. It allows setting the state of the agent run with a specific type of state model, which must be a subclass of `BaseModel`.

The state is set using the `state` setter by the `Adapter` when the run starts.

Implements the `StateHandler` protocol.

### UIAdapter

**Bases:** `ABC`, `Generic[RunInputT, MessageT, EventT, AgentDepsT, OutputDataT]`

Base class for UI adapters.

This class is responsible for transforming agent run input received from the frontend into arguments for `Agent.run_stream_events()`, running the agent, and then transforming Pydantic AI events into protocol-specific events.

The event stream transformation is handled by a protocol-specific [`UIEventStream`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream) subclass.

#### Attributes

##### agent

The Pydantic AI agent to run.

**Type:** `AbstractAgent`\[`AgentDepsT`, `OutputDataT`\]

##### run\_input

The protocol-specific run input object.

**Type:** `RunInputT`

##### accept

The `Accept` header value of the request, used to determine how to encode the protocol-specific events for the streaming response.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### manage\_system\_prompt

Who owns the system prompt.

Only affects `system_prompt` -- [`instructions`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.instructions) are always injected by the agent on every request regardless of this setting.

`'server'` (default): the agent's configured `system_prompt` is authoritative. Any `SystemPromptPart` sent by the frontend is stripped with a warning (since a malicious client could otherwise inject arbitrary instructions via crafted API requests), and the agent's own system prompt is reinjected at the head of the first request via the [`ReinjectSystemPrompt`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ReinjectSystemPrompt) capability.

`'client'`: the frontend owns the system prompt. Frontend `SystemPromptPart`s are preserved as-is, and the agent's configured `system_prompt` is not injected -- the caller is fully responsible for sending it on every turn if desired. To opt into the same fallback-to-configured behavior as server mode, add the [`ReinjectSystemPrompt`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ReinjectSystemPrompt) capability to your agent.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['server', 'client'\] **Default:** `'server'`

##### allowed\_file\_url\_schemes

URL schemes that are allowed for [`FileUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FileUrl) parts ([`ImageUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ImageUrl), [`DocumentUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.DocumentUrl), [`VideoUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.VideoUrl), [`AudioUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.AudioUrl)) in client-submitted messages.

Defaults to `{'http', 'https'}`. Parts whose URL scheme is not in this set are dropped with a warning before the messages are passed to the agent.

Non-HTTP schemes like `s3://` (Bedrock) or `gs://` (Vertex AI) cause the model provider to fetch the object using the server-side IAM role or service account, so a client that can supply arbitrary URLs can read anything that identity can reach. HTTPS URLs are safe to forward because the provider fetches them with its own public credentials, and the library's own `download_item` path applies SSRF protection when it has to download them itself.

For uploads initiated in the browser, prefer pre-signed `https://` URLs over cloud-storage schemes. To opt into a cloud-storage scheme after auditing your frontend, add it to this set, e.g. `frozenset({'http', 'https', 's3'})`.

**Type:** [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] **Default:** `frozenset({'http', 'https'})`

##### messages

Pydantic AI messages from the protocol-specific run input.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\]

##### toolset

Toolset representing frontend tools from the protocol-specific run input.

**Type:** [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### state

Frontend state from the protocol-specific run input.

**Type:** [`dict`](https://docs.python.org/3/reference/expressions.html#dict)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str), [`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None)

##### deferred\_tool\_results

Deferred tool results extracted from the request, used for tool approval workflows.

**Type:** [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### conversation\_id

Conversation ID extracted from the protocol-specific run input.

Used to correlate multiple agent runs that share message history. Returned as the `gen_ai.conversation.id` OpenTelemetry span attribute on each run.

Subclasses for protocols that carry a conversation/thread/chat ID should override this (e.g. AG-UI's `RunAgentInput.threadId`, Vercel AI's top-level chat `id`).

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

#### Methods

##### from\_request

`@async`

`@classmethod`

```python
def from_request(
    cls,
    request: Request,
    agent: AbstractAgent[AgentDepsT, OutputDataT],
    manage_system_prompt: Literal['server', 'client'] = 'server',
    allowed_file_url_schemes: frozenset[str] = frozenset({'http', 'https'}),
    kwargs: Any = {},
) -> Self
```

Create an adapter from a request.

Extra keyword arguments are forwarded to the adapter constructor, allowing subclasses to accept additional adapter-specific parameters.

###### Returns

[`Self`](https://docs.python.org/3/library/typing.html#typing.Self)

##### build\_run\_input

`@abstractmethod`

`@classmethod`

```python
def build_run_input(cls, body: bytes) -> RunInputT
```

Build a protocol-specific run input object from the request body.

###### Returns

`RunInputT`

##### load\_messages

`@abstractmethod`

`@classmethod`

```python
def load_messages(cls, messages: Sequence[MessageT]) -> list[ModelMessage]
```

Transform protocol-specific messages into Pydantic AI messages.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\]

##### dump\_messages

`@classmethod`

```python
def dump_messages(cls, messages: Sequence[ModelMessage]) -> list[MessageT]
```

Transform Pydantic AI messages into protocol-specific messages.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`MessageT`\]

##### build\_event\_stream

`@abstractmethod`

```python
def build_event_stream() -> UIEventStream[RunInputT, EventT, AgentDepsT, OutputDataT]
```

Build a protocol-specific event stream transformer.

###### Returns

`UIEventStream`\[`RunInputT`, `EventT`, `AgentDepsT`, `OutputDataT`\]

##### sanitize\_messages

```python
def sanitize_messages(
    messages: Sequence[ModelMessage],
    deferred_tool_results: DeferredToolResults | None = None,
) -> list[ModelMessage]
```

Strip parts of client-submitted messages that aren't trusted from the client.

Called on the messages produced from the protocol-specific run input before they're passed to the agent. Caller-supplied `message_history` is not passed through this method -- it is trusted as coming from server-side persistence.

Currently strips:

-   [`SystemPromptPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.SystemPromptPart)s when [`manage_system_prompt`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.manage_system_prompt) is `'server'`. The agent's configured `system_prompt` is reinjected by [`ReinjectSystemPrompt`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ReinjectSystemPrompt) on the next model request. If stripping leaves a `ModelRequest` with no parts, the request is dropped from history entirely.
-   [`FileUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FileUrl) parts whose URL scheme is not in [`allowed_file_url_schemes`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.allowed_file_url_schemes). Non-HTTP schemes like `s3://` or `gs://` cause the model provider to fetch the object using the server-side IAM role, so they should only be accepted from trusted frontends.
-   [`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart) and [`BuiltinToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolCallPart) entries at the end of the history that don't have a matching entry in `deferred_tool_results`. Tool calls are produced by the model on the server side, so an unresolved tool call at the end of client-supplied history doesn't correspond to a paused agent run and shouldn't be executed. Tool calls that correspond to a resolution in `deferred_tool_results` are preserved so that human-in-the-loop resumption continues to work. If stripping leaves the final response with no parts, the response is dropped from history entirely.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\]

##### transform\_stream

```python
def transform_stream(
    stream: AsyncIterator[NativeEvent],
    on_complete: OnCompleteFunc[EventT] | None = None,
) -> AsyncIterator[EventT]
```

Transform a stream of Pydantic AI events into protocol-specific events.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`stream`** : [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`NativeEvent`\]

The stream of Pydantic AI events to transform.

**`on_complete`** : `OnCompleteFunc`\[`EventT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can optionally yield additional protocol-specific events.

##### encode\_stream

```python
def encode_stream(stream: AsyncIterator[EventT]) -> AsyncIterator[str]
```

Encode a stream of protocol-specific events as strings according to the `Accept` header value.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\]

###### Parameters

**`stream`** : [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

The stream of protocol-specific events to encode.

##### streaming\_response

```python
def streaming_response(stream: AsyncIterator[EventT]) -> StreamingResponse
```

Generate a streaming response from a stream of protocol-specific events.

###### Returns

`StreamingResponse`

###### Parameters

**`stream`** : [`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

The stream of protocol-specific events to encode.

##### run\_stream\_native

```python
def run_stream_native(
    output_type: OutputSpec[Any] | None = None,
    message_history: Sequence[ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: Model | KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: ModelSettings | None = None,
    usage_limits: UsageLimits | None = None,
    usage: RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AbstractBuiltinTool] | None = None,
) -> AsyncIterator[NativeEvent]
```

Run the agent with the protocol-specific run input and stream Pydantic AI events.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`NativeEvent`\]

###### Parameters

**`output_type`** : `OutputSpec`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : `Model` | `KnownModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : [`UsageLimits`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.UsageLimits) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`AbstractBuiltinTool`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools to use for this run.

##### run\_stream

```python
def run_stream(
    output_type: OutputSpec[Any] | None = None,
    message_history: Sequence[ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: Model | KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[AgentDepsT] = None,
    deps: AgentDepsT = None,
    model_settings: ModelSettings | None = None,
    usage_limits: UsageLimits | None = None,
    usage: RunUsage | None = None,
    metadata: AgentMetadata[AgentDepsT] | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None,
    builtin_tools: Sequence[AbstractBuiltinTool] | None = None,
    on_complete: OnCompleteFunc[EventT] | None = None,
) -> AsyncIterator[EventT]
```

Run the agent with the protocol-specific run input and stream protocol-specific events.

###### Returns

[`AsyncIterator`](https://docs.python.org/3/library/typing.html#typing.AsyncIterator)\[`EventT`\]

###### Parameters

**`output_type`** : `OutputSpec`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : `Model` | `KnownModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `_instructions.AgentInstructions`\[`AgentDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `AgentDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : [`UsageLimits`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.UsageLimits) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`AgentDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`AgentDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`AbstractBuiltinTool`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools to use for this run.

**`on_complete`** : `OnCompleteFunc`\[`EventT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can optionally yield additional protocol-specific events.

##### dispatch\_request

`@async`

`@classmethod`

```python
def dispatch_request(
    cls,
    request: Request,
    agent: AbstractAgent[DispatchDepsT, DispatchOutputDataT],
    message_history: Sequence[ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: Model | KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[DispatchDepsT] = None,
    deps: DispatchDepsT = None,
    output_type: OutputSpec[Any] | None = None,
    model_settings: ModelSettings | None = None,
    usage_limits: UsageLimits | None = None,
    usage: RunUsage | None = None,
    metadata: AgentMetadata[DispatchDepsT] | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[DispatchDepsT]] | None = None,
    builtin_tools: Sequence[AbstractBuiltinTool] | None = None,
    on_complete: OnCompleteFunc[EventT] | None = None,
    manage_system_prompt: Literal['server', 'client'] = 'server',
    allowed_file_url_schemes: frozenset[str] = frozenset({'http', 'https'}),
    kwargs: Any = {},
) -> Response
```

Handle a protocol-specific HTTP request by running the agent and returning a streaming response of protocol-specific events.

Extra keyword arguments are forwarded to [`from_request`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.from_request), allowing subclasses to accept additional adapter-specific parameters.

###### Returns

`Response` -- A streaming Starlette response with protocol-specific events encoded per the request's `Accept` header value.

###### Parameters

**`request`** : `Request`

The incoming Starlette/FastAPI request.

**`agent`** : `AbstractAgent`\[`DispatchDepsT`, `DispatchOutputDataT`\]

The agent to run.

**`output_type`** : `OutputSpec`\[[`Any`](https://docs.python.org/3/library/typing.html#typing.Any)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type.

**`message_history`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

History of the conversation so far.

**`deferred_tool_results`** : [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional results for deferred tool calls in the message history.

**`conversation_id`** : [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

ID of the conversation this run belongs to. Pass `'new'` to start a fresh conversation, ignoring any `conversation_id` already on `message_history`. If omitted, falls back to the most recent `conversation_id` on `message_history` or a freshly generated UUID7.

**`model`** : `Model` | `KnownModelName` | [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional model to use for this run, required if `model` was not set when creating the agent.

**`instructions`** : `_instructions.AgentInstructions`\[`DispatchDepsT`\] _Default:_ `None`

Optional additional instructions to use for this run.

**`deps`** : `DispatchDepsT` _Default:_ `None`

Optional dependencies to use for this run.

**`model_settings`** : [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional settings to use for this model's request.

**`usage_limits`** : [`UsageLimits`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.UsageLimits) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional limits on model request count or token usage.

**`usage`** : [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional usage to start with, useful for resuming a conversation or agents used in tools.

**`metadata`** : `AgentMetadata`\[`DispatchDepsT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext); merged with the agent's configured metadata.

**`infer_name`** : [`bool`](https://docs.python.org/3/library/functions.html#bool) _Default:_ `True`

Whether to try to infer the agent name from the call frame if it's not set.

**`toolsets`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)\[`DispatchDepsT`\]\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional toolsets for this run.

**`builtin_tools`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[`AbstractBuiltinTool`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional additional builtin tools to use for this run.

**`on_complete`** : `OnCompleteFunc`\[`EventT`\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can optionally yield additional protocol-specific events.

**`manage_system_prompt`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['server', 'client'\] _Default:_ `'server'`

Who owns the system prompt. See [`UIAdapter.manage_system_prompt`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.manage_system_prompt).

**`allowed_file_url_schemes`** : [`frozenset`](https://docs.python.org/3/library/stdtypes.html#frozenset)\[[`str`](https://docs.python.org/3/library/stdtypes.html#str)\] _Default:_ `frozenset({'http', 'https'})`

URL schemes allowed for file URL parts from the client. See [`UIAdapter.allowed_file_url_schemes`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.allowed_file_url_schemes).

**`**kwargs`** : [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) _Default:_ `{}`

Additional keyword arguments forwarded to [`from_request`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.from_request).

### SSE\_CONTENT\_TYPE

Content type header value for Server-Sent Events (SSE).

**Default:** `'text/event-stream'`

### NativeEvent

Type alias for the native event type, which is either an `AgentStreamEvent` or an `AgentRunResultEvent`.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `AgentStreamEvent | AgentRunResultEvent[Any]`

### OnCompleteFunc

Callback function type that receives the `AgentRunResult` of the completed run. Can be sync, async, or an async generator of protocol-specific events.

**Type:** [`TypeAlias`](https://docs.python.org/3/library/typing.html#typing.TypeAlias) **Default:** `Callable[[AgentRunResult[Any]], None] | Callable[[AgentRunResult[Any]], Awaitable[None]] | Callable[[AgentRunResult[Any]], AsyncIterator[EventT]]`

---

# [pydantic_ai.ui.vercel_ai](https://pydantic.dev/docs/ai/api/ui/vercel_ai/)

# pydantic\_ai.ui.vercel\_ai

Vercel AI protocol adapter for Pydantic AI agents.

This module provides classes for integrating Pydantic AI agents with the Vercel AI protocol, enabling streaming event-based communication for interactive AI applications.

Converted to Python from: [https://github.com/vercel/ai/blob/ai%405.0.34/packages/ai/src/ui/ui-messages.ts](https://github.com/vercel/ai/blob/ai%405.0.34/packages/ai/src/ui/ui-messages.ts)

### VercelAIEventStream

**Bases:** `UIEventStream[RequestData, BaseChunk, AgentDepsT, OutputDataT]`

UI event stream transformer for the Vercel AI protocol.

#### Attributes

##### sdk\_version

Vercel AI SDK version to target. Setting to 6 enables tool approval streaming.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\[5, 6\] **Default:** `5`

##### server\_message\_id

Optional server-generated message ID to include in the `StartChunk`.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

### VercelAIAdapter

**Bases:** `UIAdapter[RequestData, UIMessage, BaseChunk, AgentDepsT, OutputDataT]`

UI adapter for the Vercel AI protocol.

#### Attributes

##### sdk\_version

Vercel AI SDK version to target. Default is 5 for backwards compatibility.

Setting `sdk_version=6` enables tool approval streaming for human-in-the-loop workflows.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\[5, 6\] **Default:** `5`

##### server\_message\_id

Optional server-generated message ID to include in the `StartChunk`.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### deferred\_tool\_results

Extract deferred tool results from Vercel AI messages with approval responses.

**Type:** [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) | [`None`](https://docs.python.org/3/library/constants.html#None)

##### messages

Pydantic AI messages from the Vercel AI run input.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\]

##### conversation\_id

Conversation ID from the top-level `id` field of the Vercel AI request body (the chat ID).

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None)

#### Methods

##### build\_run\_input

`@classmethod`

```python
def build_run_input(cls, body: bytes) -> RequestData
```

Build a Vercel AI run input object from the request body.

###### Returns

`RequestData`

##### from\_request

`@async`

`@classmethod`

```python
def from_request(
    cls,
    request: Request,
    agent: AbstractAgent[AgentDepsT, OutputDataT],
    sdk_version: Literal[5, 6] = 5,
    server_message_id: str | None = None,
    manage_system_prompt: Literal['server', 'client'] = 'server',
    allowed_file_url_schemes: frozenset[str] = frozenset({'http', 'https'}),
    kwargs: Any = {},
) -> VercelAIAdapter[AgentDepsT, OutputDataT]
```

Extends [`from_request`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.from_request) with Vercel AI-specific parameters.

###### Returns

`VercelAIAdapter`\[`AgentDepsT`, `OutputDataT`\]

##### dispatch\_request

`@async`

`@classmethod`

```python
def dispatch_request(
    cls,
    request: Request,
    agent: AbstractAgent[DispatchDepsT, DispatchOutputDataT],
    sdk_version: Literal[5, 6] = 5,
    server_message_id: str | None = None,
    message_history: Sequence[ModelMessage] | None = None,
    deferred_tool_results: DeferredToolResults | None = None,
    conversation_id: str | None = None,
    model: Model | KnownModelName | str | None = None,
    instructions: _instructions.AgentInstructions[DispatchDepsT] = None,
    deps: DispatchDepsT = None,
    output_type: OutputSpec[Any] | None = None,
    model_settings: ModelSettings | None = None,
    usage_limits: UsageLimits | None = None,
    usage: RunUsage | None = None,
    metadata: AgentMetadata[DispatchDepsT] | None = None,
    infer_name: bool = True,
    toolsets: Sequence[AbstractToolset[DispatchDepsT]] | None = None,
    builtin_tools: Sequence[AbstractBuiltinTool] | None = None,
    on_complete: OnCompleteFunc[BaseChunk] | None = None,
    manage_system_prompt: Literal['server', 'client'] = 'server',
    allowed_file_url_schemes: frozenset[str] = frozenset({'http', 'https'}),
    kwargs: Any = {},
) -> Response
```

Extends [`dispatch_request`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.dispatch_request) with Vercel AI-specific parameters.

###### Returns

`Response`

##### build\_event\_stream

```python
def build_event_stream(

) -> UIEventStream[RequestData, BaseChunk, AgentDepsT, OutputDataT]
```

Build a Vercel AI event stream transformer.

###### Returns

`UIEventStream`\[`RequestData`, `BaseChunk`, `AgentDepsT`, `OutputDataT`\]

##### load\_messages

`@classmethod`

```python
def load_messages(cls, messages: Sequence[UIMessage]) -> list[ModelMessage]
```

Transform Vercel AI messages into Pydantic AI messages.

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\]

##### dump\_messages

`@classmethod`

```python
def dump_messages(
    cls,
    messages: Sequence[ModelMessage],
    generate_message_id: Callable[[ModelRequest | ModelResponse, Literal['system', 'user', 'assistant'], int], str] | None = None,
    sdk_version: Literal[5, 6] = 5,
) -> list[UIMessage]
```

Transform Pydantic AI messages into Vercel AI messages.

When `sdk_version=6`, tool calls that have no corresponding result in the message history are automatically detected as deferred and emitted with `state='approval-requested'`, so the frontend can render approve/reject buttons on reload. On v5, such tool calls are emitted with `state='input-available'` (approval states are v6-only).

###### Returns

[`list`](https://docs.python.org/3/glossary.html#term-list)\[`UIMessage`\] -- A list of UIMessage objects in Vercel AI format

###### Parameters

**`messages`** : [`Sequence`](https://docs.python.org/3/library/typing.html#typing.Sequence)\[[`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)\]

A sequence of ModelMessage objects to convert

**`generate_message_id`** : [`Callable`](https://docs.python.org/3/library/typing.html#typing.Callable)\[\[[`ModelRequest`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelRequest) | [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse), [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['system', 'user', 'assistant'\], [`int`](https://docs.python.org/3/library/functions.html#int)\], [`str`](https://docs.python.org/3/library/stdtypes.html#str)\] | [`None`](https://docs.python.org/3/library/constants.html#None) _Default:_ `None`

Optional custom function to generate message IDs. If provided, it receives the message, the role ('system', 'user', or 'assistant'), and the message index (incremented per UIMessage appended), and should return a unique string ID. If not provided, uses `provider_response_id` for responses, run\_id-based IDs for messages with run\_id, or a deterministic UUID5 fallback.

**`sdk_version`** : [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\[5, 6\] _Default:_ `5`

Vercel AI SDK version to target. Defaults to 5 for backwards compatibility. Set to 6 to emit tool approval parts for deferred tool calls.

Vercel AI request types (UI messages).

Converted to Python from: [https://github.com/vercel/ai/blob/ai%406.0.57/packages/ai/src/ui/ui-messages.ts](https://github.com/vercel/ai/blob/ai%406.0.57/packages/ai/src/ui/ui-messages.ts)

Tool approval types (`ToolApprovalRequested`, `ToolApprovalResponded`) require AI SDK v6 or later.

### BaseUIPart

**Bases:** `CamelBaseModel`, `ABC`

Abstract base class for all UI parts.

### TextUIPart

**Bases:** `BaseUIPart`

A text part of a message.

#### Attributes

##### text

The text content.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### state

The state of the text part.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['streaming', 'done'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### provider\_metadata

The provider metadata.

**Type:** `ProviderMetadata` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

### ReasoningUIPart

**Bases:** `BaseUIPart`

A reasoning part of a message.

#### Attributes

##### text

The reasoning text.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### state

The state of the reasoning part.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['streaming', 'done'\] | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### provider\_metadata

The provider metadata.

**Type:** `ProviderMetadata` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

### SourceUrlUIPart

**Bases:** `BaseUIPart`

A source part of a message.

### SourceDocumentUIPart

**Bases:** `BaseUIPart`

A document source part of a message.

### FileUIPart

**Bases:** `BaseUIPart`

A file part of a message.

#### Attributes

##### media\_type

IANA media type of the file. @see [https://www.iana.org/assignments/media-types/media-types.xhtml](https://www.iana.org/assignments/media-types/media-types.xhtml)

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### filename

Optional filename of the file.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### url

The URL of the file. It can either be a URL to a hosted file or a [Data URL](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URLs).

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### provider\_metadata

The provider metadata.

**Type:** `ProviderMetadata` | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

### StepStartUIPart

**Bases:** `BaseUIPart`

A step boundary part of a message.

### DataUIPart

**Bases:** `BaseUIPart`

Data part with dynamic type based on data name.

### ToolApprovalRequested

**Bases:** `CamelBaseModel`

Tool approval in requested state (awaiting user response).

#### Attributes

##### id

The approval request ID.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

### ToolApprovalResponded

**Bases:** `CamelBaseModel`

Tool approval in responded state (user has approved or denied).

#### Attributes

##### id

The approval request ID.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### approved

Whether the user approved the tool call.

**Type:** [`bool`](https://docs.python.org/3/library/functions.html#bool)

##### reason

Optional reason for the approval or denial.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

### ToolInputStreamingPart

**Bases:** `BaseUIPart`

Tool part in input-streaming state.

### ToolInputAvailablePart

**Bases:** `BaseUIPart`

Tool part in input-available state.

### ToolOutputAvailablePart

**Bases:** `BaseUIPart`

Tool part in output-available state.

### ToolOutputErrorPart

**Bases:** `BaseUIPart`

Tool part in output-error state.

### ToolApprovalRequestedPart

**Bases:** `BaseUIPart`

Tool part in approval-requested state (awaiting user decision).

### ToolApprovalRespondedPart

**Bases:** `BaseUIPart`

Tool part in approval-responded state (user approved/denied, execution pending).

### ToolOutputDeniedPart

**Bases:** `BaseUIPart`

Tool part in output-denied state (tool was denied, terminal state).

### DynamicToolInputStreamingPart

**Bases:** `BaseUIPart`

Dynamic tool part in input-streaming state.

### DynamicToolInputAvailablePart

**Bases:** `BaseUIPart`

Dynamic tool part in input-available state.

### DynamicToolOutputAvailablePart

**Bases:** `BaseUIPart`

Dynamic tool part in output-available state.

### DynamicToolOutputErrorPart

**Bases:** `BaseUIPart`

Dynamic tool part in output-error state.

### DynamicToolApprovalRequestedPart

**Bases:** `BaseUIPart`

Dynamic tool part in approval-requested state (awaiting user decision).

### DynamicToolApprovalRespondedPart

**Bases:** `BaseUIPart`

Dynamic tool part in approval-responded state (user approved/denied, execution pending).

### DynamicToolOutputDeniedPart

**Bases:** `BaseUIPart`

Dynamic tool part in output-denied state (tool was denied, terminal state).

### UIMessage

**Bases:** `CamelBaseModel`

A message as displayed in the UI by Vercel AI Elements.

#### Attributes

##### id

A unique identifier for the message.

**Type:** [`str`](https://docs.python.org/3/library/stdtypes.html#str)

##### role

The role of the message.

**Type:** [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal)\['system', 'user', 'assistant'\]

##### metadata

The metadata of the message.

**Type:** [`Any`](https://docs.python.org/3/library/typing.html#typing.Any) | [`None`](https://docs.python.org/3/library/constants.html#None) **Default:** `None`

##### parts

The parts of the message. Use this for rendering the message in the UI. System messages should be avoided (set the system prompt on the server instead). They can have text parts. User messages can have text parts and file parts. Assistant messages can have text, reasoning, tool invocation, and file parts.

**Type:** [`list`](https://docs.python.org/3/glossary.html#term-list)\[`UIMessagePart`\]

### SubmitMessage

**Bases:** `CamelBaseModel`

Submit message request.

### RegenerateMessage

**Bases:** `CamelBaseModel`

Ask the agent to regenerate a message.

### ProviderMetadata

Provider metadata.

**Default:** `dict[str, dict[str, JSONValue]]`

### ToolApproval

Union of tool approval states.

**Default:** `ToolApprovalRequested | ToolApprovalResponded`

### ToolUIPart

Union of all tool part types.

**Default:** `ToolInputStreamingPart | ToolInputAvailablePart | ToolOutputAvailablePart | ToolOutputErrorPart | ToolApprovalRequestedPart | ToolApprovalRespondedPart | ToolOutputDeniedPart`

### DynamicToolUIPart

Union of all dynamic tool part types.

**Default:** `DynamicToolInputStreamingPart | DynamicToolInputAvailablePart | DynamicToolOutputAvailablePart | DynamicToolOutputErrorPart | DynamicToolApprovalRequestedPart | DynamicToolApprovalRespondedPart | DynamicToolOutputDeniedPart`

### UIMessagePart

Union of all message part types.

**Default:** `TextUIPart | ReasoningUIPart | ToolUIPart | DynamicToolUIPart | SourceUrlUIPart | SourceDocumentUIPart | FileUIPart | DataUIPart | StepStartUIPart`

### RequestData

Union of all request data types.

**Default:** `Annotated[SubmitMessage | RegenerateMessage, Discriminator('trigger')]`

Vercel AI response types (SSE chunks).

Converted to Python from: [https://github.com/vercel/ai/blob/ai%406.0.57/packages/ai/src/ui-message-stream/ui-message-chunks.ts](https://github.com/vercel/ai/blob/ai%406.0.57/packages/ai/src/ui-message-stream/ui-message-chunks.ts)

Tool approval types (`ToolApprovalRequestChunk`, `ToolOutputDeniedChunk`) require AI SDK UI v6 or later.

### BaseChunk

**Bases:** `CamelBaseModel`, `ABC`

Abstract base class for response SSE events.

### TextStartChunk

**Bases:** `BaseChunk`

Text start chunk.

### TextDeltaChunk

**Bases:** `BaseChunk`

Text delta chunk.

### TextEndChunk

**Bases:** `BaseChunk`

Text end chunk.

### ReasoningStartChunk

**Bases:** `BaseChunk`

Reasoning start chunk.

### ReasoningDeltaChunk

**Bases:** `BaseChunk`

Reasoning delta chunk.

### ReasoningEndChunk

**Bases:** `BaseChunk`

Reasoning end chunk.

### ErrorChunk

**Bases:** `BaseChunk`

Error chunk.

### ToolInputStartChunk

**Bases:** `BaseChunk`

Tool input start chunk.

### ToolInputDeltaChunk

**Bases:** `BaseChunk`

Tool input delta chunk.

### ToolOutputAvailableChunk

**Bases:** `BaseChunk`

Tool output available chunk.

### ToolInputAvailableChunk

**Bases:** `BaseChunk`

Tool input available chunk.

### ToolInputErrorChunk

**Bases:** `BaseChunk`

Tool input error chunk.

### ToolOutputErrorChunk

**Bases:** `BaseChunk`

Tool output error chunk.

### ToolApprovalRequestChunk

**Bases:** `BaseChunk`

Tool approval request chunk for human-in-the-loop approval.

Requires AI SDK UI v6 or later.

### ToolOutputDeniedChunk

**Bases:** `BaseChunk`

Tool output denied chunk when user denies tool execution.

Requires AI SDK UI v6 or later.

### SourceUrlChunk

**Bases:** `BaseChunk`

Source URL chunk.

### SourceDocumentChunk

**Bases:** `BaseChunk`

Source document chunk.

### FileChunk

**Bases:** `BaseChunk`

File chunk.

### DataChunk

**Bases:** `BaseChunk`

Data chunk with dynamic type.

### StartStepChunk

**Bases:** `BaseChunk`

Start step chunk.

### FinishStepChunk

**Bases:** `BaseChunk`

Finish step chunk.

### StartChunk

**Bases:** `BaseChunk`

Start chunk.

### FinishChunk

**Bases:** `BaseChunk`

Finish chunk.

### AbortChunk

**Bases:** `BaseChunk`

Abort chunk.

### MessageMetadataChunk

**Bases:** `BaseChunk`

Message metadata chunk.

### DoneChunk

**Bases:** `BaseChunk`

Done chunk.

### ProviderMetadata

Provider metadata.

**Default:** `dict[str, dict[str, JSONValue]]`

### FinishReason

Reason why the model finished generating.

**Default:** `Literal['stop', 'length', 'content-filter', 'tool-calls', 'error', 'other'] | None`

---

# [Agents](https://pydantic.dev/docs/ai/core-concepts/agent/)

# Agents

## Introduction

Agents are Pydantic AI's primary interface for interacting with LLMs.

In some use cases a single Agent will control an entire application or component, but multiple agents can also interact to embody more complex workflows.

The [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) class has full API documentation, but conceptually you can think of an agent as a container for:

**Component**

**Description**

[Instructions](#instructions)

A set of instructions for the LLM written by the developer.

[Function tool(s)](/docs/ai/tools-toolsets/tools) and [toolsets](/docs/ai/tools-toolsets/toolsets)

Functions that the LLM may call to get information while generating a response.

[Structured output type](/docs/ai/core-concepts/output)

The structured datatype the LLM must return at the end of a run, if specified.

[Dependency type constraint](/docs/ai/core-concepts/dependencies)

Dynamic instructions functions, tools, and output functions may all use dependencies when they're run.

[LLM model](/docs/ai/models/base)

Optional default LLM model associated with the agent. Can also be specified when running the agent.

[Model Settings](#additional-configuration)

Optional default model settings to help fine tune requests. Can also be specified when running the agent.

[Capabilities](/docs/ai/core-concepts/capabilities)

Reusable bundles of tools, hooks, instructions, and model settings that extend agent behavior.

While each of these can be configured individually, [capabilities](/docs/ai/core-concepts/capabilities) let you bundle related behavior into reusable units that are easier to compose, share, and [load from configuration files](/docs/ai/core-concepts/agent-spec).

In typing terms, agents are generic in their dependency and output types, e.g., an agent which required dependencies of type `Foobar` and produced outputs of type `list[str]` would have type `Agent[Foobar, list[str]]`. In practice, you shouldn't need to care about this, it should just mean your IDE can tell you when you have the right type, and if you choose to use [static type checking](#static-type-checking) it should work well with Pydantic AI.

Here's a toy example of an agent that simulates a roulette wheel:

roulette\_wheel.py

```python
from pydantic_ai import Agent, RunContext

roulette_agent = Agent(  # (1)
  'openai:gpt-5.2',
  deps_type=int,
  output_type=bool,
  system_prompt=(
      'Use the `roulette_wheel` function to see if the '
      'customer has won based on the number they provide.'
  ),
)


@roulette_agent.tool
async def roulette_wheel(ctx: RunContext[int], square: int) -> str:  # (2)
  """check if the square is a winner"""
  return 'winner' if square == ctx.deps else 'loser'


# Run the agent
success_number = 18  # (3)
result = roulette_agent.run_sync('Put my money on square eighteen', deps=success_number)
print(result.output)  # (4)
#> True

result = roulette_agent.run_sync('I bet five is the winner', deps=success_number)
print(result.output)
#> False
```

Create an agent, which expects an integer dependency and produces a boolean output. This agent will have type `Agent[int, bool]`.

Define a tool that checks if the square is a winner. Here [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) is parameterized with the dependency type `int`; if you got the dependency type wrong you'd get a typing error.

In reality, you might want to use a random number here e.g. `random.randint(0, 36)`.

`result.output` will be a boolean indicating if the square is a winner. Pydantic performs the output validation, and it'll be typed as a `bool` since its type is derived from the `output_type` generic parameter of the agent.

Agents are designed for reuse, like FastAPI Apps

You can instantiate one agent and use it globally throughout your application, as you would a small [FastAPI](https://fastapi.tiangolo.com/reference/fastapi/#fastapi.FastAPI) app or an [APIRouter](https://fastapi.tiangolo.com/reference/apirouter/#fastapi.APIRouter), or dynamically create as many agents as you want. Both are valid and supported ways to use agents.

## Running Agents

There are five ways to run an agent:

1.  [`agent.run()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) -- an async function which returns a [`RunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) containing a completed response.
2.  [`agent.run_sync()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_sync) -- a plain, synchronous function which returns a [`RunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) containing a completed response (internally, this just calls `loop.run_until_complete(self.run())`).
3.  [`agent.run_stream()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream) -- an async context manager which returns a [`StreamedRunResult`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult), which contains methods to stream text and structured output as an async iterable. [`agent.run_stream_sync()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream_sync) is a synchronous variation that returns a [`StreamedRunResultSync`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResultSync) with synchronous versions of the same methods.
4.  [`agent.run_stream_events()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream_events) -- a function which returns an `AgentEventStream` async context manager that yields [`AgentStreamEvent`s](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.AgentStreamEvent) and a [`AgentRunResultEvent`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResultEvent) containing the final run result.
5.  [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter) -- a context manager which returns an [`AgentRun`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun), an async iterable over the nodes of the agent's underlying [`Graph`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph).

Here's a simple example demonstrating the first four:

run\_agent.py

```python
from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent

agent = Agent('openai:gpt-5.2')

result_sync = agent.run_sync('What is the capital of Italy?')
print(result_sync.output)
#> The capital of Italy is Rome.


async def main():
    result = await agent.run('What is the capital of France?')
    print(result.output)
    #> The capital of France is Paris.

    async with agent.run_stream('What is the capital of the UK?') as response:
        async for text in response.stream_text():
            print(text)
            #> The capital of
            #> The capital of the UK is
            #> The capital of the UK is London.

    events: list[AgentStreamEvent | AgentRunResultEvent] = []
    async with agent.run_stream_events('What is the capital of Mexico?') as stream:
        async for event in stream:
            events.append(event)
    print(events)
    """
    [
        PartStartEvent(index=0, part=TextPart(content='The capital of ')),
        FinalResultEvent(tool_name=None, tool_call_id=None),
        PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='Mexico is Mexico ')),
        PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='City.')),
        PartEndEvent(
            index=0, part=TextPart(content='The capital of Mexico is Mexico City.')
        ),
        AgentRunResultEvent(
            result=AgentRunResult(output='The capital of Mexico is Mexico City.')
        ),
    ]
    """
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

You can also pass messages from previous runs to continue a conversation or provide context, as described in [Messages and Chat History](/docs/ai/core-concepts/message-history).

### Streaming Events and Final Output

As shown in the example above, [`run_stream()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream) makes it easy to stream the agent's final output as it comes in. It also takes an optional `event_stream_handler` argument that you can use to gain insight into what is happening during the run before the final output is produced.

The example below shows how to stream events and text output. You can also [stream structured output](/docs/ai/core-concepts/output#streaming-structured-output).

Note

The `run_stream()` and `run_stream_sync()` methods will consider the first output that matches the [output type](/docs/ai/core-concepts/output#structured-output) (which could be text, an [output tool](/docs/ai/core-concepts/output#tool-output) call, or a [deferred](/docs/ai/tools-toolsets/deferred-tools) tool call) to be the final output of the agent run, even when the model generates (additional) tool calls after this "final" output.

These "dangling" tool calls will not be executed unless the agent's [`end_strategy`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.end_strategy) is set to `'graceful'` or `'exhaustive'`, and even then their results will not be sent back to the model as the agent run will already be considered completed. In short, if the model returns both tool calls and text, and the agent's output type is `str`, **the tool calls will not run** in streaming mode with the default setting.

If you want to always keep running the agent when it performs tool calls, and stream all events from the model's streaming response and the agent's execution of tools, use [`agent.run_stream_events()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream_events) or [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.iter) instead, as described in the following sections.

run\_stream\_event\_stream\_handler.py

```python
import asyncio
from collections.abc import AsyncIterable
from datetime import date

from pydantic_ai import (
    Agent,
    AgentStreamEvent,
    FinalResultEvent,
    FunctionToolCallEvent,
    FunctionToolResultEvent,
    PartDeltaEvent,
    PartStartEvent,
    RunContext,
    TextPartDelta,
    ThinkingPartDelta,
    ToolCallPartDelta,
)

weather_agent = Agent(
    'openai:gpt-5.2',
    system_prompt='Providing a weather forecast at the locations the user provides.',
)


@weather_agent.tool
async def weather_forecast(
    ctx: RunContext,
    location: str,
    forecast_date: date,
) -> str:
    return f'The forecast in {location} on {forecast_date} is 24°C and sunny.'


output_messages: list[str] = []

async def handle_event(event: AgentStreamEvent):
    if isinstance(event, PartStartEvent):
        output_messages.append(f'[Request] Starting part {event.index}: {event.part!r}')
    elif isinstance(event, PartDeltaEvent):
        if isinstance(event.delta, TextPartDelta):
            output_messages.append(f'[Request] Part {event.index} text delta: {event.delta.content_delta!r}')
        elif isinstance(event.delta, ThinkingPartDelta):
            output_messages.append(f'[Request] Part {event.index} thinking delta: {event.delta.content_delta!r}')
        elif isinstance(event.delta, ToolCallPartDelta):
            output_messages.append(f'[Request] Part {event.index} args delta: {event.delta.args_delta}')
    elif isinstance(event, FunctionToolCallEvent):
        output_messages.append(
            f'[Tools] The LLM calls tool={event.part.tool_name!r} with args={event.part.args} (tool_call_id={event.part.tool_call_id!r})'
        )
    elif isinstance(event, FunctionToolResultEvent):
        output_messages.append(f'[Tools] Tool call {event.tool_call_id!r} returned => {event.part.content}')
    elif isinstance(event, FinalResultEvent):
        output_messages.append(f'[Result] The model starting producing a final result (tool_name={event.tool_name})')


async def event_stream_handler(
    ctx: RunContext,
    event_stream: AsyncIterable[AgentStreamEvent],
):
    async for event in event_stream:
        await handle_event(event)

async def main():
    user_prompt = 'What will the weather be like in Paris on Tuesday?'

    async with weather_agent.run_stream(user_prompt, event_stream_handler=event_stream_handler) as run:
        async for output in run.stream_text():
            output_messages.append(f'[Output] {output}')


if __name__ == '__main__':
    asyncio.run(main())

    print(output_messages)
    """
    [
        "[Request] Starting part 0: ToolCallPart(tool_name='weather_forecast', tool_call_id='0001')",
        '[Request] Part 0 args delta: {"location":"Pa',
        '[Request] Part 0 args delta: ris","forecast_',
        '[Request] Part 0 args delta: date":"2030-01-',
        '[Request] Part 0 args delta: 01"}',
        '[Tools] The LLM calls tool=\'weather_forecast\' with args={"location":"Paris","forecast_date":"2030-01-01"} (tool_call_id=\'0001\')',
        "[Tools] Tool call '0001' returned => The forecast in Paris on 2030-01-01 is 24°C and sunny.",
        "[Request] Starting part 0: TextPart(content='It will be ')",
        '[Result] The model starting producing a final result (tool_name=None)',
        '[Output] It will be ',
        '[Output] It will be warm and sunny ',
        '[Output] It will be warm and sunny in Paris on ',
        '[Output] It will be warm and sunny in Paris on Tuesday.',
    ]
    """
```

_(This example is complete, it can be run "as is")_

### Streaming All Events

Like `agent.run_stream()`, [`agent.run()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream) takes an optional `event_stream_handler` argument that lets you stream all events from the model's streaming response and the agent's execution of tools. Unlike `run_stream()`, it always runs the agent graph to completion even if text was received ahead of tool calls that looked like it could've been the final result.

For convenience, a [`agent.run_stream_events()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream_events) method is also available as a wrapper around `run(event_stream_handler=...)`, which returns an `AgentEventStream` async context manager that yields [`AgentStreamEvent`s](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.AgentStreamEvent) and a [`AgentRunResultEvent`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResultEvent) containing the final run result.

Note

As they return raw events as they come in, the `run_stream_events()` and `run(event_stream_handler=...)` methods require you to piece together the streamed text and structured output yourself from the `PartStartEvent` and subsequent `PartDeltaEvent`s.

To get the best of both worlds, at the expense of some additional complexity, you can use [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.iter) as described in the next section, which lets you [iterate over the agent graph](#iterating-over-an-agents-graph) and [stream both events and output](#streaming-all-events-and-output) at every step.

run\_events.py

```python
import asyncio

from pydantic_ai import AgentRunResultEvent

from run_stream_event_stream_handler import handle_event, output_messages, weather_agent


async def main():
    user_prompt = 'What will the weather be like in Paris on Tuesday?'

    async with weather_agent.run_stream_events(user_prompt) as stream:
        async for event in stream:
            if isinstance(event, AgentRunResultEvent):
                output_messages.append(f'[Final Output] {event.result.output}')
            else:
                await handle_event(event)

if __name__ == '__main__':
    asyncio.run(main())

    print(output_messages)
    """
    [
        "[Request] Starting part 0: ToolCallPart(tool_name='weather_forecast', tool_call_id='0001')",
        '[Request] Part 0 args delta: {"location":"Pa',
        '[Request] Part 0 args delta: ris","forecast_',
        '[Request] Part 0 args delta: date":"2030-01-',
        '[Request] Part 0 args delta: 01"}',
        '[Tools] The LLM calls tool=\'weather_forecast\' with args={"location":"Paris","forecast_date":"2030-01-01"} (tool_call_id=\'0001\')',
        "[Tools] Tool call '0001' returned => The forecast in Paris on 2030-01-01 is 24°C and sunny.",
        "[Request] Starting part 0: TextPart(content='It will be ')",
        '[Result] The model starting producing a final result (tool_name=None)',
        "[Request] Part 0 text delta: 'warm and sunny '",
        "[Request] Part 0 text delta: 'in Paris on '",
        "[Request] Part 0 text delta: 'Tuesday.'",
        '[Final Output] It will be warm and sunny in Paris on Tuesday.',
    ]
    """
```

_(This example is complete, it can be run "as is")_

### Iterating Over an Agent's Graph

Under the hood, each `Agent` in Pydantic AI uses **pydantic-graph** to manage its execution flow. **pydantic-graph** is a generic, type-centric library for building and running finite state machines in Python. It doesn't actually depend on Pydantic AI -- you can use it standalone for workflows that have nothing to do with GenAI -- but Pydantic AI makes use of it to orchestrate the handling of model requests and model responses in an agent's run.

In many scenarios, you don't need to worry about pydantic-graph at all; calling `agent.run(...)` simply traverses the underlying graph from start to finish. However, if you need deeper insight or control -- for example to inject your own logic at specific stages -- Pydantic AI exposes the lower-level iteration process via [`Agent.iter`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter). This method returns an [`AgentRun`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun), which you can async-iterate over, or manually drive node-by-node via the [`next`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun.next) method. Once the agent's graph returns an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End), you have the final result along with a detailed history of all steps.

#### `async for` iteration

Here's an example of using `async for` with `iter` to record each node the agent executes:

agent\_iter\_async\_for.py

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')


async def main():
    nodes = []
    # Begin an AgentRun, which is an async-iterable over the nodes of the agent's graph
    async with agent.iter('What is the capital of France?') as agent_run:
        async for node in agent_run:
            # Each node represents a step in the agent's execution
            nodes.append(node)
    print(nodes)
    """
    [
        UserPromptNode(
            user_prompt='What is the capital of France?',
            instructions_functions=[],
            system_prompts=(),
            system_prompt_functions=[],
            system_prompt_dynamic_functions={},
        ),
        ModelRequestNode(
            request=ModelRequest(
                parts=[
                    UserPromptPart(
                        content='What is the capital of France?',
                        timestamp=datetime.datetime(...),
                    )
                ],
                timestamp=datetime.datetime(...),
                run_id='...',
                conversation_id='...',
            )
        ),
        CallToolsNode(
            model_response=ModelResponse(
                parts=[TextPart(content='The capital of France is Paris.')],
                usage=RequestUsage(input_tokens=56, output_tokens=7),
                model_name='gpt-5.2',
                timestamp=datetime.datetime(...),
                run_id='...',
                conversation_id='...',
            )
        ),
        End(data=FinalResult(output='The capital of France is Paris.')),
    ]
    """
    print(agent_run.result.output)
    #> The capital of France is Paris.
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

-   The `AgentRun` is an async iterator that yields each node (`BaseNode` or `End`) in the flow.
-   The run ends when an `End` node is returned.

#### Using `.next(...)` manually

You can also drive the iteration manually by passing the node you want to run next to the `AgentRun.next(...)` method. This allows you to inspect or modify the node before it executes or skip nodes based on your own logic, and to catch errors in `next()` more easily:

agent\_iter\_next.py

```python
from pydantic_ai import Agent
from pydantic_graph import End

agent = Agent('openai:gpt-5.2')


async def main():
  async with agent.iter('What is the capital of France?') as agent_run:
      node = agent_run.next_node  # (1)

      all_nodes = [node]

      # Drive the iteration manually:
      while not isinstance(node, End):  # (2)
          node = await agent_run.next(node)  # (3)
          all_nodes.append(node)  # (4)

      print(all_nodes)
      """
      [
          UserPromptNode(
              user_prompt='What is the capital of France?',
              instructions_functions=[],
              system_prompts=(),
              system_prompt_functions=[],
              system_prompt_dynamic_functions={},
          ),
          ModelRequestNode(
              request=ModelRequest(
                  parts=[
                      UserPromptPart(
                          content='What is the capital of France?',
                          timestamp=datetime.datetime(...),
                      )
                  ],
                  timestamp=datetime.datetime(...),
                  run_id='...',
                  conversation_id='...',
              )
          ),
          CallToolsNode(
              model_response=ModelResponse(
                  parts=[TextPart(content='The capital of France is Paris.')],
                  usage=RequestUsage(input_tokens=56, output_tokens=7),
                  model_name='gpt-5.2',
                  timestamp=datetime.datetime(...),
                  run_id='...',
                  conversation_id='...',
              )
          ),
          End(data=FinalResult(output='The capital of France is Paris.')),
      ]
      """
```

We start by grabbing the first node that will be run in the agent's graph.

The agent run is finished once an `End` node has been produced; instances of `End` cannot be passed to `next`.

When you call `await agent_run.next(node)`, it executes that node in the agent's graph, updates the run's history, and returns the _next_ node to run.

You could also inspect or mutate the new `node` here as needed.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

#### Accessing usage and final output

You can retrieve usage statistics (tokens, requests, etc.) at any time from the [`AgentRun`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun) object via `agent_run.usage()`. This method returns a [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) object containing the usage data.

Once the run finishes, `agent_run.result` becomes an [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) object containing the final output (and related metadata).

#### Streaming All Events and Output

Here is an example of streaming an agent run in combination with `async for` iteration:

streaming\_iter.py

```python
import asyncio
from dataclasses import dataclass
from datetime import date

from pydantic_ai import (
    Agent,
    FinalResultEvent,
    FunctionToolCallEvent,
    FunctionToolResultEvent,
    PartDeltaEvent,
    PartStartEvent,
    RunContext,
    TextPartDelta,
    ThinkingPartDelta,
    ToolCallPartDelta,
)


@dataclass
class WeatherService:
    async def get_forecast(self, location: str, forecast_date: date) -> str:
        # In real code: call weather API, DB queries, etc.
        return f'The forecast in {location} on {forecast_date} is 24°C and sunny.'

    async def get_historic_weather(self, location: str, forecast_date: date) -> str:
        # In real code: call a historical weather API or DB
        return f'The weather in {location} on {forecast_date} was 18°C and partly cloudy.'


weather_agent = Agent[WeatherService, str](
    'openai:gpt-5.2',
    deps_type=WeatherService,
    output_type=str,  # We'll produce a final answer as plain text
    system_prompt='Providing a weather forecast at the locations the user provides.',
)


@weather_agent.tool
async def weather_forecast(
    ctx: RunContext[WeatherService],
    location: str,
    forecast_date: date,
) -> str:
    if forecast_date >= date.today():
        return await ctx.deps.get_forecast(location, forecast_date)
    else:
        return await ctx.deps.get_historic_weather(location, forecast_date)


output_messages: list[str] = []


async def main():
    user_prompt = 'What will the weather be like in Paris on Tuesday?'

    # Begin a node-by-node, streaming iteration
    async with weather_agent.iter(user_prompt, deps=WeatherService()) as run:
        async for node in run:
            if Agent.is_user_prompt_node(node):
                # A user prompt node => The user has provided input
                output_messages.append(f'=== UserPromptNode: {node.user_prompt} ===')
            elif Agent.is_model_request_node(node):
                # A model request node => We can stream tokens from the model's request
                output_messages.append('=== ModelRequestNode: streaming partial request tokens ===')
                async with node.stream(run.ctx) as request_stream:
                    final_result_found = False
                    async for event in request_stream:
                        if isinstance(event, PartStartEvent):
                            output_messages.append(f'[Request] Starting part {event.index}: {event.part!r}')
                        elif isinstance(event, PartDeltaEvent):
                            if isinstance(event.delta, TextPartDelta):
                                output_messages.append(
                                    f'[Request] Part {event.index} text delta: {event.delta.content_delta!r}'
                                )
                            elif isinstance(event.delta, ThinkingPartDelta):
                                output_messages.append(
                                    f'[Request] Part {event.index} thinking delta: {event.delta.content_delta!r}'
                                )
                            elif isinstance(event.delta, ToolCallPartDelta):
                                output_messages.append(
                                    f'[Request] Part {event.index} args delta: {event.delta.args_delta}'
                                )
                        elif isinstance(event, FinalResultEvent):
                            output_messages.append(
                                f'[Result] The model started producing a final result (tool_name={event.tool_name})'
                            )
                            final_result_found = True
                            break

                    if final_result_found:
                        # Once the final result is found, we can call `AgentStream.stream_text()` to stream the text.
                        # A similar `AgentStream.stream_output()` method is available to stream structured output.
                        async for output in request_stream.stream_text():
                            output_messages.append(f'[Output] {output}')
            elif Agent.is_call_tools_node(node):
                # A handle-response node => The model returned some data, potentially calls a tool
                output_messages.append('=== CallToolsNode: streaming partial response & tool usage ===')
                async with node.stream(run.ctx) as handle_stream:
                    async for event in handle_stream:
                        if isinstance(event, FunctionToolCallEvent):
                            output_messages.append(
                                f'[Tools] The LLM calls tool={event.part.tool_name!r} with args={event.part.args} (tool_call_id={event.part.tool_call_id!r})'
                            )
                        elif isinstance(event, FunctionToolResultEvent):
                            output_messages.append(
                                f'[Tools] Tool call {event.tool_call_id!r} returned => {event.part.content}'
                            )
            elif Agent.is_end_node(node):
                # Once an End node is reached, the agent run is complete
                assert run.result is not None
                assert run.result.output == node.data.output
                output_messages.append(f'=== Final Agent Output: {run.result.output} ===')


if __name__ == '__main__':
    asyncio.run(main())

    print(output_messages)
    """
    [
        '=== UserPromptNode: What will the weather be like in Paris on Tuesday? ===',
        '=== ModelRequestNode: streaming partial request tokens ===',
        "[Request] Starting part 0: ToolCallPart(tool_name='weather_forecast', tool_call_id='0001')",
        '[Request] Part 0 args delta: {"location":"Pa',
        '[Request] Part 0 args delta: ris","forecast_',
        '[Request] Part 0 args delta: date":"2030-01-',
        '[Request] Part 0 args delta: 01"}',
        '=== CallToolsNode: streaming partial response & tool usage ===',
        '[Tools] The LLM calls tool=\'weather_forecast\' with args={"location":"Paris","forecast_date":"2030-01-01"} (tool_call_id=\'0001\')',
        "[Tools] Tool call '0001' returned => The forecast in Paris on 2030-01-01 is 24°C and sunny.",
        '=== ModelRequestNode: streaming partial request tokens ===',
        "[Request] Starting part 0: TextPart(content='It will be ')",
        '[Result] The model started producing a final result (tool_name=None)',
        '[Output] It will be ',
        '[Output] It will be warm and sunny ',
        '[Output] It will be warm and sunny in Paris on ',
        '[Output] It will be warm and sunny in Paris on Tuesday.',
        '=== CallToolsNode: streaming partial response & tool usage ===',
        '=== Final Agent Output: It will be warm and sunny in Paris on Tuesday. ===',
    ]
    """
```

_(This example is complete, it can be run "as is")_

### Additional Configuration

#### Usage Limits

Pydantic AI offers a [`UsageLimits`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.UsageLimits) structure to help you limit your usage (tokens, requests, and tool calls) on model runs.

You can apply these settings by passing the `usage_limits` argument to the `run{_sync,_stream}` functions.

Consider the following example, where we limit the number of response tokens:

```python
from pydantic_ai import Agent, UsageLimitExceeded, UsageLimits

agent = Agent('anthropic:claude-sonnet-4-6')

result_sync = agent.run_sync(
    'What is the capital of Italy? Answer with just the city.',
    usage_limits=UsageLimits(response_tokens_limit=10),
)
print(result_sync.output)
#> Rome
print(result_sync.usage())
#> RunUsage(input_tokens=62, output_tokens=1, requests=1)

try:
    result_sync = agent.run_sync(
        'What is the capital of Italy? Answer with a paragraph.',
        usage_limits=UsageLimits(response_tokens_limit=10),
    )
except UsageLimitExceeded as e:
    print(e)
    #> Exceeded the output_tokens_limit of 10 (output_tokens=32)
```

Restricting the number of requests can be useful in preventing infinite loops or excessive tool calling:

```python
from typing_extensions import TypedDict

from pydantic_ai import Agent, ModelRetry, UsageLimitExceeded, UsageLimits


class NeverOutputType(TypedDict):
  """
  Never ever coerce data to this type.
  """

  never_use_this: str


agent = Agent(
  'anthropic:claude-sonnet-4-6',
  tool_retries=3,
  output_type=NeverOutputType,
  system_prompt='Any time you get a response, call the `infinite_retry_tool` to produce another response.',
)


@agent.tool_plain(retries=5)  # (1)
def infinite_retry_tool() -> int:
  raise ModelRetry('Please try again.')


try:
  result_sync = agent.run_sync(
      'Begin infinite retry loop!', usage_limits=UsageLimits(request_limit=3)  # (2)
  )
except UsageLimitExceeded as e:
  print(e)
  #> The next request would exceed the request_limit of 3
```

This tool has the ability to retry 5 times before erroring, simulating a tool that might get stuck in a loop.

This run will error after 3 requests, preventing the infinite tool calling.

##### Capping tool calls

If you need a limit on the number of successful tool invocations within a single run, use `tool_calls_limit`:

```python
from pydantic_ai import Agent
from pydantic_ai.exceptions import UsageLimitExceeded
from pydantic_ai.usage import UsageLimits

agent = Agent('anthropic:claude-sonnet-4-6')

@agent.tool_plain
def do_work() -> str:
    return 'ok'

try:
    # Allow at most one executed tool call in this run
    agent.run_sync('Please call the tool twice', usage_limits=UsageLimits(tool_calls_limit=1))
except UsageLimitExceeded as e:
    print(e)
    #> The next tool call(s) would exceed the tool_calls_limit of 1 (tool_calls=2).
```

Note

-   Usage limits are especially relevant if you've registered many tools. Use `request_limit` to bound the number of model turns, and `tool_calls_limit` to cap the number of successful tool executions within a run.
-   The `tool_calls_limit` is checked before executing tool calls. If the model returns parallel tool calls that would exceed the limit, no tools will be executed.

#### Model (Run) Settings

Pydantic AI offers a [`settings.ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) structure to help you fine tune your requests. This structure allows you to configure common parameters that influence the model's behavior, such as `temperature`, `max_tokens`, `timeout`, and more.

There are three ways to apply these settings, with a clear precedence order:

1.  **Model-level defaults** - Set when creating a model instance via the `settings` parameter. These serve as the base defaults for that model.
2.  **Agent-level defaults** - Set during [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) initialization via the `model_settings` argument. These are merged with model defaults, with agent settings taking precedence.
3.  **Run-time overrides** - Passed to `run{_sync,_stream}` functions via the `model_settings` argument. These have the highest priority and are merged with the combined agent and model defaults.

For example, if you'd like to set the `temperature` setting to `0.0` to ensure less random behavior, you can do the following:

```py
from pydantic_ai import Agent, ModelSettings
from pydantic_ai.models.openai import OpenAIChatModel

# 1. Model-level defaults
model = OpenAIChatModel(
    'gpt-5.2',
    settings=ModelSettings(temperature=0.8, max_tokens=500)  # Base defaults
)

# 2. Agent-level defaults (overrides model defaults by merging)
agent = Agent(model, model_settings=ModelSettings(temperature=0.5))

# 3. Run-time overrides (highest priority)
result_sync = agent.run_sync(
    'What is the capital of Italy?',
    model_settings=ModelSettings(temperature=0.0)  # Final temperature: 0.0
)
print(result_sync.output)
#> The capital of Italy is Rome.
```

The final request uses `temperature=0.0` (run-time), `max_tokens=500` (from model), demonstrating how settings merge with run-time taking precedence.

##### Dynamic model settings

Both agent-level and run-level `model_settings` accept a callable that receives a [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings). The callable is invoked before each model request, so settings can vary per step. The current resolved settings so far are available via `ctx.model_settings` inside the callable.

Settings are resolved in layers, each merged on top of the previous:

1.  **Model defaults** (`model.settings`)
2.  **Agent-level** (`Agent(model_settings=...)`)
3.  **Capability-level** (e.g. from [`Thinking()`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Thinking) -- see [Capabilities](/docs/ai/core-concepts/capabilities#providing-model-settings))
4.  **Run-level** (`agent.run(model_settings=...)`)

Inside a callable, `ctx.model_settings` contains the merged result of all _previous_ layers (position-dependent). For example, an agent-level callable sees only model defaults, while a run-level callable sees model defaults + agent-level + capability-level settings. To reset a field set by a previous layer, set it explicitly (e.g. `{'temperature': None}`).

```python
from pydantic_ai import Agent, ModelSettings

agent = Agent(
    'test',
    model_settings=lambda ctx: ModelSettings(
        temperature=0.0 if ctx.run_step <= 1 else 0.7,
    ),
)
```

Model Settings Support

Model-level settings are supported by all concrete model implementations (OpenAI, Anthropic, Google, etc.). Wrapper models like [`FallbackModel`](/docs/ai/models/overview#fallback-model), [`WrapperModel`](/docs/ai/api/models/wrapper/#pydantic_ai.models.wrapper.WrapperModel), and [`InstrumentedModel`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentedModel) don't have their own settings - they use the settings of their underlying models.

#### Run metadata

Run metadata lets you tag each agent execution with contextual details (for example, a tenant ID to filter traces and logs) and read it after completion via [`AgentRun.metadata`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun), [`AgentRunResult.metadata`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult), or [`StreamedRunResult.metadata`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult). The resolved metadata is attached to the [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) during the run and, when instrumentation is enabled, added to the run span attributes for observability tools.

Configure metadata on an [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) or pass it to a run. Both accept either a static dictionary or a callable that receives the [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext). Metadata is computed (if a callable) and applied when the run starts, then recomputed after a run ends successfully, so it can include end-of-run values. Agent-level metadata and per-run metadata are merged, with per-run values overriding agent-level ones.

run\_metadata.py

```python
from dataclasses import dataclass

from pydantic_ai import Agent


@dataclass
class Deps:
    tenant: str


agent = Agent[Deps](
    'openai:gpt-5.2',
    deps_type=Deps,
    metadata=lambda ctx: {'tenant': ctx.deps.tenant},  # agent-level metadata
)

result = agent.run_sync(
    'What is the capital of France?',
    deps=Deps(tenant='tenant-123'),
    metadata=lambda ctx: {'num_requests': ctx.usage.requests},  # per-run metadata
)
print(result.output)
#> The capital of France is Paris.
print(result.metadata)
#> {'tenant': 'tenant-123', 'num_requests': 1}
```

#### Concurrency Limiting

You can limit the number of concurrent agent runs using the `max_concurrency` parameter. This is useful when you want to prevent overwhelming external resources or enforce rate limits when running many agent instances in parallel.

agent\_concurrency.py

```python
import asyncio

from pydantic_ai import Agent, ConcurrencyLimit

# Simple limit: allow up to 10 concurrent runs
agent = Agent('openai:gpt-5', max_concurrency=10)


# With backpressure: limit concurrent runs and queue depth
agent_with_backpressure = Agent(
    'openai:gpt-5',
    max_concurrency=ConcurrencyLimit(max_running=10, max_queued=100),
)


async def main():
    # These will be rate-limited to 10 concurrent runs
    results = await asyncio.gather(
        *[agent.run(f'Question {i}') for i in range(20)]
    )
    print(len(results))
    #> 20
```

When the concurrency limit is reached, additional calls to [`agent.run()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) or [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter) will wait until a slot becomes available. If you configure `max_queued` and the queue fills up, a [`ConcurrencyLimitExceeded`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ConcurrencyLimitExceeded) exception is raised.

When instrumentation is enabled, waiting operations appear as "waiting for concurrency" spans with attributes showing queue depth and limits.

### Model specific settings

If you wish to further customize model behavior, you can use a subclass of [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings), like [`GoogleModelSettings`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModelSettings), associated with your model of choice.

For example:

```python
from pydantic_ai import Agent, UnexpectedModelBehavior
from pydantic_ai.models.google import GoogleModelSettings

agent = Agent('google-gla:gemini-3-flash-preview')

try:
  result = agent.run_sync(
      'Write a list of 5 very rude things that I might say to the universe after stubbing my toe in the dark:',
      model_settings=GoogleModelSettings(
          temperature=0.0,  # general model settings can also be specified
          gemini_safety_settings=[
              {
                  'category': 'HARM_CATEGORY_HARASSMENT',
                  'threshold': 'BLOCK_LOW_AND_ABOVE',
              },
              {
                  'category': 'HARM_CATEGORY_HATE_SPEECH',
                  'threshold': 'BLOCK_LOW_AND_ABOVE',
              },
          ],
      ),
  )
except UnexpectedModelBehavior as e:
  print(e)  # (1)
  """
  Content filter 'SAFETY' triggered, body:
  <safety settings details>
  """
```

This error is raised because the safety thresholds were exceeded.

## Runs vs. Conversations

An agent **run** might represent an entire conversation -- there's no limit to how many messages can be exchanged in a single run. However, a **conversation** might also be composed of multiple runs, especially if you need to maintain state between separate interactions or API calls.

Here's an example of a conversation comprised of multiple runs:

conversation\_example.py

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

# First run
result1 = agent.run_sync('Who was Albert Einstein?')
print(result1.output)
#> Albert Einstein was a German-born theoretical physicist.

# Second run, passing previous messages
result2 = agent.run_sync(
  'What was his most famous equation?',
  message_history=result1.new_messages(),  # (1)
)
print(result2.output)
#> Albert Einstein's most famous equation is (E = mc^2).
```

Continue the conversation; without `message_history` the model would not know who "his" was referring to.

_(This example is complete, it can be run "as is")_

## Type safe by design

Pydantic AI is designed to work well with static type checkers, like mypy and pyright.

Typing is (somewhat) optional

Pydantic AI is designed to make type checking as useful as possible for you if you choose to use it, but you don't have to use types everywhere all the time.

That said, because Pydantic AI uses Pydantic, and Pydantic uses type hints as the definition for schema and validation, some types (specifically type hints on parameters to tools, and the `output_type` arguments to [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent)) are used at runtime.

We (the library developers) have messed up if type hints are confusing you more than helping you, if you find this, please create an [issue](https://github.com/pydantic/pydantic-ai/issues) explaining what's annoying you!

In particular, agents are generic in both the type of their dependencies and the type of the outputs they return, so you can use the type hints to ensure you're using the right types.

Consider the following script with type mistakes:

type\_mistakes.py

```python
from dataclasses import dataclass

from pydantic_ai import Agent, RunContext


@dataclass
class User:
  name: str


agent = Agent(
  'test',
  deps_type=User,  # (1)
  output_type=bool,
)


@agent.system_prompt
def add_user_name(ctx: RunContext[str]) -> str:  # (2)
  return f"The user's name is {ctx.deps}."


def foobar(x: bytes) -> None:
  pass


result = agent.run_sync('Does their name start with "A"?', deps=User('Anne'))
foobar(result.output)  # (3)
```

The agent is defined as expecting an instance of `User` as `deps`.

But here `add_user_name` is defined as taking a `str` as the dependency, not a `User`.

Since the agent is defined as returning a `bool`, this will raise a type error since `foobar` expects `bytes`.

Running `mypy` on this will give the following output:

Terminal

```bash
➤ uv run mypy type_mistakes.py
type_mistakes.py:18: error: Argument 1 to "system_prompt" of "Agent" has incompatible type "Callable[[RunContext[str]], str]"; expected "Callable[[RunContext[User]], str]"  [arg-type]
type_mistakes.py:28: error: Argument 1 to "foobar" has incompatible type "bool"; expected "bytes"  [arg-type]
Found 2 errors in 1 file (checked 1 source file)
```

Running `pyright` would identify the same issues.

## System Prompts

System prompts might seem simple at first glance since they're just strings (or sequences of strings that are concatenated), but crafting the right system prompt is key to getting the model to behave as you want.

Tip

For most use cases, you should use `instructions` instead of "system prompts".

If you know what you are doing though and want to preserve system prompt messages in the message history sent to the LLM in subsequent completions requests, you can achieve this using the `system_prompt` argument/decorator.

See the section below on [Instructions](#instructions) for more information.

Generally, system prompts fall into two categories:

1.  **Static system prompts**: These are known when writing the code and can be defined via the `system_prompt` parameter of the [`Agent` constructor](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__).
2.  **Dynamic system prompts**: These depend in some way on context that isn't known until runtime, and should be defined via functions decorated with [`@agent.system_prompt`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.system_prompt).

You can add both to a single agent; they're appended in the order they're defined at runtime.

Here's an example using both types of system prompts:

system\_prompts.py

```python
from datetime import date

from pydantic_ai import Agent, RunContext

agent = Agent(
  'openai:gpt-5.2',
  deps_type=str,  # (1)
  system_prompt="Use the customer's name while replying to them.",  # (2)
)


@agent.system_prompt  # (3)
def add_the_users_name(ctx: RunContext[str]) -> str:
  return f"The user's name is {ctx.deps}."


@agent.system_prompt
def add_the_date() -> str:  # (4)
  return f'The date is {date.today()}.'


result = agent.run_sync('What is the date?', deps='Frank')
print(result.output)
#> Hello Frank, the date today is 2032-01-02.
```

The agent expects a string dependency.

Static system prompt defined at agent creation time.

Dynamic system prompt defined via a decorator with [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext), this is called just after `run_sync`, not when the agent is created, so can benefit from runtime information like the dependencies used on that run.

Another dynamic system prompt, system prompts don't have to have the `RunContext` parameter.

_(This example is complete, it can be run "as is")_

## Instructions

Instructions are similar to system prompts. The main difference is that when an explicit `message_history` is provided in a call to `Agent.run` and similar methods, _instructions_ from any existing messages in the history are not included in the request to the model -- only the instructions of the _current_ agent are included.

You should use:

-   `instructions` when you want your request to the model to only include system prompts for the _current_ agent
-   `system_prompt` when you want your request to the model to _retain_ the system prompts used in previous requests (possibly made using other agents)

In general, we recommend using `instructions` instead of `system_prompt` unless you have a specific reason to use `system_prompt`.

Instructions, like system prompts, can be specified at different times:

1.  **Static instructions**: These are known when writing the code and can be defined via the `instructions` parameter of the [`Agent` constructor](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__).
2.  **Dynamic instructions**: These rely on context that is only available at runtime and should be defined using functions decorated with [`@agent.instructions`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.instructions). Unlike dynamic system prompts, which may be reused when `message_history` is present, dynamic instructions are always reevaluated.
3.  **Runtime instructions**: These are additional instructions for a specific run that can be passed to one of the [run methods](#running-agents) using the `instructions` argument.

All three types of instructions can be added to a single agent, and they are appended in the order they are defined at runtime. Each instruction is internally classified as either **static** (literal strings from the `instructions` parameter) or **dynamic** (from `@agent.instructions` functions, runtime instructions, or [toolset](/docs/ai/tools-toolsets/toolsets) instructions). Static instructions are always sorted before dynamic ones. This ordering enables providers that support prompt caching (like [Anthropic](/docs/ai/models/anthropic#smart-instruction-caching) and [Bedrock](/docs/ai/models/bedrock#prompt-caching)) to cache the stable static prefix while leaving dynamic instructions outside the cache boundary.

Here's an example using a static instruction as well as dynamic instructions:

instructions.py

```python
from datetime import date

from pydantic_ai import Agent, RunContext

agent = Agent(
  'openai:gpt-5.2',
  deps_type=str,  # (1)
  instructions="Use the customer's name while replying to them.",  # (2)
)


@agent.instructions  # (3)
def add_the_users_name(ctx: RunContext[str]) -> str:
  return f"The user's name is {ctx.deps}."


@agent.instructions
def add_the_date() -> str:  # (4)
  return f'The date is {date.today()}.'


result = agent.run_sync('What is the date?', deps='Frank')
print(result.output)
#> Hello Frank, the date today is 2032-01-02.
```

The agent expects a string dependency.

Static instructions defined at agent creation time.

Dynamic instructions defined via a decorator with [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext), this is called just after `run_sync`, not when the agent is created, so can benefit from runtime information like the dependencies used on that run.

Another dynamic instruction, instructions don't have to have the `RunContext` parameter.

_(This example is complete, it can be run "as is")_

Note that returning an empty string will result in no instruction message added.

Instructions can also come from [capabilities](/docs/ai/core-concepts/capabilities) via [`get_instructions()`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_instructions), or from [template strings](/docs/ai/core-concepts/agent-spec#template-strings) rendered against the agent's dependencies.

## Reflection and self-correction

Validation errors from both function tool parameter validation and [structured output validation](/docs/ai/core-concepts/output#structured-output) can be passed back to the model with a request to retry.

You can also raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) from within a [tool](/docs/ai/tools-toolsets/tools) or [output function](/docs/ai/core-concepts/output#output-functions) to tell the model it should retry generating a response.

-   The default retry count is **1** but can be altered for the [entire agent](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__), a [specific tool](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool), or [outputs](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__). The agent-level `output_retries` can also be overridden per run via `agent.run(output_retries=...)` and friends.
-   You can access the current retry count from within a tool, output validator, or output function via [`ctx.retry`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.retry).

### How output retries are enforced

Pydantic AI enforces `output_retries` differently depending on how the model returns its final output:

-   **Text output path** (`output_type=str`, text-only outputs, empty or unusable model responses): a single global budget of size `output_retries` is shared across the whole run. Each invalid response consumes one unit of the budget; when it's exhausted, the run raises [`UnexpectedModelBehavior`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UnexpectedModelBehavior) with message `'Exceeded maximum output retries (N)'`.
-   **Tool output path** ([`output_type=ToolOutput(...)`](/docs/ai/core-concepts/output#tool-output), structured outputs): `output_retries` is the _default per-tool limit_. See [Tool Output](/docs/ai/core-concepts/output#tool-output) for per-tool overrides via [`ToolOutput(max_retries=N)`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.ToolOutput.max_retries).

For how the budget appears inside [output validators](/docs/ai/core-concepts/output#output-validator-functions) -- including what `ctx.max_retries` and `ctx.retry` reflect on each path -- see the [Output validators](/docs/ai/core-concepts/output#output-validator-functions) section.

Tool retries are tracked per tool -- see [Tool Execution and Retries](/docs/ai/tools-toolsets/tools-advanced#tool-retries) for the per-tool counter model and the three configuration levels.

Here's an example:

tool\_retry.py

```python
from pydantic import BaseModel

from pydantic_ai import Agent, RunContext, ModelRetry

from fake_database import DatabaseConn


class ChatResult(BaseModel):
    user_id: int
    message: str


agent = Agent(
    'openai:gpt-5.2',
    deps_type=DatabaseConn,
    output_type=ChatResult,
)


@agent.tool(retries=2)
def get_user_by_name(ctx: RunContext[DatabaseConn], name: str) -> int:
    """Get a user's ID from their full name."""
    print(name)
    #> John
    #> John Doe
    user_id = ctx.deps.users.get(name=name)
    if user_id is None:
        raise ModelRetry(
            f'No user found with name {name!r}, remember to provide their full name'
        )
    return user_id


result = agent.run_sync(
    'Send a message to John Doe asking for coffee next week', deps=DatabaseConn()
)
print(result.output)
"""
user_id=123 message='Hello John, would you be free for coffee sometime next week? Let me know what works for you!'
"""
```

## Debugging and Monitoring

Agents require a different approach to observability than traditional software. With traditional web endpoints or data pipelines, you can largely predict behavior by reading the code. With agents, this is much harder. The model's decisions are stochastic, and that stochasticity compounds through the agentic loop as the agent reasons, calls tools, observes results, and reasons again. You need to actually see what happened.

This means setting up your application to record what's happening in a way you can review afterward, both during development (to understand and iterate) and in production (to debug issues and monitor behavior). The ergonomics matter too: a plaintext dump of everything that happened isn't a practical way to review agent behavior, even during development. You want tooling that lets you step through each decision and tool call interactively.

We recommend [Pydantic Logfire](https://logfire.pydantic.dev/docs/), which has been designed with Pydantic AI workflows in mind.

### Tracing with Logfire

```python
import logfire

logfire.configure()
logfire.instrument_pydantic_ai()
```

With Logfire instrumentation enabled, every agent run creates a detailed trace showing:

-   **Messages exchanged** with the model (system, user, assistant)
-   **Tool calls** including arguments and return values
-   **Token usage** per request and cumulative
-   **Latency** for each operation
-   **Errors** with full context

This visibility is invaluable for:

-   Understanding why an agent made a specific decision
-   Debugging unexpected behavior
-   Optimizing performance and costs
-   Monitoring production deployments

### Systematic Testing with Evals

For systematic evaluation of agent behavior beyond runtime debugging, [Pydantic Evals](/docs/ai/evals/) provides a code-first framework for testing AI systems:

```python
from pydantic_evals import Case, Dataset

dataset = Dataset(
    name='agent_eval',
    cases=[
        Case(name='capital_question', inputs='What is the capital of France?', expected_output='Paris'),
    ]
)
report = dataset.evaluate_sync(my_agent_function)
```

Evals let you define test cases, run them against your agent, and score the results. When combined with Logfire, evaluation results appear in the web UI for visualization and comparison across runs. See the [Logfire integration guide](/docs/ai/evals/how-to/logfire-integration) for setup.

### Using Other Backends

Pydantic AI's instrumentation is built on [OpenTelemetry](https://opentelemetry.io/), so you can send traces to any compatible backend. Even if you use the Logfire SDK for its convenience, you can configure it to send data to other backends. See [alternative backends](/docs/ai/integrations/logfire#using-opentelemetry) for setup instructions.

[Full Logfire integration guide →](/docs/ai/integrations/logfire)

## Model errors

If models behave unexpectedly (e.g., the retry limit is exceeded, or their API returns `503`), agent runs will raise [`UnexpectedModelBehavior`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UnexpectedModelBehavior).

In these cases, [`capture_run_messages`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.capture_run_messages) can be used to access the messages exchanged during the run to help diagnose the issue.

agent\_model\_errors.py

```python
from pydantic_ai import Agent, ModelRetry, UnexpectedModelBehavior, capture_run_messages

agent = Agent('openai:gpt-5.2')


@agent.tool_plain
def calc_volume(size: int) -> int:  # (1)
  if size == 42:
      return size**3
  else:
      raise ModelRetry('Please try again.')


with capture_run_messages() as messages:  # (2)
  try:
      result = agent.run_sync('Please get me the volume of a box with size 6.')
  except UnexpectedModelBehavior as e:
      print('An error occurred:', e)
      #> An error occurred: Tool 'calc_volume' exceeded max retries count of 1
      print('cause:', repr(e.__cause__))
      #> cause: ModelRetry('Please try again.')
      print('messages:', messages)
      """
      messages:
      [
          ModelRequest(
              parts=[
                  UserPromptPart(
                      content='Please get me the volume of a box with size 6.',
                      timestamp=datetime.datetime(...),
                  )
              ],
              timestamp=datetime.datetime(...),
              run_id='...',
              conversation_id='...',
          ),
          ModelResponse(
              parts=[
                  ToolCallPart(
                      tool_name='calc_volume',
                      args={'size': 6},
                      tool_call_id='pyd_ai_tool_call_id',
                  )
              ],
              usage=RequestUsage(input_tokens=62, output_tokens=4),
              model_name='gpt-5.2',
              timestamp=datetime.datetime(...),
              run_id='...',
              conversation_id='...',
          ),
          ModelRequest(
              parts=[
                  RetryPromptPart(
                      content='Please try again.',
                      tool_name='calc_volume',
                      tool_call_id='pyd_ai_tool_call_id',
                      timestamp=datetime.datetime(...),
                  )
              ],
              timestamp=datetime.datetime(...),
              run_id='...',
              conversation_id='...',
          ),
          ModelResponse(
              parts=[
                  ToolCallPart(
                      tool_name='calc_volume',
                      args={'size': 6},
                      tool_call_id='pyd_ai_tool_call_id',
                  )
              ],
              usage=RequestUsage(input_tokens=72, output_tokens=8),
              model_name='gpt-5.2',
              timestamp=datetime.datetime(...),
              run_id='...',
              conversation_id='...',
          ),
      ]
      """
  else:
      print(result.output)
```

Define a tool that will raise `ModelRetry` repeatedly in this case.

[`capture_run_messages`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.capture_run_messages) is used to capture the messages exchanged during the run.

_(This example is complete, it can be run "as is")_

Note

If you call [`run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), [`run_sync`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_sync), or [`run_stream`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream) more than once within a single `capture_run_messages` context, `messages` will represent the messages exchanged during the first call only.

## Agent Specs

Agents can also be defined declaratively in YAML or JSON using [agent specs](/docs/ai/core-concepts/agent-spec). This separates agent configuration from application code:

```yaml
model: anthropic:claude-opus-4-6
instructions: You are a helpful assistant.
capabilities:
  - WebSearch
  - Thinking:
      effort: high
```

```python
from pydantic_ai import Agent

agent = Agent.from_file('agent.yaml')
```

See [Agent Specs](/docs/ai/core-concepts/agent-spec) for the full spec format, template strings, and custom capability registration.

---

# [Agent Specs](https://pydantic.dev/docs/ai/core-concepts/agent-spec/)

# Agent Specs

Agent specs let you define agents declaratively in YAML or JSON -- [model](/docs/ai/models/overview), [instructions](/docs/ai/core-concepts/agent#instructions), [capabilities](/docs/ai/core-concepts/capabilities), and all. One line to load, no Python agent construction code required.

This is useful for:

-   Separating agent configuration from application code
-   Letting non-developers (prompt engineers, domain experts) configure agents
-   Storing agent definitions alongside other config files
-   Sharing agent configurations across teams or projects

## Defining a spec

A spec file defines the agent's configuration in YAML or JSON:

agent.yaml

```yaml
model: anthropic:claude-opus-4-6
instructions: You are a helpful research assistant.
model_settings:
  max_tokens: 8192
capabilities:
  - WebSearch
  - Thinking:
      effort: high
```

## Loading specs

[`Agent.from_file`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.from_file) loads a spec from a YAML or JSON file and constructs an agent:

from\_file\_example.py

```python
from pydantic_ai import Agent

agent = Agent.from_file('agent.yaml')
```

[`Agent.from_spec`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.from_spec) accepts a dict or `AgentSpec` instance and supports additional keyword arguments that supplement or override the spec:

from\_spec\_example.py

```python
from dataclasses import dataclass

from pydantic_ai import Agent


@dataclass
class UserContext:
    user_name: str


agent = Agent.from_spec(
    {
        'model': 'anthropic:claude-opus-4-6',
        'instructions': 'You are helping {{user_name}}.',
        'capabilities': ['WebSearch'],
    },
    deps_type=UserContext,
)
```

Keyword arguments interact with spec fields as follows:

-   **Scalar fields** (`model`, `name`, `output_retries`, `end_strategy`, etc.) -- the keyword argument overrides the spec value when provided.
-   **`instructions`** -- merged: spec instructions come first, then keyword argument instructions.
-   **`capabilities`** -- merged: spec capabilities come first, then keyword argument capabilities.
-   **`model_settings`** -- merged additively: keyword argument settings override matching spec settings.
-   **`output_type`** -- takes precedence over `output_schema` from the spec.

When `deps_type` is passed, [template strings](#template-strings) in the spec's `instructions`, `description`, and capability arguments are compiled and validated against the deps type at construction time.

For more control over spec loading, use `AgentSpec.from_file` to load the spec separately before passing it to `Agent.from_spec`.

## Template strings

`TemplateStr` provides Handlebars-style templates (`{{variable}}`) that are rendered against the agent's [dependencies](/docs/ai/core-concepts/dependencies) at runtime. In spec files, strings containing `{{` are automatically converted to template strings:

```yaml
instructions: "You are assisting {{name}}, who is a {{role}}."
```

Template variables are resolved from the fields of the `deps` object. When a `deps_type` (or [`deps_schema`](#deps_schema)) is provided, template variable names are validated at construction time.

In Python code, `TemplateStr` can be used explicitly, but a callable with [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) is generally preferred for IDE autocomplete and type checking:

template\_instructions.py

```python
from dataclasses import dataclass

from pydantic_ai import Agent, TemplateStr


@dataclass
class UserProfile:
    name: str
    role: str


agent = Agent(
    'openai:gpt-5.2',
    deps_type=UserProfile,
    instructions=TemplateStr('You are assisting {{name}}, who is a {{role}}.'),
)
result = agent.run_sync('hello', deps=UserProfile(name='Alice', role='engineer'))
print(result.output)
#> Hello! How can I help you today?
```

## Capability spec syntax

Capabilities in specs support three forms:

-   `'MyCapability'` -- no arguments, calls `MyCapability.from_spec()`
-   `{'MyCapability': value}` -- single positional argument, calls `MyCapability.from_spec(value)`
-   `{'MyCapability': {key: value, ...}}` -- keyword arguments, calls `MyCapability.from_spec(**kwargs)`

## Custom capabilities in specs

See [Publishing capabilities](/docs/ai/core-concepts/capabilities#publishing-capabilities) for how to make custom capabilities work with agent specs.

## `AgentSpec` reference

The `AgentSpec` model represents the full spec structure:

Field

Type

Description

`model`

`str`

[Model](/docs/ai/models/overview) name (required)

`name`

`str | None`

Agent name

`description`

`str | None`

Agent description (supports [templates](#template-strings))

`instructions`

`str | list[str] | None`

[Instructions](/docs/ai/core-concepts/agent#instructions) (supports [templates](#template-strings))

`model_settings`

`dict | None`

[Model settings](/docs/ai/core-concepts/agent#model-run-settings)

`capabilities`

`list`

[Capabilities](/docs/ai/core-concepts/capabilities) (see [spec syntax](#capability-spec-syntax))

`deps_schema`

`dict | None`

JSON Schema for [template string](#template-strings) validation (see below)

`output_schema`

`dict | None`

JSON Schema for [structured output](/docs/ai/core-concepts/output) (see below)

`tool_retries`

`int | None`

Default [tool retries](/docs/ai/tools-toolsets/tools-advanced#tool-retries) (default: `None`, treated as `1`)

`output_retries`

`int | None`

[Output validation retries](/docs/ai/core-concepts/output#output-validator-functions)

`end_strategy`

`EndStrategy`

When to stop (`'early'` or `'exhaustive'`)

`tool_timeout`

`float | None`

Default [tool](/docs/ai/tools-toolsets/tools) timeout in seconds

`instrument`

`bool | None`

Enable [Logfire](/docs/ai/integrations/logfire) instrumentation

`metadata`

`dict | None`

Agent [metadata](/docs/ai/core-concepts/agent#run-metadata)

### `deps_schema`

When loading a spec file without a Python `deps_type`, `deps_schema` provides a JSON Schema that validates [template string](#template-strings) variable names at construction time. It does **not** validate the actual deps object at runtime -- it only ensures that template variables like `{{user_name}}` correspond to properties defined in the schema.

### `output_schema`

When provided (and no `output_type` keyword argument is passed to `from_spec`), `output_schema` defines the structure the model should produce as its final output. Under the hood, it creates a [`StructuredDict`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.StructuredDict) output type: the JSON Schema is sent to the model API so the model knows what structure to produce, and the response is returned as a `dict[str, Any]`.

Note

The model's response is not validated against the schema's `properties` or `required` fields -- it is accepted as a plain dict. The schema serves as an instruction to the model, not a runtime validation constraint.

agent\_with\_schema.yaml

```yaml
model: anthropic:claude-opus-4-6
deps_schema:
  type: object
  properties:
    user_name:
      type: string
  required: [user_name]
output_schema:
  type: object
  properties:
    answer:
      type: string
    confidence:
      type: number
  required: [answer, confidence]
instructions: "You are helping {{user_name}}. Always include a confidence score."
capabilities:
  - WebSearch
```

## Saving specs

`AgentSpec.to_file` saves a spec to YAML or JSON and optionally generates a companion JSON Schema file for editor autocompletion:

save\_spec\_example.py

```python
from pydantic_ai import AgentSpec

spec = AgentSpec(
    model='anthropic:claude-opus-4-6',
    instructions='You are a helpful assistant.',
    capabilities=['WebSearch'],
)
spec.to_file('agent.yaml')
# Also generates ./agent_schema.json for editor autocompletion
```

The generated JSON Schema file enables autocompletion and validation in editors that support the [YAML Language Server](https://github.com/redhat-developer/yaml-language-server) protocol. Pass `schema_path=None` to skip schema generation.

---

# [Capabilities](https://pydantic.dev/docs/ai/core-concepts/capabilities/)

# Capabilities

A capability is a reusable, composable unit of agent behavior. Instead of threading multiple arguments through your `Agent` constructor -- [instructions](/docs/ai/core-concepts/agent#instructions) here, [model settings](/docs/ai/core-concepts/agent#model-run-settings) there, a [toolset](/docs/ai/tools-toolsets/toolsets) somewhere else, a [history processor](/docs/ai/core-concepts/message-history#processing-message-history) on yet another parameter -- you can bundle related behavior into a single capability and pass it via the [`capabilities`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) parameter.

Capabilities can provide any combination of:

-   **Tools** -- via [toolsets](/docs/ai/tools-toolsets/toolsets) or [builtin tools](/docs/ai/tools-toolsets/builtin-tools)
-   **Lifecycle hooks** -- intercept and modify model requests, tool calls, and the overall run
-   **Instructions** -- static or dynamic [instruction](/docs/ai/core-concepts/agent#instructions) additions
-   **Model settings** -- static or per-step [model settings](/docs/ai/core-concepts/agent#model-run-settings)

This makes them the primary extension point for Pydantic AI. Whether you're building a memory system, a guardrail, a cost tracker, or an approval workflow, a capability is the right abstraction.

## Built-in capabilities

Pydantic AI ships with several capabilities that cover common needs:

Capability

What it provides

Spec

[`Thinking`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Thinking)

Enables model [thinking/reasoning](/docs/ai/advanced-features/thinking) at configurable effort

Yes

[`Hooks`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Hooks)

Decorator-based [lifecycle hook](/docs/ai/core-concepts/hooks) registration

--

[`WebSearch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebSearch)

Web search -- builtin when supported, [local fallback](/docs/ai/tools-toolsets/common-tools#duckduckgo-search-tool) with [`duckduckgo` extra](/docs/ai/overview/install#slim-install)

Yes

[`WebFetch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebFetch)

URL fetching -- builtin when supported, [local fallback](/docs/ai/tools-toolsets/common-tools#web-fetch-tool) with [`web-fetch` extra](/docs/ai/overview/install#slim-install)

Yes

[`ImageGeneration`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ImageGeneration)

Image generation -- builtin when supported, subagent fallback via `fallback_model`

Yes

[`MCP`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.MCP)

MCP server -- builtin when supported, direct connection otherwise

Yes

[`PrepareTools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.PrepareTools)

Filters or modifies function [tool definitions](/docs/ai/tools-toolsets/tools) per step

--

[`PrepareOutputTools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.PrepareOutputTools)

Filters or modifies [output tool](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.ToolOutput) definitions per step

--

[`PrefixTools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.PrefixTools)

Wraps a capability and prefixes its tool names

Yes

[`BuiltinTool`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.BuiltinTool)

Registers a [builtin tool](/docs/ai/tools-toolsets/builtin-tools) with the agent

Yes

[`Toolset`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Toolset)

Wraps an [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset)

--

[`IncludeToolReturnSchemas`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.IncludeToolReturnSchemas)

Includes return type schemas in tool definitions sent to the model

Yes

[`SetToolMetadata`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.SetToolMetadata)

Merges metadata key-value pairs onto selected tools

Yes

[`HandleDeferredToolCalls`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.HandleDeferredToolCalls)

Resolves [deferred tool calls](/docs/ai/tools-toolsets/deferred-tools#resolving-deferred-calls-with-a-handler) inline with a handler function

--

[`ProcessHistory`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ProcessHistory)

Wraps a [history processor](/docs/ai/core-concepts/message-history#processing-message-history)

--

[`ProcessEventStream`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ProcessEventStream)

Forwards agent stream events to a handler function

--

[`ThreadExecutor`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ThreadExecutor)

Uses a custom thread executor for [sync functions](/docs/ai/tools-toolsets/tools-advanced#thread-executor-for-long-running-servers)

--

The **Spec** column indicates whether the capability can be used in [agent specs](/docs/ai/core-concepts/agent-spec) (YAML/JSON). Capabilities marked **--** take non-serializable arguments (callables, toolset objects) and can only be used in Python code.

builtin\_capabilities.py

```python
from pydantic_ai import Agent
from pydantic_ai.capabilities import Thinking, WebSearch

agent = Agent(
    'anthropic:claude-opus-4-6',
    instructions='You are a research assistant. Be thorough and cite sources.',
    capabilities=[
        Thinking(effort='high'),
        WebSearch(),
    ],
)
```

[Instructions](/docs/ai/core-concepts/agent#instructions) and [model settings](/docs/ai/core-concepts/agent#model-run-settings) are configured directly via the `instructions` and `model_settings` parameters on `Agent` (or `AgentSpec`). Capabilities are for behavior that goes beyond simple configuration -- tools, lifecycle hooks, and custom extensions. They compose well, especially when you want to reuse the same configuration across multiple agents or load it from a [spec file](/docs/ai/core-concepts/agent-spec).

### Thinking

The [`Thinking`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Thinking) capability enables model [thinking/reasoning](/docs/ai/advanced-features/thinking) at a configurable effort level. It's the simplest way to enable thinking across providers:

thinking\_capability.py

```python
from pydantic_ai import Agent
from pydantic_ai.capabilities import Thinking

agent = Agent('anthropic:claude-sonnet-4-6', capabilities=[Thinking(effort='high')])
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

See [Thinking](/docs/ai/advanced-features/thinking) for provider-specific details and the [unified thinking settings](/docs/ai/advanced-features/thinking#unified-thinking-settings).

### Compaction

Provider-specific compaction capabilities manage conversation context size by compacting older messages into summaries:

Provider

Capability

Details

OpenAI Responses API

[`OpenAICompaction`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAICompaction)

[OpenAI compaction](/docs/ai/models/openai#message-compaction)

Anthropic

[`AnthropicCompaction`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicCompaction)

[Anthropic compaction](/docs/ai/models/anthropic#message-compaction)

### ThreadExecutor

The [`ThreadExecutor`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ThreadExecutor) capability provides a custom [`Executor`](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.Executor) for running sync tool functions and other sync callbacks in threads. This is useful in long-running servers (e.g. FastAPI) where the default ephemeral threads from `anyio.to_thread.run_sync` can accumulate under sustained load:

```python
from concurrent.futures import ThreadPoolExecutor

from pydantic_ai import Agent
from pydantic_ai.capabilities import ThreadExecutor

executor = ThreadPoolExecutor(max_workers=16, thread_name_prefix='agent-worker')
agent = Agent('openai:gpt-5.2', capabilities=[ThreadExecutor(executor)])
```

See [Thread executor for long-running servers](/docs/ai/tools-toolsets/tools-advanced#thread-executor-for-long-running-servers) for more details.

### Hooks

The [`Hooks`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Hooks) capability provides decorator-based [lifecycle hook](#hooking-into-the-lifecycle) registration -- the easiest way to intercept model requests, tool calls, and other events without subclassing [`AbstractCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability):

```python
from pydantic_ai import Agent, ModelRequestContext, RunContext
from pydantic_ai.capabilities import Hooks

hooks = Hooks()

@hooks.on.before_model_request
async def log_request(ctx: RunContext[None], request_context: ModelRequestContext) -> ModelRequestContext:
    agent_name = ctx.agent.name if ctx.agent else 'unknown'
    print(f'[{agent_name}] Sending {len(request_context.messages)} messages')
    return request_context

agent = Agent('openai:gpt-5.2', name='my_agent', capabilities=[hooks])
```

All hooks receive [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext), which provides access to the running agent via [`ctx.agent`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.agent) -- useful for logging, metrics, and other cross-cutting concerns that need to identify which agent is running.

See the dedicated [Hooks](/docs/ai/core-concepts/hooks) page for the full API: decorator and constructor registration, timeouts, tool filtering, wrap hooks, per-event hooks, and more.

### Provider-adaptive tools

[`WebSearch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebSearch), [`WebFetch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebFetch), [`ImageGeneration`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ImageGeneration), and [`MCP`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.MCP) provide model-agnostic access to common tool types. When the model supports the tool natively (as a [builtin tool](/docs/ai/tools-toolsets/builtin-tools)), it's used directly. When it doesn't, a local function tool handles it instead -- so your agent works across providers without code changes.

Each accepts `builtin` and `local` keyword arguments to control which side is used:

provider\_adaptive\_tools.py

```python
from pydantic_ai import Agent
from pydantic_ai.capabilities import MCP, ImageGeneration, WebFetch, WebSearch

agent = Agent(
    'anthropic:claude-sonnet-4-6',
    capabilities=[
        # Auto-detects DuckDuckGo as local fallback
        WebSearch(),
        # Auto-detects markdownify-based fetch as local fallback
        WebFetch(),
        # Builtin when the model supports it; falls back to a subagent
        # running the specified LLM with image generation support
        ImageGeneration(fallback_model='openai-responses:gpt-5.4'),
        # Auto-detects transport from URL
        MCP(url='https://mcp.example.com/api'),
    ],
)
```

To force builtin-only (errors on unsupported models instead of falling back to local):

builtin\_only.py

```python
MCP(url='https://mcp.example.com/api', local=False)
```

To force local-only (never use the builtin, even when the model supports it):

local\_only.py

```python
MCP(url='https://mcp.example.com/api', builtin=False)
```

Some constraint fields require the builtin because the local fallback can't enforce them. When these are set and the model doesn't support the builtin, a [`UserError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UserError) is raised. For example, [`WebSearch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebSearch) domain constraints require the builtin, while [`WebFetch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebFetch) enforces them locally:

constraints.py

```python
# Only search example.com -- requires builtin support
WebSearch(allowed_domains=['example.com'])

# Only fetch example.com -- enforced locally when builtin is unavailable
WebFetch(allowed_domains=['example.com'])
```

All of these capabilities are subclasses of [`BuiltinOrLocalTool`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.BuiltinOrLocalTool), which you can use directly or subclass to build your own provider-adaptive tools. For example, to pair [`CodeExecutionTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.CodeExecutionTool) with a local fallback:

custom\_builtin\_or\_local.py

```python
from pydantic_ai.builtin_tools import CodeExecutionTool
from pydantic_ai.capabilities import BuiltinOrLocalTool

cap = BuiltinOrLocalTool(builtin=CodeExecutionTool(), local=my_local_executor)
```

### PrepareTools and PrepareOutputTools

[`PrepareTools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.PrepareTools) and [`PrepareOutputTools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.PrepareOutputTools) wrap a [`ToolsPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolsPrepareFunc) as a capability, for filtering or modifying [tool definitions](/docs/ai/tools-toolsets/tools) per step. `PrepareTools` handles function tools; `PrepareOutputTools` handles [output tools](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.ToolOutput). The Agent constructor's [`prepare_tools`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolsPrepareFunc) / [`prepare_output_tools`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolsPrepareFunc) arguments are sugar that injects these capabilities automatically.

prepare\_tools\_builtin.py

```python
from pydantic_ai import Agent, RunContext, ToolDefinition
from pydantic_ai.capabilities import PrepareTools


async def hide_dangerous(ctx: RunContext[None], tool_defs: list[ToolDefinition]) -> list[ToolDefinition]:
    return [td for td in tool_defs if not td.name.startswith('delete_')]


agent = Agent('openai:gpt-5.2', capabilities=[PrepareTools(hide_dangerous)])


@agent.tool_plain
def delete_file(path: str) -> str:
    """Delete a file."""
    return f'deleted {path}'


@agent.tool_plain
def read_file(path: str) -> str:
    """Read a file."""
    return f'contents of {path}'


result = agent.run_sync('hello')
# The model only sees `read_file`, not `delete_file`
```

For more complex tool preparation logic, see [Tool preparation](#tool-preparation) under lifecycle hooks.

### PrefixTools

[`PrefixTools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.PrefixTools) wraps another capability and prefixes all of its tool names, useful for namespacing when composing multiple capabilities that might have conflicting tool names:

prefix\_tools\_example.py

```python
from pydantic_ai import Agent
from pydantic_ai.capabilities import MCP, PrefixTools

agent = Agent(
    'openai:gpt-5.2',
    capabilities=[
        PrefixTools(MCP(url='https://api1.example.com'), prefix='api1'),
        PrefixTools(MCP(url='https://api2.example.com'), prefix='api2'),
    ],
)
```

Every [`AbstractCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability) has a convenience method [`prefix_tools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.prefix_tools) that returns a [`PrefixTools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.PrefixTools) wrapper:

prefix\_convenience.py

```python
MCP(url='https://mcp.example.com/api').prefix_tools('mcp')
```

### IncludeToolReturnSchemas

[`IncludeToolReturnSchemas`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.IncludeToolReturnSchemas) includes return type schemas in tool definitions sent to the model. For models that natively support return schemas (e.g. Google Gemini), the schema is passed as a structured field in the API request. For other models, it is injected into the tool description as JSON text.

include\_return\_schemas.py

```python
from pydantic_ai import Agent
from pydantic_ai.capabilities import IncludeToolReturnSchemas
from pydantic_ai.models.test import TestModel


test_model = TestModel()
agent = Agent(test_model, capabilities=[IncludeToolReturnSchemas()])


@agent.tool_plain
def get_temperature(city: str) -> float:
    """Get the temperature for a city."""
    return 21.0


result = agent.run_sync('What is the temperature in Paris?')
params = test_model.last_model_request_parameters
assert params is not None
td = params.function_tools[0]
assert td.include_return_schema is True
```

_(This example is complete, it can be run "as is")_

Use the `tools` parameter to select which tools should include return schemas. It accepts a list of tool names, a metadata dict for matching, or a callable predicate:

include\_return\_schemas\_selective.py

```python
from pydantic_ai import Agent
from pydantic_ai.capabilities import IncludeToolReturnSchemas
from pydantic_ai.models.test import TestModel


test_model = TestModel()
agent = Agent(
    test_model,
    capabilities=[IncludeToolReturnSchemas(tools=['get_temperature'])],
)


@agent.tool_plain
def get_temperature(city: str) -> float:
    """Get the temperature for a city."""
    return 21.0


@agent.tool_plain
def get_greeting(name: str) -> str:
    """Get a greeting."""
    return f'Hello, {name}!'


result = agent.run_sync('Hello')
params = test_model.last_model_request_parameters
assert params is not None
temp_tool = next(t for t in params.function_tools if t.name == 'get_temperature')
greet_tool = next(t for t in params.function_tools if t.name == 'get_greeting')
assert temp_tool.include_return_schema is True
assert greet_tool.include_return_schema is None
```

_(This example is complete, it can be run "as is")_

The same effect can be achieved at the toolset level using [`.include_return_schemas()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.include_return_schemas) -- see [toolset composition](/docs/ai/tools-toolsets/toolsets#including-return-schemas).

### SetToolMetadata

[`SetToolMetadata`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.SetToolMetadata) merges metadata key-value pairs onto selected tools. This is useful for tagging tools with configuration that other capabilities or custom logic can inspect:

set\_tool\_metadata.py

```python
from pydantic_ai import Agent
from pydantic_ai.capabilities import SetToolMetadata
from pydantic_ai.models.test import TestModel


test_model = TestModel()
agent = Agent(
    test_model,
    capabilities=[SetToolMetadata(tools=['search'], sensitive=True)],
)


@agent.tool_plain
def search(query: str) -> str:
    """Search for information."""
    return f'Results for: {query}'


@agent.tool_plain
def greet(name: str) -> str:
    """Greet someone."""
    return f'Hello, {name}!'


result = agent.run_sync('Search for pydantic')
params = test_model.last_model_request_parameters
assert params is not None
search_tool = next(t for t in params.function_tools if t.name == 'search')
greet_tool = next(t for t in params.function_tools if t.name == 'greet')
assert search_tool.metadata is not None and search_tool.metadata.get('sensitive') is True
assert greet_tool.metadata is None or greet_tool.metadata.get('sensitive') is None
```

_(This example is complete, it can be run "as is")_

The same effect can be achieved at the toolset level using [`.with_metadata()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.with_metadata) -- see [toolset composition](/docs/ai/tools-toolsets/toolsets#setting-tool-metadata).

### ReinjectSystemPrompt

[`ReinjectSystemPrompt`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ReinjectSystemPrompt) ensures the agent's configured [`system_prompt`](/docs/ai/core-concepts/agent#system-prompts) is at the head of the first [`ModelRequest`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelRequest) on every model request. By default, if any [`SystemPromptPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.SystemPromptPart) is already present in the history, the capability is a no-op (so multi-agent handoff and user-managed system prompts remain authoritative). Set `replace_existing=True` to instead strip any existing `SystemPromptPart`s before prepending the agent's configured prompt -- useful when the history comes from an untrusted source and the server's prompt must win.

Useful when `message_history` comes from a source that doesn't round-trip system prompts -- UI frontends, database persistence layers, conversation compaction pipelines. Without this capability, an agent configured with a `system_prompt` will silently run without it if the history doesn't already include one.

reinject\_system\_prompt.py

```python
from pydantic_ai import Agent
from pydantic_ai.capabilities import ReinjectSystemPrompt
from pydantic_ai.messages import ModelRequest, ModelResponse, TextPart, UserPromptPart

agent = Agent('test', system_prompt='You are a helpful assistant.', capabilities=[ReinjectSystemPrompt()])

# History that's missing the system prompt (e.g. reconstructed from a UI frontend).
history = [
    ModelRequest(parts=[UserPromptPart(content='Hi')]),
    ModelResponse(parts=[TextPart(content='Hello!')]),
]

# Without the capability, the agent would run without its configured system prompt.
# With the capability, the system prompt is reinjected at the head of the first request.
result = agent.run_sync('Follow up', message_history=history)
first_request = result.all_messages()[0]
assert isinstance(first_request, ModelRequest)
assert first_request.parts[0].content == 'You are a helpful assistant.'
```

_(This example is complete, it can be run "as is")_

The [UI adapters](/docs/ai/integrations/ui/ag-ui) (AG-UI, Vercel AI) automatically add this capability with `replace_existing=True` in their `manage_system_prompt='server'` mode.

## Building custom capabilities

To build your own capability, subclass [`AbstractCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability) and override the methods you need. There are two categories: **configuration methods** that are called at agent construction (except [`get_wrapper_toolset`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_wrapper_toolset) which is called per-run), and **lifecycle hooks** that fire during each run.

### Providing tools

A capability that provides tools returns a [toolset](/docs/ai/tools-toolsets/toolsets) from [`get_toolset`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_toolset). This can be a pre-built [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset) instance, or a callable that receives [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns one dynamically:

custom\_capability\_tools.py

```python
from dataclasses import dataclass
from typing import Any

from pydantic_ai import Agent
from pydantic_ai.capabilities import AbstractCapability
from pydantic_ai.toolsets import AgentToolset, FunctionToolset

math_toolset = FunctionToolset()


@math_toolset.tool_plain
def add(a: float, b: float) -> float:
    """Add two numbers."""
    return a + b


@math_toolset.tool_plain
def multiply(a: float, b: float) -> float:
    """Multiply two numbers."""
    return a * b


@dataclass
class MathTools(AbstractCapability[Any]):
    """Provides basic math operations."""

    def get_toolset(self) -> AgentToolset[Any] | None:
        return math_toolset


agent = Agent('openai:gpt-5.2', capabilities=[MathTools()])
result = agent.run_sync('What is 2 + 3?')
print(result.output)
#> The answer is 5.0
```

For [builtin tools](/docs/ai/tools-toolsets/builtin-tools), override [`get_builtin_tools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_builtin_tools) to return a sequence of [`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool) instances (which includes both [`AbstractBuiltinTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.AbstractBuiltinTool) objects and callables that receive [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)).

#### Toolset wrapping

[`get_wrapper_toolset`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_wrapper_toolset) lets a capability wrap the agent's entire assembled toolset with a [`WrapperToolset`](/docs/ai/tools-toolsets/toolsets#changing-tool-execution). This is more powerful than providing tools -- it can intercept tool execution, add logging, or apply cross-cutting behavior.

The wrapper receives the combined non-output toolset (after the [`prepare_tools`](#tool-preparation) hook has wrapped it). Output tools are added separately and are not affected.

wrapper\_toolset\_example.py

```python
from dataclasses import dataclass
from typing import Any

from pydantic_ai import Agent
from pydantic_ai.capabilities import AbstractCapability
from pydantic_ai.toolsets import AbstractToolset
from pydantic_ai.toolsets.wrapper import WrapperToolset


@dataclass
class LoggingToolset(WrapperToolset[Any]):
    """Logs all tool calls."""

    async def call_tool(
        self, tool_name: str, tool_args: dict[str, Any], *args: Any, **kwargs: Any
    ) -> Any:
        print(f'  Calling tool: {tool_name}')
        return await super().call_tool(tool_name, tool_args, *args, **kwargs)


@dataclass
class LogToolCalls(AbstractCapability[Any]):
    """Wraps the agent's toolset to log all tool calls."""

    def get_wrapper_toolset(self, toolset: AbstractToolset[Any]) -> AbstractToolset[Any]:
        return LoggingToolset(wrapped=toolset)


agent = Agent('openai:gpt-5.2', capabilities=[LogToolCalls()])


@agent.tool_plain
def greet(name: str) -> str:
    """Greet someone."""
    return f'Hello, {name}!'


result = agent.run_sync('hello')
# Tool calls are logged as they happen
```

Note

`get_wrapper_toolset` wraps the non-output _toolset_ once per run (during toolset assembly). The [`prepare_tools`](#tool-preparation) and [`prepare_output_tools`](#tool-preparation) hooks also flow through `PreparedToolset` wrappers, so all three integrate at the toolset level -- `get_wrapper_toolset` runs around `prepare_tools` (it sees the prepared defs), and `prepare_output_tools` wraps the output toolset independently.

### Providing instructions

[`get_instructions`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_instructions) adds [instructions](/docs/ai/core-concepts/agent#instructions) to the agent. Since it's called once at agent construction, return a callable if you need dynamic values:

custom\_capability\_config.py

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any

from pydantic_ai import Agent, RunContext
from pydantic_ai.capabilities import AbstractCapability


@dataclass
class KnowsCurrentTime(AbstractCapability[Any]):
    """Tells the agent what time it is."""

    def get_instructions(self):
        def _get_time(ctx: RunContext[Any]) -> str:
            return f'The current date and time is {datetime.now().isoformat()}.'

        return _get_time


agent = Agent('openai:gpt-5.2', capabilities=[KnowsCurrentTime()])
result = agent.run_sync('What time is it?')
print(result.output)
#> The current time is 3:45 PM.
```

Instructions can also use [template strings](/docs/ai/core-concepts/agent-spec#template-strings) (`TemplateStr('Hello {{name}}')`) for Handlebars-style templates rendered against the agent's [dependencies](/docs/ai/core-concepts/dependencies). In Python code, a callable with [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) is generally preferred for IDE autocomplete.

### Providing model settings

[`get_model_settings`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_model_settings) returns [model settings](/docs/ai/core-concepts/agent#model-run-settings) as a dict or a callable for per-step settings.

When model settings need to vary per step -- for example, enabling thinking only on retry, or forcing a specific [`tool_choice`](/docs/ai/tools-toolsets/tools-advanced#dynamic-tool-choice-via-capabilities) until a tool has been called -- return a callable:

dynamic\_settings.py

```python
from dataclasses import dataclass

from pydantic_ai import Agent, ModelSettings, RunContext
from pydantic_ai.capabilities import AbstractCapability


@dataclass
class ThinkingOnRetry(AbstractCapability[None]):
    """Enables thinking mode when the agent is retrying."""

    def get_model_settings(self):
        def resolve(ctx: RunContext[None]) -> ModelSettings:
            if ctx.run_step > 1:
                return ModelSettings(thinking='high')
            return ModelSettings()

        return resolve


agent = Agent('openai:gpt-5.2', capabilities=[ThinkingOnRetry()])
result = agent.run_sync('hello')
print(result.output)
#> Hello! How can I help you today?
```

The callable receives a [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) where `ctx.model_settings` contains the merged result of all layers resolved before this capability (model defaults and agent-level settings).

### Configuration methods reference

Method

Return type

Purpose

[`get_toolset()`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_toolset)

`AgentToolset` `| None`

A [toolset](/docs/ai/tools-toolsets/toolsets) to register (or a callable for [dynamic toolsets](/docs/ai/tools-toolsets/toolsets#dynamically-building-a-toolset))

[`get_builtin_tools()`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_builtin_tools)

`Sequence[`[`AgentBuiltinTool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.AgentBuiltinTool)`]`

[Builtin tools](/docs/ai/tools-toolsets/builtin-tools) to register (including callables)

[`get_wrapper_toolset()`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_wrapper_toolset)

[`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset) `| None`

[Wrap the agent's assembled toolset](#toolset-wrapping)

[`get_instructions()`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_instructions)

`AgentInstructions` `| None`

[Instructions](/docs/ai/core-concepts/agent#instructions) (static strings, [template strings](/docs/ai/core-concepts/agent-spec#template-strings), or callables)

[`get_model_settings()`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_model_settings)

`AgentModelSettings` `| None`

[Model settings](/docs/ai/core-concepts/agent#model-run-settings) dict, or a callable for per-step settings

### Hooking into the lifecycle

Capabilities can hook into five lifecycle points, each with up to four variants:

-   **`before_*`** -- fires before the action, can modify inputs
-   **`after_*`** -- fires after the action succeeds (in reverse capability order), can modify outputs
-   **`wrap_*`** -- full middleware control: receives a `handler` callable and decides whether/how to call it
-   **`on_*_error`** -- fires when the action fails (after `wrap_*` has had its chance to recover), can observe, transform, or recover from errors

Tip

For quick, application-level hooks without subclassing, use the [`Hooks`](/docs/ai/core-concepts/hooks) capability instead.

#### Run hooks

Hook

Signature

Purpose

[`before_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.before_run)

`(ctx: RunContext) -> None`

Observe-only notification that a run is starting

[`after_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_run)

`(ctx: RunContext, *, result: AgentRunResult) -> AgentRunResult`

Modify the final result

[`wrap_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_run)

`(ctx: RunContext, *, handler: WrapRunHandler) -> AgentRunResult`

Wrap the entire run

[`on_run_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_run_error)

`(ctx: RunContext, *, error: BaseException) -> AgentRunResult`

Handle run errors (see [error hooks](#error-hooks))

`wrap_run` supports error recovery: if `handler()` raises and `wrap_run` catches the exception and returns a result instead, the error is suppressed and the recovery result is used. This works with both [`agent.run()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) and [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter).

#### Node hooks

Hook

Signature

Purpose

[`before_node_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.before_node_run)

`(ctx: RunContext, *, node: AgentNode) -> AgentNode`

Observe or replace the node before execution

[`after_node_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_node_run)

`(ctx: RunContext, *, node: AgentNode, result: NodeResult) -> NodeResult`

Modify the result (next node or `End`)

[`wrap_node_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_node_run)

`(ctx: RunContext, *, node: AgentNode, handler: WrapNodeRunHandler) -> NodeResult`

Wrap each graph node execution

[`on_node_run_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_node_run_error)

`(ctx: RunContext, *, node: AgentNode, error: BaseException) -> NodeResult`

Handle node errors (see [error hooks](#error-hooks))

[`wrap_node_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_node_run) fires for every node in the [agent graph](/docs/ai/core-concepts/agent#iterating-over-an-agents-graph) (`UserPromptNode`, `ModelRequestNode`, `CallToolsNode`). Override this to observe node transitions, add per-step logging, or modify graph progression:

Note

`wrap_node_run` hooks are called automatically by [`agent.run()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), [`agent.run_stream()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream), and [`agent_run.next()`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun.next). However, they are **not** called when iterating with bare `async for node in agent_run:` over [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter), since that uses the graph run's internal iteration. Always use `agent_run.next(node)` to advance the run if you need `wrap_node_run` hooks to fire.

node\_logging\_example.py

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any

from pydantic_ai import Agent, RunContext
from pydantic_ai.capabilities import (
    AbstractCapability,
    AgentNode,
    NodeResult,
    WrapNodeRunHandler,
)


@dataclass
class NodeLogger(AbstractCapability[Any]):
    """Logs each node that executes during a run."""

    nodes: list[str] = field(default_factory=list)

    async def wrap_node_run(
        self, ctx: RunContext[Any], *, node: AgentNode[Any], handler: WrapNodeRunHandler[Any]
    ) -> NodeResult[Any]:
        self.nodes.append(type(node).__name__)
        return await handler(node)


logger = NodeLogger()
agent = Agent('openai:gpt-5.2', capabilities=[logger])
agent.run_sync('hello')
print(logger.nodes)
#> ['UserPromptNode', 'ModelRequestNode', 'CallToolsNode']
```

You can also use `wrap_node_run` to modify graph progression -- for example, limiting the number of model requests per run:

node\_modification\_example.py

```python
from dataclasses import dataclass
from typing import Any

from pydantic_graph import End

from pydantic_ai import ModelRequestNode, RunContext
from pydantic_ai.capabilities import AbstractCapability, AgentNode, NodeResult, WrapNodeRunHandler
from pydantic_ai.result import FinalResult


@dataclass
class MaxModelRequests(AbstractCapability[Any]):
    """Limits the number of model requests per run by ending early."""

    max_requests: int = 5
    count: int = 0

    async def for_run(self, ctx: RunContext[Any]) -> 'MaxModelRequests':
        return MaxModelRequests(max_requests=self.max_requests)  # fresh per run

    async def wrap_node_run(
        self, ctx: RunContext[Any], *, node: AgentNode[Any], handler: WrapNodeRunHandler[Any]
    ) -> NodeResult[Any]:
        if isinstance(node, ModelRequestNode):
            self.count += 1
            if self.count > self.max_requests:
                return End(FinalResult(output='Max model requests reached'))
        return await handler(node)
```

See [Iterating Over an Agent's Graph](/docs/ai/core-concepts/agent#iterating-over-an-agents-graph) for more about the agent graph and its node types.

#### Model request hooks

Hook

Signature

Purpose

[`before_model_request`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.before_model_request)

`(ctx: RunContext, request_context: ModelRequestContext) -> ModelRequestContext`

Modify messages, settings, parameters, or model before the model call

[`after_model_request`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_model_request)

`(ctx: RunContext, *, request_context: ModelRequestContext, response: ModelResponse) -> ModelResponse`

Modify the model's response

[`wrap_model_request`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_model_request)

`(ctx: RunContext, *, request_context: ModelRequestContext, handler: WrapModelRequestHandler) -> ModelResponse`

Wrap the model call

[`on_model_request_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_model_request_error)

`(ctx: RunContext, *, request_context: ModelRequestContext, error: Exception) -> ModelResponse`

Handle model request errors (see [error hooks](#error-hooks))

`ModelRequestContext` bundles `model`, `messages`, `model_settings`, and `model_request_parameters` into a single object, making the signature future-proof. To swap the model for a given request, set `request_context.model` to a different [`Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) instance.

To skip the model call entirely and provide a replacement response, raise [`SkipModelRequest(response)`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.SkipModelRequest) from `before_model_request` or `wrap_model_request`.

#### Tool hooks

Tool processing has two phases: **validation** (parsing and validating the model's JSON arguments against the tool's schema) and **execution** (running the tool function). Each phase has its own hooks.

All tool hooks receive a `tool_def` parameter with the [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition).

**Validation hooks** -- `args` is the raw `str | dict[str, Any]` from the model before validation, or the validated `dict[str, Any]` after:

Hook

Signature

Purpose

[`before_tool_validate`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.before_tool_validate)

`(ctx: RunContext, *, call: ToolCallPart, tool_def: ToolDefinition, args: RawToolArgs) -> RawToolArgs`

Modify raw args before validation (e.g. JSON repair)

[`after_tool_validate`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_tool_validate)

`(ctx: RunContext, *, call: ToolCallPart, tool_def: ToolDefinition, args: ValidatedToolArgs) -> ValidatedToolArgs`

Modify validated args

[`wrap_tool_validate`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_tool_validate)

`(ctx: RunContext, *, call: ToolCallPart, tool_def: ToolDefinition, args: RawToolArgs, handler: WrapToolValidateHandler) -> ValidatedToolArgs`

Wrap the validation step

[`on_tool_validate_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_tool_validate_error)

`(ctx: RunContext, *, call: ToolCallPart, tool_def: ToolDefinition, args: RawToolArgs, error: Exception) -> ValidatedToolArgs`

Handle validation errors (see [error hooks](#error-hooks))

To skip validation and provide pre-validated args, raise [`SkipToolValidation(args)`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.SkipToolValidation) from `before_tool_validate` or `wrap_tool_validate`.

**Execution hooks** -- `args` is always the validated `dict[str, Any]`:

Hook

Signature

Purpose

[`before_tool_execute`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.before_tool_execute)

`(ctx: RunContext, *, call: ToolCallPart, tool_def: ToolDefinition, args: ValidatedToolArgs) -> ValidatedToolArgs`

Modify args before execution

[`after_tool_execute`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_tool_execute)

`(ctx: RunContext, *, call: ToolCallPart, tool_def: ToolDefinition, args: ValidatedToolArgs, result: Any) -> Any`

Modify execution result

[`wrap_tool_execute`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_tool_execute)

`(ctx: RunContext, *, call: ToolCallPart, tool_def: ToolDefinition, args: ValidatedToolArgs, handler: WrapToolExecuteHandler) -> Any`

Wrap execution

[`on_tool_execute_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_tool_execute_error)

`(ctx: RunContext, *, call: ToolCallPart, tool_def: ToolDefinition, args: ValidatedToolArgs, error: Exception) -> Any`

Handle execution errors (see [error hooks](#error-hooks))

To skip execution and provide a replacement result, raise [`SkipToolExecution(result)`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.SkipToolExecution) from `before_tool_execute` or `wrap_tool_execute`.

#### Output hooks

Like tool processing, [output](/docs/ai/core-concepts/output) processing has two phases: **validation** (parsing the model's raw output against the output schema) and **processing** (extracting the value and calling any [output function](/docs/ai/core-concepts/output#output-functions)). Each phase has its own hooks.

All output hooks receive an `output_context` parameter with [`OutputContext`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.OutputContext) (mode, output type, schema info, and tool call details for [tool output](/docs/ai/core-concepts/output#tool-output)).

**Validate hooks** fire only for structured output that requires parsing (prompted, native, tool, union output). They do not fire for plain text or image output. **Process hooks** fire for **all output types** including text, structured, and image output. For [tool output](/docs/ai/core-concepts/output#tool-output), only output hooks fire -- tool hooks are skipped entirely.

**Validation hooks** -- fire for structured output only; `output` is `str` (raw text) or `dict` (tool args):

Hook

Signature

Purpose

[`before_output_validate`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.before_output_validate)

`(ctx, *, output_context, output: RawOutput) -> RawOutput`

Modify raw output before validation (e.g. JSON repair)

[`after_output_validate`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_output_validate)

`(ctx, *, output_context, output: Any) -> Any`

Modify validated output

[`wrap_output_validate`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_output_validate)

`(ctx, *, output_context, output: RawOutput, handler) -> Any`

Wrap the validation step

[`on_output_validate_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_output_validate_error)

`(ctx, *, output_context, output: RawOutput, error: ValidationError | ModelRetry) -> Any`

Handle validation errors (see [error hooks](#error-hooks))

**Processing hooks** -- fire for all output types; `output` is the validated/raw output. Output validators ([`@agent.output_validator`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.output_validator)) run inside the processing pipeline (within `wrap_output_process`), so `after_output_process` sees the fully validated result:

Hook

Signature

Purpose

[`before_output_process`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.before_output_process)

`(ctx, *, output_context, output: Any) -> Any`

Modify output before processing

[`after_output_process`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.after_output_process)

`(ctx, *, output_context, output: Any) -> Any`

Modify processed result

[`wrap_output_process`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_output_process)

`(ctx, *, output_context, output: Any, handler) -> Any`

Wrap processing

[`on_output_process_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_output_process_error)

`(ctx, *, output_context, output: Any, error: Exception) -> Any`

Handle processing errors (see [error hooks](#error-hooks))

Output validate and process hooks can raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to ask the model to try again with a custom message -- the same pattern used in [output functions](/docs/ai/core-concepts/output#output-functions) and [output validators](/docs/ai/core-concepts/output#output-validator-functions). See [Triggering retries with `ModelRetry`](/docs/ai/core-concepts/hooks#triggering-retries-with-modelretry) for the full pattern.

#### Tool preparation

Capabilities can filter or modify which tool definitions the model sees on each step via two hooks:

-   [`prepare_tools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.prepare_tools) -- receives **function** tools only. Use this for filtering or modifications to tools the model can call directly.
-   [`prepare_output_tools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.prepare_output_tools) -- receives [output tools](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.ToolOutput) only, with `ctx.retry`/`ctx.max_retries` reflecting the **output** retry budget (`output_retries`), matching the [output hook](#output-hooks) lifecycle.

Both hooks operate at the toolset level -- the result flows into both the model's request parameters and `ToolManager.tools`, so filtering also blocks tool execution.

prepare\_tools\_example.py

```python
from dataclasses import dataclass
from typing import Any

from pydantic_ai import Agent, RunContext, ToolDefinition
from pydantic_ai.capabilities import AbstractCapability


@dataclass
class HideDangerousTools(AbstractCapability[Any]):
    """Hides tools matching certain name prefixes from the model."""

    hidden_prefixes: tuple[str, ...] = ('delete_', 'drop_')

    async def prepare_tools(
        self, ctx: RunContext[Any], tool_defs: list[ToolDefinition]
    ) -> list[ToolDefinition]:
        return [td for td in tool_defs if not any(td.name.startswith(p) for p in self.hidden_prefixes)]


agent = Agent('openai:gpt-5.2', capabilities=[HideDangerousTools()])


@agent.tool_plain
def delete_file(path: str) -> str:
    """Delete a file."""
    return f'deleted {path}'


@agent.tool_plain
def read_file(path: str) -> str:
    """Read a file."""
    return f'contents of {path}'


result = agent.run_sync('hello')
# The model only sees `read_file`, not `delete_file`
```

For simple cases, the built-in [`PrepareTools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.PrepareTools) / [`PrepareOutputTools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.PrepareOutputTools) capabilities wrap a callable without a custom subclass -- these are also what the agent-level [`prepare_tools=`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolsPrepareFunc) / [`prepare_output_tools=`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolsPrepareFunc) kwargs inject, so kwarg and capability share one execution path.

#### Event stream hook

For runs with event streaming ([`run_stream_events`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream_events), [`event_stream_handler`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__), [UI event streams](/docs/ai/integrations/ui/overview)), capabilities can observe or transform the event stream:

Hook

Signature

Purpose

[`wrap_run_event_stream`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.wrap_run_event_stream)

`(ctx: RunContext, *, stream: AsyncIterable[AgentStreamEvent]) -> AsyncIterable[AgentStreamEvent]`

Observe, filter, or transform streamed events

event\_stream\_example.py

```python
from collections.abc import AsyncIterable
from dataclasses import dataclass
from typing import Any

from pydantic_ai import AgentStreamEvent, RunContext
from pydantic_ai.capabilities import AbstractCapability
from pydantic_ai.messages import (
    PartStartEvent,
    TextPart,
    ToolCallEvent,
    ToolResultEvent,
)


@dataclass
class StreamAuditor(AbstractCapability[Any]):
    """Logs tool calls and text output during streamed runs."""

    async def wrap_run_event_stream(
        self,
        ctx: RunContext[Any],
        *,
        stream: AsyncIterable[AgentStreamEvent],
    ) -> AsyncIterable[AgentStreamEvent]:
        async for event in stream:
            if isinstance(event, ToolCallEvent):
                print(f'Tool called: {event.part.tool_name}')
            elif isinstance(event, ToolResultEvent):
                print(f'Tool result: {event.part.content!r}')
            elif isinstance(event, PartStartEvent) and isinstance(event.part, TextPart):
                print(f'Text: {event.part.content!r}')
            yield event
```

Matching against [`ToolCallEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallEvent) and [`ToolResultEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolResultEvent) handles both function tool calls ([`FunctionToolCallEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FunctionToolCallEvent) / [`FunctionToolResultEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FunctionToolResultEvent)) and output tool calls ([`OutputToolCallEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.OutputToolCallEvent) / [`OutputToolResultEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.OutputToolResultEvent)). Match against the specific subclass when you need to treat them differently.

Migration from `FunctionToolCallEvent` / `FunctionToolResultEvent`

For output tool calls, match `OutputToolCallEvent` / `OutputToolResultEvent` (or the shared `ToolCallEvent` / `ToolResultEvent` bases). `FunctionToolCallEvent` / `FunctionToolResultEvent` will stop firing for output tool calls in v2.

For building web UIs that transform streamed events into protocol-specific formats (like SSE), see the [UI event streams](/docs/ai/integrations/ui/overview) documentation and the [`UIEventStream`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream) base class.

#### Error hooks

Each lifecycle point has an `on_*_error` hook -- the error counterpart to `after_*`. While `after_*` hooks fire on success, `on_*_error` hooks fire on failure (after `wrap_*` has had its chance to recover):

```
before_X → wrap_X(handler)
  ├─ success ─────────→ after_X (modify result)
  └─ failure → on_X_error
        ├─ re-raise ──→ (error propagates, after_X not called)
        └─ recover ───→ after_X (modify recovered result)
```

Error hooks use **raise-to-propagate, return-to-recover** semantics:

-   **Raise the original error** -- propagates the error unchanged _(default)_
-   **Raise a different exception** -- transforms the error
-   **Return a result** -- suppresses the error and uses the returned value

Hook

Fires when

Recovery type

[`on_run_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_run_error)

Agent run fails

Return [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult)

[`on_node_run_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_node_run_error)

Graph node fails

Return next node or [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End)

[`on_model_request_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_model_request_error)

Model request fails

Return [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse)

[`on_tool_validate_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_tool_validate_error)

Tool validation fails

Return validated args `dict`

[`on_tool_execute_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_tool_execute_error)

Tool execution fails

Return any tool result

[`on_output_validate_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_output_validate_error)

Output validation fails

Return validated output

[`on_output_process_error`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.on_output_process_error)

Output execution fails

Return any output result

error\_hooks\_example.py

```python
from dataclasses import dataclass, field
from typing import Any

from pydantic_ai import ModelRequestContext, RunContext
from pydantic_ai.capabilities import AbstractCapability
from pydantic_ai.messages import ModelResponse, TextPart


@dataclass
class ErrorLogger(AbstractCapability[Any]):
    """Logs all errors that occur during agent runs."""

    errors: list[str] = field(default_factory=list)

    async def on_model_request_error(
        self, ctx: RunContext[Any], *, request_context: ModelRequestContext, error: Exception
    ) -> ModelResponse:
        self.errors.append(f'Model error: {error}')
        # Return a fallback response to recover
        return ModelResponse(parts=[TextPart(content='Service temporarily unavailable.')])

    async def on_tool_execute_error(
        self, ctx: RunContext[Any], *, call: Any, tool_def: Any, args: dict[str, Any], error: Exception
    ) -> Any:
        self.errors.append(f'Tool {call.tool_name} failed: {error}')
        raise error  # Re-raise to let the normal retry flow handle it
```

#### Deferred tool calls

Capabilities can resolve [deferred tool calls](/docs/ai/tools-toolsets/deferred-tools) -- calls that require approval, or that are executed externally -- directly from the agent run, without ending the run and waiting for a follow-up:

Hook

Signature

Purpose

[`handle_deferred_tool_calls`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.handle_deferred_tool_calls)

`(ctx: RunContext, *, requests: DeferredToolRequests) -> DeferredToolResults | None`

Resolve some or all pending approval/external calls inline

Multiple capabilities can each handle a subset: dispatch accumulates results across the chain, passing only the still-unresolved requests to the next capability. Returning `None` (or a [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) with no entries) declines handling. Anything still unresolved bubbles up as a [`DeferredToolRequests`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolRequests) output for the caller to handle.

For application code that just needs to plug in a handler, use the dedicated [`HandleDeferredToolCalls`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.HandleDeferredToolCalls) capability -- see [Resolving deferred calls with a handler](/docs/ai/tools-toolsets/deferred-tools#resolving-deferred-calls-with-a-handler).

### Wrapping capabilities

[`WrapperCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WrapperCapability) wraps another capability and delegates all methods to it -- similar to [`WrapperToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.WrapperToolset) for toolsets. Subclass it to override specific methods while delegating the rest:

wrapper\_capability\_example.py

```python
from dataclasses import dataclass
from typing import Any

from pydantic_ai import ModelRequestContext, RunContext
from pydantic_ai.capabilities import WrapperCapability


@dataclass
class AuditedCapability(WrapperCapability[Any]):
    """Wraps any capability and logs its model requests."""

    async def before_model_request(
        self, ctx: RunContext[Any], request_context: ModelRequestContext
    ) -> ModelRequestContext:
        print(f'Request from {type(self.wrapped).__name__}')
        return await super().before_model_request(ctx, request_context)
```

The built-in [`PrefixTools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.PrefixTools) is an example of a `WrapperCapability` -- it wraps another capability and prefixes its tool names.

### Per-run state isolation

By default, a capability instance is shared across all runs of an agent. If your capability accumulates mutable state that should not leak between runs, override [`for_run`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.for_run) to return a fresh instance:

per\_run\_state.py

```python
from dataclasses import dataclass
from typing import Any

from pydantic_ai import Agent, ModelRequestContext, RunContext
from pydantic_ai.capabilities import AbstractCapability


@dataclass
class RequestCounter(AbstractCapability[Any]):
    """Counts model requests per run."""

    count: int = 0

    async def for_run(self, ctx: RunContext[Any]) -> 'RequestCounter':
        return RequestCounter()  # fresh instance for each run

    async def before_model_request(
        self, ctx: RunContext[Any], request_context: ModelRequestContext
    ) -> ModelRequestContext:
        self.count += 1
        return request_context


counter = RequestCounter()
agent = Agent('openai:gpt-5.2', capabilities=[counter])

# The shared counter stays at 0 because for_run returns a fresh instance
agent.run_sync('first run')
agent.run_sync('second run')
print(counter.count)
#> 0
```

### Dynamically building a capability

Capabilities can be built dynamically ahead of each agent run using a function that takes the agent [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns a capability or `None`. This is useful when the capability -- its instructions, model settings, hooks, or contributed toolset -- depends on information specific to a run, like its [dependencies](/docs/ai/core-concepts/dependencies).

To register a dynamic capability, pass a function that takes [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) to the `capabilities` argument of the [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) constructor or `agent.run()`. Sync and async functions are both supported. The function is called once per run and the returned capability replaces it for the rest of the run, so its instructions, model settings, toolsets, builtin tools, and hooks all flow through normally.

dynamic\_capability.py

```python
from dataclasses import dataclass
from typing import Literal

from pydantic_ai import Agent, RunContext
from pydantic_ai.capabilities import AbstractCapability
from pydantic_ai.models.test import TestModel


@dataclass
class Skill(AbstractCapability[str]):
    """Per-user skill loaded from a database at run time."""

    name: str
    role: Literal['admin', 'guest']

    def get_instructions(self) -> str:
        return f'You can use the {self.name} skill (role: {self.role}).'


# Pretend this comes from a database keyed by user.
SKILLS = {
    'alice': Skill(name='refunds', role='admin'),
    'bob': Skill(name='lookup', role='guest'),
}


def user_skill(ctx: RunContext[str]) -> AbstractCapability[str] | None:
    return SKILLS.get(ctx.deps)


agent = Agent(TestModel(), deps_type=str, capabilities=[user_skill])

result = agent.run_sync('hi', deps='alice')
print(result.all_messages()[0].instructions)
#> You can use the refunds skill (role: admin).
```

_(This example is complete, it can be run "as is")_

To return more than one capability from a single factory, wrap them in a [`CombinedCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.CombinedCapability).

Durable execution (Temporal, DBOS, Prefect)

A dynamic capability whose resolved capability contributes only instructions, model settings, builtin tools, hooks, or `prepare_tools`/`get_wrapper_toolset` (i.e. no `get_toolset()` of its own) works seamlessly with durable execution -- the factory runs in the workflow alongside the rest of the agent loop. This covers the common "load this user's skill from the database and add its instructions" pattern.

However, dynamic capabilities that contribute their own toolset via `get_toolset()` are not yet supported with durable execution. The toolset is only known at run time, so it bypasses the durable wrapper's construction-time toolset registration and would attempt I/O directly inside the workflow. As a workaround, register the toolsets statically via `Agent(toolsets=[...])` (where they get wrapped properly) and have the dynamic capability reference them indirectly -- e.g. via [`prepare_tools`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.prepare_tools) to scope which tools are visible per-run, rather than constructing the toolset inside the factory. Full support is tracked in [#5253](https://github.com/pydantic/pydantic-ai/issues/5253).

### Composition and middleware semantics

When multiple capabilities are passed to an agent, they are composed into a single [`CombinedCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.CombinedCapability) that follows **middleware semantics** -- the same pattern used by web frameworks like Django and Starlette:

-   **Configuration** is merged: instructions concatenate, model settings merge additively (later capabilities override earlier ones), toolsets combine, builtin tools collect.
-   **`before_*`** hooks fire in capability order (outermost to innermost): `cap1 → cap2 → cap3`.
-   **`after_*`** hooks fire in reverse order (innermost to outermost): `cap3 → cap2 → cap1`.
-   **`wrap_*`** hooks nest as middleware: `cap1` wraps `cap2` wraps `cap3` wraps the actual operation. The first capability is the **outermost** layer.
-   **`get_wrapper_toolset`** follows the same nesting: the first capability's wrapper is outermost.

This means the first capability in the list has the first and last say on the operation -- it sees the original input before any other capability, and it sees the final output after all inner capabilities have processed it.

### Ordering

By default, capabilities are composed in the order you list them. When a capability needs to be at a specific position regardless of where the user lists it, override [`get_ordering`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_ordering) to return a [`CapabilityOrdering`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.CapabilityOrdering):

capability\_ordering\_example.py

```python
from dataclasses import dataclass
from typing import Any

from pydantic_ai.capabilities import (
    AbstractCapability,
    CapabilityOrdering,
    CombinedCapability,
)


@dataclass
class InstrumentationCapability(AbstractCapability[Any]):
    """Must wrap all other capabilities to trace everything."""

    def get_ordering(self) -> CapabilityOrdering:
        return CapabilityOrdering(position='outermost')


@dataclass
class PlainCapability(AbstractCapability[Any]):
    pass


# InstrumentationCapability ends up first regardless of list order
combined = CombinedCapability([PlainCapability(), InstrumentationCapability()])
assert type(combined.capabilities[0]) is InstrumentationCapability
```

The available constraints are:

-   **`position`** -- `'outermost'` or `'innermost'`. Places the capability in a tier before (or after) all capabilities without that position. Multiple capabilities can share a tier; original list order breaks ties within it.
-   **`wraps`** -- list of capabilities this one wraps around (is outside of). Each entry can be a capability **type** (matches all instances via `issubclass`) or a specific **instance** (matches by identity). Use when your capability needs to see the output of another: `CapabilityOrdering(wraps=[OtherCapability])`.
-   **`wrapped_by`** -- list of capabilities that wrap around this one (are outside of it). Accepts types or instances, like `wraps`. The inverse of `wraps`.
-   **`requires`** -- list of capability types that must be present. Raises [`UserError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UserError) if any are missing. Does not imply ordering.

When constraints are declared, [`CombinedCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.CombinedCapability) topologically sorts its children at construction time, preserving user-provided order as a tiebreaker.

[`Hooks`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Hooks) supports ordering via the `ordering` parameter, so you can declare ordering constraints without subclassing:

hooks\_ordering\_example.py

```python
from pydantic_ai.capabilities import CapabilityOrdering, CombinedCapability, Hooks

logging_hooks = Hooks(ordering=CapabilityOrdering(position='outermost'))
rate_limit_hooks = Hooks(ordering=CapabilityOrdering(wrapped_by=[logging_hooks]))

# logging_hooks ends up outermost; rate_limit_hooks is wrapped by it
combined = CombinedCapability([rate_limit_hooks, logging_hooks])
assert combined.capabilities[0] is logging_hooks
assert combined.capabilities[1] is rate_limit_hooks
```

## Examples

### Guardrail (PII redaction)

A guardrail is a capability that intercepts model requests or responses to enforce safety rules. Here's one that scans model responses for potential PII and redacts it:

guardrail\_example.py

```python
import re
from dataclasses import dataclass
from typing import Any

from pydantic_ai import Agent, ModelRequestContext, RunContext
from pydantic_ai.capabilities import AbstractCapability
from pydantic_ai.messages import ModelResponse, TextPart


@dataclass
class PIIRedactionGuardrail(AbstractCapability[Any]):
    """Redacts email addresses and phone numbers from model responses."""

    async def after_model_request(
        self,
        ctx: RunContext[Any],
        *,
        request_context: ModelRequestContext,
        response: ModelResponse,
    ) -> ModelResponse:
        for part in response.parts:
            if isinstance(part, TextPart):
                # Redact email addresses
                part.content = re.sub(
                    r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}',
                    '[EMAIL REDACTED]',
                    part.content,
                )
                # Redact phone numbers (simple US pattern)
                part.content = re.sub(
                    r'\b\d{3}[-.]?\d{3}[-.]?\d{4}\b',
                    '[PHONE REDACTED]',
                    part.content,
                )
        return response


agent = Agent('openai:gpt-5.2', capabilities=[PIIRedactionGuardrail()])
result = agent.run_sync("What's Jane's contact info?")
print(result.output)
#> You can reach Jane at [EMAIL REDACTED] or [PHONE REDACTED].
```

### Logging middleware

The `wrap_*` pattern is useful when you need to observe or time both the input and output of an operation. Here's a capability that logs every model request and tool call:

logging\_middleware\_example.py

```python
from dataclasses import dataclass
from typing import Any

from pydantic_ai import Agent, ModelRequestContext, RunContext, ToolDefinition
from pydantic_ai.capabilities import (
    AbstractCapability,
    WrapModelRequestHandler,
    WrapToolExecuteHandler,
)
from pydantic_ai.messages import ModelResponse, ToolCallPart


@dataclass
class VerboseLogging(AbstractCapability[Any]):
    """Logs model requests and tool executions."""

    async def wrap_model_request(
        self,
        ctx: RunContext[Any],
        *,
        request_context: ModelRequestContext,
        handler: WrapModelRequestHandler,
    ) -> ModelResponse:
        print(f'  Model request (step {ctx.run_step}, {len(request_context.messages)} messages)')
        #>   Model request (step 1, 1 messages)
        response = await handler(request_context)
        print(f'  Model response: {len(response.parts)} parts')
        #>   Model response: 1 parts
        return response

    async def wrap_tool_execute(
        self,
        ctx: RunContext[Any],
        *,
        call: ToolCallPart,
        tool_def: ToolDefinition,
        args: dict[str, Any],
        handler: WrapToolExecuteHandler,
    ) -> Any:
        print(f'  Tool call: {call.tool_name}({args})')
        result = await handler(args)
        print(f'  Tool result: {result!r}')
        return result


agent = Agent('openai:gpt-5.2', capabilities=[VerboseLogging()])
result = agent.run_sync('hello')
print(f'Output: {result.output}')
#> Output: Hello! How can I help you today?
```

## Pydantic AI Harness

[**Pydantic AI Harness**](/docs/ai/harness/overview) is the official capability library for Pydantic AI -- standalone capabilities like memory, guardrails, context management, and [code mode](https://github.com/pydantic/pydantic-ai-harness/tree/main/pydantic_ai_harness/code_mode) live there rather than in core. See [What goes where?](/docs/ai/harness/overview#what-goes-where) for the full breakdown, or jump to the [capability matrix](https://github.com/pydantic/pydantic-ai-harness#capability-matrix).

## Third-party capabilities

Capabilities are the recommended way for third-party packages to extend Pydantic AI, since they can bundle tools with hooks, instructions, and model settings. See [Extensibility](/docs/ai/guides/extensibility) for the full ecosystem, including [third-party toolsets](/docs/ai/tools-toolsets/toolsets#third-party-toolsets) that can also be wrapped as capabilities.

### Task Management

Capabilities for task planning and progress tracking help agents organize complex work:

-   [`pydantic-ai-todo`](https://github.com/vstorm-co/pydantic-ai-todo) - `TodoCapability` with `add_todo`, `read_todos`, `write_todos`, `update_todo_status`, and `remove_todo` tools. Supports subtasks, dependencies, and PostgreSQL persistence. Also available as a lower-level `TodoToolset`.

### Context Management

Capabilities for managing long conversations help agents stay within context limits:

-   [`summarization-pydantic-ai`](https://github.com/vstorm-co/summarization-pydantic-ai) - Four capabilities for managing long conversations: `ContextManagerCapability` (real-time token tracking, auto-compression at a configurable threshold, and large tool-output truncation); `SummarizationCapability` (LLM-powered history compression); `SlidingWindowCapability` (zero-cost message trimming); `LimitWarnerCapability` (injects a finish-soon hint before hard context limits). Also available as standalone `history_processors`: `SummarizationProcessor`, `SlidingWindowProcessor`, and `LimitWarnerProcessor`.

### Multi-Agent Orchestration

Capabilities for spawning and delegating to specialized subagents help agents tackle complex, parallelizable work:

-   [`subagents-pydantic-ai`](https://github.com/vstorm-co/subagents-pydantic-ai) - `SubAgentCapability` adds tools for multi-agent delegation: `task` (spawn a subagent), `check_task`, `wait_tasks`, `list_active_tasks`, `soft_cancel_task`, `hard_cancel_task`, and `answer_subagent`. Supports sync, async, and auto execution modes, nested subagents, and runtime agent creation. Also available as a lower-level toolset via `create_subagent_toolset`.

### Guardrails & Safety

Capabilities for cost control, input/output filtering, and tool permissions help keep agents safe and within budget:

-   [`pydantic-ai-shields`](https://github.com/vstorm-co/pydantic-ai-shields) - Ready-to-use guardrail capabilities: `CostTracking` (tracks token usage and USD cost per run, raises `BudgetExceededError` on budget overrun); `ToolGuard` (block or require approval for specific tools); `InputGuard` and `OutputGuard` (custom sync or async validation functions); `PromptInjection`, `PiiDetector`, `SecretRedaction`, `BlockedKeywords`, and `NoRefusals` content shields.

### File Operations & Sandboxing

Capabilities for filesystem access and sandboxed code execution help agents work with files and run code safely:

-   [`pydantic-ai-backend`](https://github.com/vstorm-co/pydantic-ai-backend) - `ConsoleCapability` registers `ls`, `read_file`, `write_file`, `edit_file`, `glob`, `grep`, and `execute` tools with a fine-grained permission system. Backends include `StateBackend` (in-memory, for testing), `LocalBackend` (real filesystem), `DockerSandbox` (isolated container execution), and `CompositeBackend` (routing across backends). Also available as a lower-level `ConsoleToolset`.

### Agent Skills

Capabilities that implement [Agent Skills](https://agentskills.io) support help agents efficiently discover and perform specific tasks:

-   [`pydantic-ai-skills`](https://github.com/DougTrajano/pydantic-ai-skills) - `SkillsCapability` implements Agent Skills support with progressive disclosure (load skills on-demand to reduce tokens). Supports filesystem and programmatic skills; compatible with [agentskills.io](https://agentskills.io).

To add your package to this page, open a pull request.

## Publishing capabilities

To make a custom capability usable in [agent specs](/docs/ai/core-concepts/agent-spec), it needs a [`get_serialization_name`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_serialization_name) (defaults to the class name) and a constructor that accepts serializable arguments. The default [`from_spec`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.from_spec) implementation calls `cls(*args, **kwargs)`, so for simple dataclasses no override is needed:

custom\_spec\_capability.py

```python
from dataclasses import dataclass
from typing import Any

from pydantic_ai import Agent, AgentSpec
from pydantic_ai.capabilities import AbstractCapability


@dataclass
class RateLimit(AbstractCapability[Any]):
    """Limits requests per minute."""

    rpm: int = 60


# In YAML: `- RateLimit: {rpm: 30}`
# In Python:
agent = Agent.from_spec(
    AgentSpec(model='test', capabilities=[{'RateLimit': {'rpm': 30}}]),
    custom_capability_types=[RateLimit],
)
```

Users register custom capability types via the `custom_capability_types` parameter on [`Agent.from_spec`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.from_spec) or [`Agent.from_file`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.from_file).

Override [`from_spec`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.from_spec) when the constructor takes types that can't be represented in YAML/JSON. The spec fields should mirror the dataclass fields, but with serializable types:

from\_spec\_override\_example.py

```python
from collections.abc import Callable
from dataclasses import dataclass, field
from typing import Any

from pydantic_ai import RunContext, ToolDefinition
from pydantic_ai.capabilities import AbstractCapability


@dataclass
class ConditionalTools(AbstractCapability[Any]):
    """Hides tools unless a condition is met."""

    condition: Callable[[RunContext[Any]], bool]  # not serializable
    hidden_tools: list[str] = field(default_factory=list)

    @classmethod
    def from_spec(cls, hidden_tools: list[str]) -> 'ConditionalTools[Any]':
        # In the spec, there's no condition callable -- always hide
        return cls(condition=lambda ctx: True, hidden_tools=hidden_tools)

    async def prepare_tools(
        self, ctx: RunContext[Any], tool_defs: list[ToolDefinition]
    ) -> list[ToolDefinition]:
        if self.condition(ctx):
            return [td for td in tool_defs if td.name not in self.hidden_tools]
        return tool_defs
```

In YAML this would be `- ConditionalTools: {hidden_tools: [dangerous_tool]}`. In Python code, the full constructor is available: `ConditionalTools(condition=my_check, hidden_tools=['dangerous_tool'])`.

See [Extensibility](/docs/ai/guides/extensibility) for packaging conventions and the broader extension ecosystem.

---

# [Dependencies](https://pydantic.dev/docs/ai/core-concepts/dependencies/)

# Dependencies

Pydantic AI uses a dependency injection system to provide data and services to your agent's [system prompts](/docs/ai/core-concepts/agent#system-prompts), [tools](/docs/ai/tools-toolsets/tools) and [output validators](/docs/ai/core-concepts/output#output-validator-functions).

Matching Pydantic AI's design philosophy, our dependency system tries to use existing best practice in Python development rather than inventing esoteric "magic", this should make dependencies type-safe, understandable, easier to test, and ultimately easier to deploy in production.

## Defining Dependencies

Dependencies can be any python type. While in simple cases you might be able to pass a single object as a dependency (e.g. an HTTP connection), [dataclasses](https://docs.python.org/3/library/dataclasses.html#module-dataclasses) are generally a convenient container when your dependencies included multiple objects.

Here's an example of defining an agent that requires dependencies.

(**Note:** dependencies aren't actually used in this example, see [Accessing Dependencies](#accessing-dependencies) below)

unused\_dependencies.py

```python
from dataclasses import dataclass

import httpx

from pydantic_ai import Agent


@dataclass
class MyDeps:  # (1)
  api_key: str
  http_client: httpx.AsyncClient


agent = Agent(
  'openai:gpt-5.2',
  deps_type=MyDeps,  # (2)
)


async def main():
  async with httpx.AsyncClient() as client:
      deps = MyDeps('foobar', client)
      result = await agent.run(
          'Tell me a joke.',
          deps=deps,  # (3)
      )
      print(result.output)
      #> Did you hear about the toothpaste scandal? They called it Colgate.
```

Define a dataclass to hold dependencies.

Pass the dataclass type to the `deps_type` argument of the [`Agent` constructor](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__). **Note**: we're passing the type here, NOT an instance, this parameter is not actually used at runtime, it's here so we can get full type checking of the agent.

When running the agent, pass an instance of the dataclass to the `deps` parameter.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

## Accessing Dependencies

Dependencies are accessed through the [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) type, this should be the first parameter of system prompt functions etc.

system\_prompt\_dependencies.py

```python
from dataclasses import dataclass

import httpx

from pydantic_ai import Agent, RunContext


@dataclass
class MyDeps:
  api_key: str
  http_client: httpx.AsyncClient


agent = Agent(
  'openai:gpt-5.2',
  deps_type=MyDeps,
)


@agent.system_prompt  # (1)
async def get_system_prompt(ctx: RunContext[MyDeps]) -> str:  # (2)
  response = await ctx.deps.http_client.get(  # (3)
      'https://example.com',
      headers={'Authorization': f'Bearer {ctx.deps.api_key}'},  # (4)
  )
  response.raise_for_status()
  return f'Prompt: {response.text}'


async def main():
  async with httpx.AsyncClient() as client:
      deps = MyDeps('foobar', client)
      result = await agent.run('Tell me a joke.', deps=deps)
      print(result.output)
      #> Did you hear about the toothpaste scandal? They called it Colgate.
```

[`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) may optionally be passed to a [`system_prompt`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.system_prompt) function as the only argument.

[`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) is parameterized with the type of the dependencies, if this type is incorrect, static type checkers will raise an error.

Access dependencies through the [`.deps`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.deps) attribute.

Access dependencies through the [`.deps`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.deps) attribute.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

In addition to [`.deps`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.deps), [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) provides access to the running agent via [`.agent`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.agent), which is useful when [tools](/docs/ai/tools-toolsets/tools), [hooks](/docs/ai/core-concepts/hooks), or [capabilities](/docs/ai/core-concepts/capabilities) need to read agent properties like [`name`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.name) or [`output_type`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.output_type).

Dependency fields can also be referenced in instructions and descriptions via [template strings](/docs/ai/core-concepts/agent-spec#template-strings) -- for example, `TemplateStr('Hello {{name}}')` renders `name` from the deps object at runtime. This is especially useful in [agent specs](/docs/ai/core-concepts/agent-spec) where callables aren't available.

### Asynchronous vs. Synchronous dependencies

[System prompt functions](/docs/ai/core-concepts/agent#system-prompts), [function tools](/docs/ai/tools-toolsets/tools) and [output validators](/docs/ai/core-concepts/output#output-validator-functions) are all run in the async context of an agent run.

If these functions are not coroutines (e.g. `async def`) they are called with [`run_in_executor`](https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.run_in_executor) in a thread pool. It's therefore marginally preferable to use `async` methods where dependencies perform IO, although synchronous dependencies should work fine too.

`run` vs. `run_sync` and Asynchronous vs. Synchronous dependencies

Whether you use synchronous or asynchronous dependencies is completely independent of whether you use `run` or `run_sync` -- `run_sync` is just a wrapper around `run` and agents are always run in an async context.

Here's the same example as above, but with a synchronous dependency:

sync\_dependencies.py

```python
from dataclasses import dataclass

import httpx

from pydantic_ai import Agent, RunContext


@dataclass
class MyDeps:
  api_key: str
  http_client: httpx.Client  # (1)


agent = Agent(
  'openai:gpt-5.2',
  deps_type=MyDeps,
)


@agent.system_prompt
def get_system_prompt(ctx: RunContext[MyDeps]) -> str:  # (2)
  response = ctx.deps.http_client.get(
      'https://example.com', headers={'Authorization': f'Bearer {ctx.deps.api_key}'}
  )
  response.raise_for_status()
  return f'Prompt: {response.text}'


async def main():
  deps = MyDeps('foobar', httpx.Client())
  result = await agent.run(
      'Tell me a joke.',
      deps=deps,
  )
  print(result.output)
  #> Did you hear about the toothpaste scandal? They called it Colgate.
```

Here we use a synchronous `httpx.Client` instead of an asynchronous `httpx.AsyncClient`.

To match the synchronous dependency, the system prompt function is now a plain function, not a coroutine.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

## Full Example

As well as system prompts, dependencies can be used in [tools](/docs/ai/tools-toolsets/tools) and [output validators](/docs/ai/core-concepts/output#output-validator-functions).

full\_example.py

```python
from dataclasses import dataclass

import httpx

from pydantic_ai import Agent, ModelRetry, RunContext


@dataclass
class MyDeps:
  api_key: str
  http_client: httpx.AsyncClient


agent = Agent(
  'openai:gpt-5.2',
  deps_type=MyDeps,
)


@agent.system_prompt
async def get_system_prompt(ctx: RunContext[MyDeps]) -> str:
  response = await ctx.deps.http_client.get('https://example.com')
  response.raise_for_status()
  return f'Prompt: {response.text}'


@agent.tool  # (1)
async def get_joke_material(ctx: RunContext[MyDeps], subject: str) -> str:
  response = await ctx.deps.http_client.get(
      'https://example.com#jokes',
      params={'subject': subject},
      headers={'Authorization': f'Bearer {ctx.deps.api_key}'},
  )
  response.raise_for_status()
  return response.text


@agent.output_validator  # (2)
async def validate_output(ctx: RunContext[MyDeps], output: str) -> str:
  response = await ctx.deps.http_client.post(
      'https://example.com#validate',
      headers={'Authorization': f'Bearer {ctx.deps.api_key}'},
      params={'query': output},
  )
  if response.status_code == 400:
      raise ModelRetry(f'invalid response: {response.text}')
  response.raise_for_status()
  return output


async def main():
  async with httpx.AsyncClient() as client:
      deps = MyDeps('foobar', client)
      result = await agent.run('Tell me a joke.', deps=deps)
      print(result.output)
      #> Did you hear about the toothpaste scandal? They called it Colgate.
```

To pass `RunContext` to a tool, use the [`tool`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool) decorator.

`RunContext` may optionally be passed to a [`output_validator`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.output_validator) function as the first argument.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

## Overriding Dependencies

When testing agents, it's useful to be able to customise dependencies.

While this can sometimes be done by calling the agent directly within unit tests, we can also override dependencies while calling application code which in turn calls the agent.

This is done via the [`override`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.override) method on the agent.

joke\_app.py

```python
from dataclasses import dataclass

import httpx

from pydantic_ai import Agent, RunContext


@dataclass
class MyDeps:
  api_key: str
  http_client: httpx.AsyncClient

  async def system_prompt_factory(self) -> str:  # (1)
      response = await self.http_client.get('https://example.com')
      response.raise_for_status()
      return f'Prompt: {response.text}'


joke_agent = Agent('openai:gpt-5.2', deps_type=MyDeps)


@joke_agent.system_prompt
async def get_system_prompt(ctx: RunContext[MyDeps]) -> str:
  return await ctx.deps.system_prompt_factory()  # (2)


async def application_code(prompt: str) -> str:  # (3)
  ...
  ...
  # now deep within application code we call our agent
  async with httpx.AsyncClient() as client:
      app_deps = MyDeps('foobar', client)
      result = await joke_agent.run(prompt, deps=app_deps)  # (4)
  return result.output
```

Define a method on the dependency to make the system prompt easier to customise.

Call the system prompt factory from within the system prompt function.

Application code that calls the agent, in a real application this might be an API endpoint.

Call the agent from within the application code, in a real application this call might be deep within a call stack. Note `app_deps` here will NOT be used when deps are overridden.

_(This example is complete, it can be run "as is")_

test\_joke\_app.py

```python
from joke_app import MyDeps, application_code, joke_agent


class TestMyDeps(MyDeps):  # (1)
  async def system_prompt_factory(self) -> str:
      return 'test prompt'


async def test_application_code():
  test_deps = TestMyDeps('test_key', None)  # (2)
  with joke_agent.override(deps=test_deps):  # (3)
      joke = await application_code('Tell me a joke.')  # (4)
  assert joke.startswith('Did you hear about the toothpaste scandal?')
```

Define a subclass of `MyDeps` in tests to customise the system prompt factory.

Create an instance of the test dependency, we don't need to pass an `http_client` here as it's not used.

Override the dependencies of the agent for the duration of the `with` block, `test_deps` will be used when the agent is run.

Now we can safely call our application code, the agent will use the overridden dependencies.

## Examples

The following examples demonstrate how to use dependencies in Pydantic AI:

-   [Weather Agent](/docs/ai/examples/weather-agent)
-   [SQL Generation](/docs/ai/examples/sql-gen)
-   [RAG](/docs/ai/examples/rag)

---

# [Direct Model Requests](https://pydantic.dev/docs/ai/core-concepts/direct/)

# Direct Model Requests

The `direct` module provides low-level methods for making imperative requests to LLMs where the only abstraction is input and output schema translation, enabling you to use all models with the same API.

These methods are thin wrappers around the [`Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) implementations, offering a simpler interface when you don't need the full functionality of an [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent).

The following functions are available:

-   [`model_request`](/docs/ai/api/pydantic-ai/direct/#pydantic_ai.direct.model_request): Make a non-streamed async request to a model
-   [`model_request_sync`](/docs/ai/api/pydantic-ai/direct/#pydantic_ai.direct.model_request_sync): Make a non-streamed synchronous request to a model
-   [`model_request_stream`](/docs/ai/api/pydantic-ai/direct/#pydantic_ai.direct.model_request_stream): Make a streamed async request to a model
-   [`model_request_stream_sync`](/docs/ai/api/pydantic-ai/direct/#pydantic_ai.direct.model_request_stream_sync): Make a streamed sync request to a model

## Basic Example

Here's a simple example demonstrating how to use the direct API to make a basic request:

direct\_basic.py

```python
from pydantic_ai import ModelRequest
from pydantic_ai.direct import model_request_sync

# Make a synchronous request to the model
model_response = model_request_sync(
    'anthropic:claude-haiku-4-5',
    [ModelRequest.user_text_prompt('What is the capital of France?')]
)

print(model_response.parts[0].content)
#> The capital of France is Paris.
print(model_response.usage)
#> RequestUsage(input_tokens=56, output_tokens=7)
```

_(This example is complete, it can be run "as is")_

## Advanced Example with Tool Calling

You can also use the direct API to work with function/tool calling.

Even here we can use Pydantic to generate the JSON schema for the tool:

```python
from typing import Literal

from pydantic import BaseModel

from pydantic_ai import ModelRequest, ToolDefinition
from pydantic_ai.direct import model_request
from pydantic_ai.models import ModelRequestParameters


class Divide(BaseModel):
    """Divide two numbers."""

    numerator: float
    denominator: float
    on_inf: Literal['error', 'infinity'] = 'infinity'


async def main():
    # Make a request to the model with tool access
    model_response = await model_request(
        'openai:gpt-5-nano',
        [ModelRequest.user_text_prompt('What is 123 / 456?')],
        model_request_parameters=ModelRequestParameters(
            function_tools=[
                ToolDefinition(
                    name=Divide.__name__.lower(),
                    description=Divide.__doc__,
                    parameters_json_schema=Divide.model_json_schema(),
                )
            ],
            allow_text_output=True,  # Allow model to either use tools or respond directly
        ),
    )
    print(model_response)
    """
    ModelResponse(
        parts=[
            ToolCallPart(
                tool_name='divide',
                args={'numerator': '123', 'denominator': '456'},
                tool_call_id='pyd_ai_2e0e396768a14fe482df90a29a78dc7b',
            )
        ],
        usage=RequestUsage(input_tokens=55, output_tokens=7),
        model_name='gpt-5-nano',
        timestamp=datetime.datetime(...),
    )
    """
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

## When to Use the direct API vs Agent

The direct API is ideal when:

1.  You need more direct control over model interactions
2.  You want to implement custom behavior around model requests
3.  You're building your own abstractions on top of model interactions

For most application use cases, the higher-level [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) API provides a more convenient interface with additional features such as built-in tool execution, retrying, structured output parsing, and more.

## OpenTelemetry or Logfire Instrumentation

As with [agents](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent), you can enable OpenTelemetry/Logfire instrumentation with just a few extra lines

direct\_instrumented.py

```python
import logfire

from pydantic_ai import ModelRequest
from pydantic_ai.direct import model_request_sync

logfire.configure()
logfire.instrument_pydantic_ai()

# Make a synchronous request to the model
model_response = model_request_sync(
    'anthropic:claude-haiku-4-5',
    [ModelRequest.user_text_prompt('What is the capital of France?')],
)

print(model_response.parts[0].content)
#> The capital of France is Paris.
```

_(This example is complete, it can be run "as is")_

You can also enable OpenTelemetry on a per call basis:

direct\_instrumented.py

```python
import logfire

from pydantic_ai import ModelRequest
from pydantic_ai.direct import model_request_sync

logfire.configure()

# Make a synchronous request to the model
model_response = model_request_sync(
    'anthropic:claude-haiku-4-5',
    [ModelRequest.user_text_prompt('What is the capital of France?')],
    instrument=True
)

print(model_response.parts[0].content)
#> The capital of France is Paris.
```

See [Debugging and Monitoring](/docs/ai/integrations/logfire) for more details, including how to instrument with plain OpenTelemetry without Logfire.

---

# [Hooks](https://pydantic.dev/docs/ai/core-concepts/hooks/)

# Hooks

Hooks let you intercept and modify agent behavior at every stage of a run -- model requests, tool calls, streaming events -- using simple decorators or constructor arguments. No subclassing needed.

The [`Hooks`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Hooks) capability is the recommended way to add [lifecycle hooks](/docs/ai/core-concepts/capabilities#hooking-into-the-lifecycle) for application-level concerns like logging, metrics, and lightweight validation. For reusable capabilities that combine hooks with tools, instructions, or model settings, subclass [`AbstractCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability) instead -- see [Building custom capabilities](/docs/ai/core-concepts/capabilities#building-custom-capabilities).

## Quick start

Create a [`Hooks`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Hooks) instance, register hooks via `@hooks.on.*` decorators, and pass it to your agent:

hooks\_decorator.py

```python
from pydantic_ai import Agent, ModelRequestContext, RunContext
from pydantic_ai.capabilities import Hooks

hooks = Hooks()


@hooks.on.before_model_request
async def log_request(ctx: RunContext[None], request_context: ModelRequestContext) -> ModelRequestContext:
    print(f'Sending {len(request_context.messages)} messages to the model')
    #> Sending 1 messages to the model
    return request_context


agent = Agent('test', capabilities=[hooks])
result = agent.run_sync('Hello!')
print(result.output)
#> success (no tool calls)
```

## Registering hooks

### Decorator registration

The `hooks.on` namespace provides decorator methods for every lifecycle hook. Use them as bare decorators or with parameters:

```python
# Bare decorator
@hooks.on.before_model_request
async def my_hook(ctx, request_context):
    return request_context

# With parameters (timeout, tool filter)
@hooks.on.before_model_request(timeout=5.0)
async def my_timed_hook(ctx, request_context):
    return request_context
```

Multiple hooks can be registered for the same event -- they fire in registration order.

### Constructor kwargs

You can also pass hook functions directly to the [`Hooks`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Hooks) constructor:

hooks\_constructor.py

```python
from pydantic_ai import Agent, ModelRequestContext, RunContext
from pydantic_ai.capabilities import Hooks


async def log_request(ctx: RunContext[None], request_context: ModelRequestContext) -> ModelRequestContext:
    print(f'Sending {len(request_context.messages)} messages to the model')
    #> Sending 1 messages to the model
    return request_context


agent = Agent('test', capabilities=[Hooks(before_model_request=log_request)])
result = agent.run_sync('Hello!')
print(result.output)
#> success (no tool calls)
```

Both sync and async hook functions are accepted. Sync functions are automatically wrapped for async execution.

## Hook types

### Run hooks

`hooks.on.`

Constructor kwarg

`AbstractCapability` method

`before_run`

`before_run=`

`before_run`

`after_run`

`after_run=`

`after_run`

`run`

`run=`

`wrap_run`

`run_error`

`run_error=`

`on_run_error`

Run hooks fire once per agent run. `wrap_run` (registered via `hooks.on.run`) wraps the entire run and supports error recovery.

### Node hooks

`hooks.on.`

Constructor kwarg

`AbstractCapability` method

`before_node_run`

`before_node_run=`

`before_node_run`

`after_node_run`

`after_node_run=`

`after_node_run`

`node_run`

`node_run=`

`wrap_node_run`

`node_run_error`

`node_run_error=`

`on_node_run_error`

Node hooks fire for each graph step (`UserPromptNode`, `ModelRequestNode`, `CallToolsNode`).

Note

`wrap_node_run` hooks are called automatically by [`agent.run()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), [`agent.run_stream()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream), and [`agent_run.next()`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRun.next), but **not** when iterating with bare `async for node in agent_run:`.

### Model request hooks

`hooks.on.`

Constructor kwarg

`AbstractCapability` method

`before_model_request`

`before_model_request=`

`before_model_request`

`after_model_request`

`after_model_request=`

`after_model_request`

`model_request`

`model_request=`

`wrap_model_request`

`model_request_error`

`model_request_error=`

`on_model_request_error`

Model request hooks fire around each LLM call. `ModelRequestContext` bundles `model`, `messages`, `model_settings`, and `model_request_parameters`. To swap the model for a given request, set `request_context.model` to a different [`Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) instance.

To skip the model call entirely, raise [`SkipModelRequest(response)`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.SkipModelRequest) from `before_model_request` or `model_request` (wrap).

### Tool validation hooks

`hooks.on.`

Constructor kwarg

`AbstractCapability` method

`before_tool_validate`

`before_tool_validate=`

`before_tool_validate`

`after_tool_validate`

`after_tool_validate=`

`after_tool_validate`

`tool_validate`

`tool_validate=`

`wrap_tool_validate`

`tool_validate_error`

`tool_validate_error=`

`on_tool_validate_error`

Validation hooks fire when the model's JSON arguments are parsed and validated. All tool hooks receive `call` ([`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart)) and `tool_def` ([`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition)) parameters.

Note

Tool validation and execution hooks only fire for function tools. Internal output tools (used to deliver structured output) are not user-facing and are skipped.

To skip validation, raise [`SkipToolValidation(args)`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.SkipToolValidation) from `before_tool_validate` or `tool_validate` (wrap).

### Tool execution hooks

`hooks.on.`

Constructor kwarg

`AbstractCapability` method

`before_tool_execute`

`before_tool_execute=`

`before_tool_execute`

`after_tool_execute`

`after_tool_execute=`

`after_tool_execute`

`tool_execute`

`tool_execute=`

`wrap_tool_execute`

`tool_execute_error`

`tool_execute_error=`

`on_tool_execute_error`

Execution hooks fire when the tool function runs. `args` is always the validated `dict[str, Any]`.

To skip execution, raise [`SkipToolExecution(result)`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.SkipToolExecution) from `before_tool_execute` or `tool_execute` (wrap).

### Output validation hooks

`hooks.on.`

Constructor kwarg

`AbstractCapability` method

`before_output_validate`

`before_output_validate=`

`before_output_validate`

`after_output_validate`

`after_output_validate=`

`after_output_validate`

`output_validate`

`output_validate=`

`wrap_output_validate`

`output_validate_error`

`output_validate_error=`

`on_output_validate_error`

Output validation hooks fire when structured output is parsed against the output schema. They do **not** fire for plain text or image output. All output hooks receive an `output_context` ([`OutputContext`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.OutputContext)) parameter.

Note

During streaming, output **validation** hooks fire on every partial validation attempt as well as the final result. Output **processing** hooks fire only when partial validation succeeds, and on the final result. Check `ctx.partial_output` in your hooks to distinguish partial from final results and avoid expensive work on partials.

### Output processing hooks

`hooks.on.`

Constructor kwarg

`AbstractCapability` method

`before_output_process`

`before_output_process=`

`before_output_process`

`after_output_process`

`after_output_process=`

`after_output_process`

`output_process`

`output_process=`

`wrap_output_process`

`output_process_error`

`output_process_error=`

`on_output_process_error`

Output processing hooks fire when the output is processed -- extracting values, calling output functions, and running output validators.

See [Output hooks](/docs/ai/core-concepts/capabilities#output-hooks) for the full lifecycle, signatures, and details on how output validators interact with processing hooks.

### Tool preparation

`hooks.on.`

Constructor kwarg

`AbstractCapability` method

`prepare_tools`

`prepare_tools=`

`prepare_tools`

`prepare_output_tools`

`prepare_output_tools=`

`prepare_output_tools`

Filters or modifies tool definitions the model sees on each step.

`prepare_tools` handles **function** tools; `prepare_output_tools` handles [output tools](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.ToolOutput) separately, with `ctx.max_retries` reflecting the **output** retry budget. Both run as `PreparedToolset` wrappers -- the result flows into the model's request _and_ `ToolManager.tools`, so filtering also blocks tool execution.

### Deferred tool call hook

`hooks.on.`

Constructor kwarg

`AbstractCapability` method

`deferred_tool_calls`

`deferred_tool_calls=`

`handle_deferred_tool_calls`

Resolves [deferred tool calls](/docs/ai/tools-toolsets/deferred-tools) (approval-required or externally-executed) inline during a run. The hook receives a [`DeferredToolRequests`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolRequests) and returns a [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) (or `None` to decline). Multiple registered hooks accumulate: each receives the still-unresolved requests and can resolve some or all of them.

hooks\_deferred\_tool\_calls.py

```python
from pydantic_ai import Agent, DeferredToolRequests, DeferredToolResults, RunContext
from pydantic_ai.capabilities import Hooks

hooks = Hooks()


@hooks.on.deferred_tool_calls
async def auto_approve(
    ctx: RunContext[None], *, requests: DeferredToolRequests
) -> DeferredToolResults:
    return requests.build_results(approve_all=True)


agent = Agent('test', capabilities=[hooks])


@agent.tool_plain(requires_approval=True)
def delete_file(path: str) -> str:
    return f'File {path!r} deleted'
```

For pure application-level handler registration without other hooks, the dedicated [`HandleDeferredToolCalls`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.HandleDeferredToolCalls) capability is more concise -- see [Resolving deferred calls with a handler](/docs/ai/tools-toolsets/deferred-tools#resolving-deferred-calls-with-a-handler).

### Event stream hooks

`hooks.on.`

Constructor kwarg

`AbstractCapability` method

`run_event_stream`

`run_event_stream=`

`wrap_run_event_stream`

`event`

`event=`

_(per-event convenience)_

`run_event_stream` wraps the full event stream as an async generator. `event` is a convenience -- it fires for each individual event during a streamed run:

hooks\_event.py

```python
from pydantic_ai import Agent, AgentStreamEvent, RunContext
from pydantic_ai.capabilities import Hooks

hooks = Hooks()
event_count = 0


@hooks.on.event
async def count_events(ctx: RunContext[None], event: AgentStreamEvent) -> AgentStreamEvent:
    global event_count
    event_count += 1
    return event


agent = Agent('test', capabilities=[hooks])
```

## Tool hook filtering

Tool hooks (validation and execution) support a `tools` parameter to target specific tools by name:

hooks\_tool\_filter.py

```python
from typing import Any

from pydantic_ai import Agent, RunContext, ToolDefinition
from pydantic_ai.capabilities import Hooks
from pydantic_ai.messages import ToolCallPart

hooks = Hooks()
call_log: list[str] = []


@hooks.on.before_tool_execute(tools=['send_email'])
async def audit_dangerous_tools(
    ctx: RunContext[None],
    *,
    call: ToolCallPart,
    tool_def: ToolDefinition,
    args: dict[str, Any],
) -> dict[str, Any]:
    call_log.append(f'audit: {call.tool_name}')
    return args


agent = Agent('test', capabilities=[hooks])


@agent.tool_plain
def send_email(to: str) -> str:
    return f'sent to {to}'


result = agent.run_sync('Send an email to test@example.com')
print(call_log)
#> ['audit: send_email']
```

The `tools` parameter accepts a sequence of tool names. The hook only fires for matching tools -- other tool calls pass through unaffected.

## Timeouts

Each hook supports an optional `timeout` in seconds. If the hook exceeds the timeout, a [`HookTimeoutError`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.HookTimeoutError) is raised:

hooks\_timeout.py

```python
import asyncio

from pydantic_ai import Agent, ModelRequestContext, RunContext
from pydantic_ai.capabilities import Hooks, HookTimeoutError

hooks = Hooks()


@hooks.on.before_model_request(timeout=0.01)
async def slow_hook(
    ctx: RunContext[None], request_context: ModelRequestContext
) -> ModelRequestContext:
    await asyncio.sleep(10)  # Will be interrupted by timeout
    return request_context  # pragma: no cover


agent = Agent('test', capabilities=[hooks])
try:
    agent.run_sync('Hello')
except HookTimeoutError as e:
    print(f'Hook timed out: {e.hook_name} after {e.timeout}s')
    #> Hook timed out: before_model_request after 0.01s
```

Timeouts are set via the decorator parameter (`@hooks.on.before_model_request(timeout=5.0)`) or via the constructor when using kwargs.

## Wrap hooks

Wrap hooks let you surround an operation with setup/teardown logic. In the `hooks.on` namespace, wrap hooks drop the `wrap_` prefix -- `hooks.on.model_request` corresponds to `wrap_model_request`:

hooks\_wrap.py

```python
from pydantic_ai import Agent, ModelRequestContext, RunContext
from pydantic_ai.capabilities import Hooks, WrapModelRequestHandler
from pydantic_ai.messages import ModelResponse

hooks = Hooks()
wrap_log: list[str] = []


@hooks.on.model_request
async def log_request(
    ctx: RunContext[None], *, request_context: ModelRequestContext, handler: WrapModelRequestHandler
) -> ModelResponse:
    wrap_log.append('before')
    response = await handler(request_context)
    wrap_log.append('after')
    return response


agent = Agent('test', capabilities=[hooks])
result = agent.run_sync('Hello!')
print(wrap_log)
#> ['before', 'after']
```

## Hook ordering

When multiple hooks are registered for the same event (either on the same `Hooks` instance or across multiple capabilities):

-   **`before_*`** hooks fire in registration/capability order
-   **`after_*`** hooks fire in reverse order
-   **`wrap_*`** hooks nest as middleware -- the first registered hook is the outermost layer

See [Composition](/docs/ai/core-concepts/capabilities#composition) for details on how hooks from multiple capabilities interact.

## Error hooks

Error hooks (`*_error` in the `hooks.on` namespace, `on_*_error` on `AbstractCapability`) use **raise-to-propagate, return-to-recover** semantics:

-   **Raise the original error** -- propagates unchanged _(default)_
-   **Raise a different exception** -- transforms the error
-   **Return a result** -- suppresses the error

See [Error hooks](/docs/ai/core-concepts/capabilities#error-hooks) for the full pattern and recovery types.

## Triggering retries with `ModelRetry`

Hooks can raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to ask the model to try again with a custom message -- the same exception used in [tool functions](/docs/ai/tools-toolsets/tools#model-retry) and output validators.

**Model request hooks** (`after_model_request`, `wrap_model_request`, `on_model_request_error`):

-   The retry message is sent back to the model as a [`RetryPromptPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.RetryPromptPart)
-   `after_model_request`: the original response is preserved in message history so the model can see what it said
-   `wrap_model_request`: the response is preserved only if the handler was called
-   Retries count against the agent's `output_retries` limit

**Tool hooks** (`before/after_tool_validate`, `before/after_tool_execute`, `wrap_tool_execute`, `on_tool_execute_error`):

-   Converted to tool retry prompts, same as when a tool function raises `ModelRetry`
-   Retries count against the tool's `max_retries` limit

**Output hooks** (`before/after_output_validate`, `before/after_output_process`, `wrap_output_process`, `on_output_process_error`):

-   Converted to retry prompts, same as when an output function raises `ModelRetry`
-   For tool output, retries count against the tool's `max_retries` limit
-   For text output, retries count against the agent's `output_retries` limit

`ModelRetry` from `wrap_model_request`, `wrap_tool_execute`, and `wrap_output_process` is treated as control flow -- it bypasses the corresponding `on_*_error` hook.

hooks\_model\_retry.py

```python
from pydantic_ai import Agent, RunContext
from pydantic_ai.capabilities import Hooks
from pydantic_ai.exceptions import ModelRetry
from pydantic_ai.messages import ModelResponse
from pydantic_ai.models import ModelRequestContext

hooks = Hooks()


@hooks.on.after_model_request
async def check_response(
    ctx: RunContext[None],
    *,
    request_context: ModelRequestContext,
    response: ModelResponse,
) -> ModelResponse:
    if 'PLACEHOLDER' in str(response.parts):
        raise ModelRetry('Response contains placeholder text. Please provide real data.')
    return response


agent = Agent('test', capabilities=[hooks])
result = agent.run_sync('Hello')
print(result.output)
#> success (no tool calls)
```

## When to use `Hooks` vs `AbstractCapability`

Use [`Hooks`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Hooks)

Use [`AbstractCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability)

Application-level hooks (logging, metrics)

Reusable, packaged capabilities

Quick one-off interceptors

Combined tools + hooks + instructions + settings

No configuration state needed

Complex per-run state management

Single-file scripts

Multi-agent shared behavior

---

# [Messages and chat history](https://pydantic.dev/docs/ai/core-concepts/message-history/)

# Messages and chat history

Pydantic AI provides access to messages exchanged during an agent run. These messages can be used both to continue a coherent conversation, and to understand how an agent performed.

### Accessing Messages from Results

After running an agent, you can access the messages exchanged during that run from the `result` object.

Both [`RunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) (returned by [`Agent.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), [`Agent.run_sync`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_sync)) and [`StreamedRunResult`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult) (returned by [`Agent.run_stream`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream)) have the following methods:

-   [`all_messages()`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.all_messages): returns all messages, including messages from prior runs. There's also a variant that returns JSON bytes, [`all_messages_json()`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.all_messages_json).
-   [`new_messages()`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.new_messages): returns only the messages from the current run. There's also a variant that returns JSON bytes, [`new_messages_json()`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.new_messages_json).

StreamedRunResult and complete messages

On [`StreamedRunResult`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult), the messages returned from these methods will only include the final result message once the stream has finished.

E.g. you've awaited one of the following coroutines:

-   [`StreamedRunResult.stream_output()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_output)
-   [`StreamedRunResult.stream_text()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_text)
-   [`StreamedRunResult.stream_responses()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_responses)
-   [`StreamedRunResult.get_output()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.get_output)

**Note:** The final result message will NOT be added to result messages if you use [`.stream_text(delta=True)`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_text) since in this case the result content is never built as one string.

Example of accessing methods on a [`RunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) :

run\_result\_messages.py

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', instructions='Be a helpful assistant.')

result = agent.run_sync('Tell me a joke.')
print(result.output)
#> Did you hear about the toothpaste scandal? They called it Colgate.

# all messages from the run
print(result.all_messages())
"""
[
    ModelRequest(
        parts=[
            UserPromptPart(
                content='Tell me a joke.',
                timestamp=datetime.datetime(...),
            )
        ],
        timestamp=datetime.datetime(...),
        instructions='Be a helpful assistant.',
        run_id='...',
        conversation_id='...',
    ),
    ModelResponse(
        parts=[
            TextPart(
                content='Did you hear about the toothpaste scandal? They called it Colgate.'
            )
        ],
        usage=RequestUsage(input_tokens=55, output_tokens=12),
        model_name='gpt-5.2',
        timestamp=datetime.datetime(...),
        run_id='...',
        conversation_id='...',
    ),
]
"""
```

_(This example is complete, it can be run "as is")_

Example of accessing methods on a [`StreamedRunResult`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult) :

streamed\_run\_result\_messages.py

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', instructions='Be a helpful assistant.')


async def main():
    async with agent.run_stream('Tell me a joke.') as result:
        # incomplete messages before the stream finishes
        print(result.all_messages())
        """
        [
            ModelRequest(
                parts=[
                    UserPromptPart(
                        content='Tell me a joke.',
                        timestamp=datetime.datetime(...),
                    )
                ],
                timestamp=datetime.datetime(...),
                instructions='Be a helpful assistant.',
                run_id='...',
                conversation_id='...',
            )
        ]
        """

        async for text in result.stream_text():
            print(text)
            #> Did you hear
            #> Did you hear about the toothpaste
            #> Did you hear about the toothpaste scandal? They called
            #> Did you hear about the toothpaste scandal? They called it Colgate.

        # complete messages once the stream finishes
        print(result.all_messages())
        """
        [
            ModelRequest(
                parts=[
                    UserPromptPart(
                        content='Tell me a joke.',
                        timestamp=datetime.datetime(...),
                    )
                ],
                timestamp=datetime.datetime(...),
                instructions='Be a helpful assistant.',
                run_id='...',
                conversation_id='...',
            ),
            ModelResponse(
                parts=[
                    TextPart(
                        content='Did you hear about the toothpaste scandal? They called it Colgate.'
                    )
                ],
                usage=RequestUsage(input_tokens=50, output_tokens=12),
                model_name='gpt-5.2',
                timestamp=datetime.datetime(...),
                run_id='...',
                conversation_id='...',
            ),
        ]
        """
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

### Using Messages as Input for Further Agent Runs

The primary use of message histories in Pydantic AI is to maintain context across multiple agent runs.

To use existing messages in a run, pass them to the `message_history` parameter of [`Agent.run`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), [`Agent.run_sync`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_sync) or [`Agent.run_stream`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream).

If `message_history` is set and not empty, a new system prompt is not generated -- we assume the existing message history includes a system prompt. If your history comes from a source that doesn't round-trip system prompts (a UI frontend, a database that didn't persist them, a compaction pipeline), add the [`ReinjectSystemPrompt`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ReinjectSystemPrompt) capability so the agent's configured `system_prompt` is reinjected at the head of the first request when it's missing.

Reusing messages in a conversation

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', instructions='Be a helpful assistant.')

result1 = agent.run_sync('Tell me a joke.')
print(result1.output)
#> Did you hear about the toothpaste scandal? They called it Colgate.

result2 = agent.run_sync('Explain?', message_history=result1.new_messages())
print(result2.output)
#> This is an excellent joke invented by Samuel Colvin, it needs no explanation.

print(result2.all_messages())
"""
[
    ModelRequest(
        parts=[
            UserPromptPart(
                content='Tell me a joke.',
                timestamp=datetime.datetime(...),
            )
        ],
        timestamp=datetime.datetime(...),
        instructions='Be a helpful assistant.',
        run_id='...',
        conversation_id='...',
    ),
    ModelResponse(
        parts=[
            TextPart(
                content='Did you hear about the toothpaste scandal? They called it Colgate.'
            )
        ],
        usage=RequestUsage(input_tokens=55, output_tokens=12),
        model_name='gpt-5.2',
        timestamp=datetime.datetime(...),
        run_id='...',
        conversation_id='...',
    ),
    ModelRequest(
        parts=[
            UserPromptPart(
                content='Explain?',
                timestamp=datetime.datetime(...),
            )
        ],
        timestamp=datetime.datetime(...),
        instructions='Be a helpful assistant.',
        run_id='...',
        conversation_id='...',
    ),
    ModelResponse(
        parts=[
            TextPart(
                content='This is an excellent joke invented by Samuel Colvin, it needs no explanation.'
            )
        ],
        usage=RequestUsage(input_tokens=56, output_tokens=26),
        model_name='gpt-5.2',
        timestamp=datetime.datetime(...),
        run_id='...',
        conversation_id='...',
    ),
]
"""
```

_(This example is complete, it can be run "as is")_

### Correlating runs with `conversation_id`

Each `ModelRequest` and `ModelResponse` carries two identifiers:

-   [`run_id`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelRequest.run_id) -- unique per agent run; emitted on the OpenTelemetry agent run span as `gen_ai.agent.call.id`.
-   [`conversation_id`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelRequest.conversation_id) -- shared across all runs that build on the same `message_history`; emitted as `gen_ai.conversation.id`.

A fresh `conversation_id` is generated on the first run, stamped onto every message produced by that run, and inherited by subsequent runs that pass the messages back via `message_history`. This means you can correlate traces from a multi-turn conversation in [Logfire](/docs/ai/integrations/logfire) (or any OpenTelemetry backend) without tracking anything yourself -- as long as the message history round-trips, the conversation ID does too.

conversation\_id is shared across runs in the same conversation

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

result1 = agent.run_sync('Tell me a joke.')
result2 = agent.run_sync('Explain?', message_history=result1.all_messages())

assert result1.conversation_id == result2.conversation_id
```

To override or fork:

-   Pass `conversation_id='<your-id>'` to use an ID from your own application (e.g. a chat thread ID stored in your database).
-   Pass `conversation_id='new'` to start a fresh conversation that ignores any `conversation_id` already on `message_history` -- useful for branching off an existing thread without making the caller generate an ID.

forking a conversation

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

result1 = agent.run_sync('Tell me a joke.')
forked = agent.run_sync(
    'Tell me a different joke.',
    message_history=result1.all_messages(),
    conversation_id='new',
)

assert forked.conversation_id != result1.conversation_id
```

The [UI adapters](/docs/ai/integrations/ui/overview) auto-populate `conversation_id` from the protocol's own thread/chat ID, so frontends using these protocols get correlation for free.

## Storing and loading messages (to JSON)

While maintaining conversation state in memory is enough for many applications, often times you may want to store the messages history of an agent run on disk or in a database. This might be for evals, for sharing data between Python and JavaScript/TypeScript, or any number of other use cases.

The intended way to do this is using a `TypeAdapter`.

We export [`ModelMessagesTypeAdapter`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessagesTypeAdapter) that can be used for this, or you can create your own.

Here's an example showing how:

serialize messages to json

```python
from pydantic_core import to_jsonable_python

from pydantic_ai import (
  Agent,
  ModelMessagesTypeAdapter,  # (1)
)

agent = Agent('openai:gpt-5.2', instructions='Be a helpful assistant.')

result1 = agent.run_sync('Tell me a joke.')
history_step_1 = result1.all_messages()
as_python_objects = to_jsonable_python(history_step_1)  # (2)
same_history_as_step_1 = ModelMessagesTypeAdapter.validate_python(as_python_objects)

result2 = agent.run_sync(  # (3)
  'Tell me a different joke.', message_history=same_history_as_step_1
)
```

Alternatively, you can create a `TypeAdapter` from scratch:

```python
from pydantic import TypeAdapter
from pydantic_ai import ModelMessage
ModelMessagesTypeAdapter = TypeAdapter(list[ModelMessage])
```

Alternatively you can serialize to/from JSON directly:

```python
from pydantic_core import to_json
...
as_json_objects = to_json(history_step_1)
same_history_as_step_1 = ModelMessagesTypeAdapter.validate_json(as_json_objects)
```

You can now continue the conversation with history `same_history_as_step_1` despite creating a new agent run.

_(This example is complete, it can be run "as is")_

## Other ways of using messages

Since messages are defined by simple dataclasses, you can manually create and manipulate, e.g. for testing.

The message format is independent of the model used, so you can use messages in different agents, or the same agent with different models.

In the example below, we reuse the message from the first agent run, which uses the `openai:gpt-5.2` model, in a second agent run using the `google-gla:gemini-3-pro-preview` model.

Reusing messages with a different model

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', instructions='Be a helpful assistant.')

result1 = agent.run_sync('Tell me a joke.')
print(result1.output)
#> Did you hear about the toothpaste scandal? They called it Colgate.

result2 = agent.run_sync(
    'Explain?',
    model='google-gla:gemini-3-pro-preview',
    message_history=result1.new_messages(),
)
print(result2.output)
#> This is an excellent joke invented by Samuel Colvin, it needs no explanation.

print(result2.all_messages())
"""
[
    ModelRequest(
        parts=[
            UserPromptPart(
                content='Tell me a joke.',
                timestamp=datetime.datetime(...),
            )
        ],
        timestamp=datetime.datetime(...),
        instructions='Be a helpful assistant.',
        run_id='...',
        conversation_id='...',
    ),
    ModelResponse(
        parts=[
            TextPart(
                content='Did you hear about the toothpaste scandal? They called it Colgate.'
            )
        ],
        usage=RequestUsage(input_tokens=55, output_tokens=12),
        model_name='gpt-5.2',
        timestamp=datetime.datetime(...),
        run_id='...',
        conversation_id='...',
    ),
    ModelRequest(
        parts=[
            UserPromptPart(
                content='Explain?',
                timestamp=datetime.datetime(...),
            )
        ],
        timestamp=datetime.datetime(...),
        instructions='Be a helpful assistant.',
        run_id='...',
        conversation_id='...',
    ),
    ModelResponse(
        parts=[
            TextPart(
                content='This is an excellent joke invented by Samuel Colvin, it needs no explanation.'
            )
        ],
        usage=RequestUsage(input_tokens=56, output_tokens=26),
        model_name='gemini-3-pro-preview',
        timestamp=datetime.datetime(...),
        run_id='...',
        conversation_id='...',
    ),
]
"""
```

## Processing Message History

Sometimes you may want to modify the message history before it's sent to the model. This could be for privacy reasons (filtering out sensitive information), to save costs on tokens, to give less context to the LLM, or custom processing logic.

Pydantic AI provides a `history_processors` parameter on `Agent` that allows you to intercept and modify the message history before each model request. History processors can also be provided via the [`ProcessHistory`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ProcessHistory) capability.

History processors replace the message history

History processors replace the message history in the state with the processed messages, including the new user prompt part. This means that if you want to keep the original message history, you need to make a copy of it.

History processors can affect `new_messages()` results

[`new_messages()`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.new_messages) returns the messages produced during the current run. Messages provided via `message_history` are excluded -- including the trailing `ModelRequest` when resuming without a user prompt, even though the framework may stamp it with the current run's `run_id` for observability.

To keep this working when your processor mutates or adds messages:

-   If you rebuild the trailing `ModelRequest`, preserve its `parts`, `timestamp`, `instructions`, and `metadata` so it can still be identified as prior context.
-   If you insert a new message that should appear in `new_messages()`, use a [context-aware processor](#runcontext-parameter) and set `run_id=ctx.run_id` on it.

### Usage

The `history_processors` is a list of callables that take a list of [`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage) and return a modified list of the same type.

Each processor is applied in sequence, and processors can be either synchronous or asynchronous.

simple\_history\_processor.py

```python
from pydantic_ai import (
    Agent,
    ModelMessage,
    ModelRequest,
    ModelResponse,
    TextPart,
    UserPromptPart,
)


def filter_responses(messages: list[ModelMessage]) -> list[ModelMessage]:
    """Remove all ModelResponse messages, keeping only ModelRequest messages."""
    return [msg for msg in messages if isinstance(msg, ModelRequest)]

# Create agent with history processor
agent = Agent('openai:gpt-5.2', history_processors=[filter_responses])

# Example: Create some conversation history
message_history = [
    ModelRequest(parts=[UserPromptPart(content='What is 2+2?')]),
    ModelResponse(parts=[TextPart(content='2+2 equals 4')]),  # This will be filtered out
]

# When you run the agent, the history processor will filter out ModelResponse messages
# result = agent.run_sync('What about 3+3?', message_history=message_history)
```

#### Keep Only Recent Messages

You can use the `history_processor` to only keep the recent messages:

keep\_recent\_messages.py

```python
from pydantic_ai import Agent, ModelMessage


async def keep_recent_messages(messages: list[ModelMessage]) -> list[ModelMessage]:
    """Keep only the last 5 messages to manage token usage."""
    return messages[-5:] if len(messages) > 5 else messages

agent = Agent('openai:gpt-5.2', history_processors=[keep_recent_messages])

# Example: Even with a long conversation history, only the last 5 messages are sent to the model
long_conversation_history: list[ModelMessage] = []  # Your long conversation history here
# result = agent.run_sync('What did we discuss?', message_history=long_conversation_history)
```

Be careful when slicing the message history

When slicing the message history, you need to make sure that tool calls and returns are paired, otherwise the LLM may return an error. For more details, refer to [this GitHub issue](https://github.com/pydantic/pydantic-ai/issues/2050#issuecomment-3019976269).

#### `RunContext` parameter

History processors can optionally accept a [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) parameter to access additional information about the current run, such as dependencies, model information, and usage statistics:

context\_aware\_processor.py

```python
from pydantic_ai import Agent, ModelMessage, RunContext


def context_aware_processor(
    ctx: RunContext[None],
    messages: list[ModelMessage],
) -> list[ModelMessage]:
    # Access current usage
    current_tokens = ctx.usage.total_tokens

    # Filter messages based on context
    if current_tokens > 1000:
        return messages[-3:]  # Keep only recent messages when token usage is high
    return messages

agent = Agent('openai:gpt-5.2', history_processors=[context_aware_processor])
```

This allows for more sophisticated message processing based on the current state of the agent run.

#### Summarize Old Messages

Use an LLM to summarize older messages to preserve context while reducing tokens.

summarize\_old\_messages.py

```python
from pydantic_ai import Agent, ModelMessage

# Use a cheaper model to summarize old messages.
summarize_agent = Agent(
    'openai:gpt-5-mini',
    instructions="""
Summarize this conversation, omitting small talk and unrelated topics.
Focus on the technical discussion and next steps.
""",
)


async def summarize_old_messages(messages: list[ModelMessage]) -> list[ModelMessage]:
    # Summarize the oldest 10 messages
    if len(messages) > 10:
        oldest_messages = messages[:10]
        summary = await summarize_agent.run(message_history=oldest_messages)
        # Return the last message and the summary
        return summary.new_messages() + messages[-1:]

    return messages


agent = Agent('openai:gpt-5.2', history_processors=[summarize_old_messages])
```

Be careful when summarizing the message history

When summarizing the message history, you need to make sure that tool calls and returns are paired, otherwise the LLM may return an error. For more details, refer to [this GitHub issue](https://github.com/pydantic/pydantic-ai/issues/2050#issuecomment-3019976269), where you can find examples of summarizing the message history.

### Testing History Processors

You can test what messages are actually sent to the model provider using [`FunctionModel`](/docs/ai/api/models/function/#pydantic_ai.models.function.FunctionModel):

test\_history\_processor.py

```python
import pytest

from pydantic_ai import (
    Agent,
    ModelMessage,
    ModelRequest,
    ModelResponse,
    TextPart,
    UserPromptPart,
)
from pydantic_ai.models.function import AgentInfo, FunctionModel


@pytest.fixture
def received_messages() -> list[ModelMessage]:
    return []


@pytest.fixture
def function_model(received_messages: list[ModelMessage]) -> FunctionModel:
    def capture_model_function(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
        # Capture the messages that the provider actually receives
        received_messages.clear()
        received_messages.extend(messages)
        return ModelResponse(parts=[TextPart(content='Provider response')])

    return FunctionModel(capture_model_function)


def test_history_processor(function_model: FunctionModel, received_messages: list[ModelMessage]):
    def filter_responses(messages: list[ModelMessage]) -> list[ModelMessage]:
        return [msg for msg in messages if isinstance(msg, ModelRequest)]

    agent = Agent(function_model, history_processors=[filter_responses])

    message_history = [
        ModelRequest(parts=[UserPromptPart(content='Question 1')]),
        ModelResponse(parts=[TextPart(content='Answer 1')]),
    ]

    agent.run_sync('Question 2', message_history=message_history)
    assert received_messages == [
        ModelRequest(parts=[UserPromptPart(content='Question 1')]),
        ModelRequest(parts=[UserPromptPart(content='Question 2')]),
    ]
```

### Multiple Processors

You can also use multiple processors:

multiple\_history\_processors.py

```python
from pydantic_ai import Agent, ModelMessage, ModelRequest


def filter_responses(messages: list[ModelMessage]) -> list[ModelMessage]:
    return [msg for msg in messages if isinstance(msg, ModelRequest)]


def summarize_old_messages(messages: list[ModelMessage]) -> list[ModelMessage]:
    return messages[-5:]


agent = Agent('openai:gpt-5.2', history_processors=[filter_responses, summarize_old_messages])
```

In this case, the `filter_responses` processor will be applied first, and the `summarize_old_messages` processor will be applied second.

## Examples

For a more complete example of using messages in conversations, see the [chat app](/docs/ai/examples/chat-app) example.

---

# [Output](https://pydantic.dev/docs/ai/core-concepts/output/)

# Output

"Output" refers to the final value returned from [running an agent](/docs/ai/core-concepts/agent#running-agents). This can be either plain text, [structured data](#structured-output), an [image](#image-output), or the result of a [function](#output-functions) called with arguments provided by the model.

The output is wrapped in [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) or [`StreamedRunResult`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult) so that you can access other data, like [usage](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) of the run and [message history](/docs/ai/core-concepts/message-history#accessing-messages-from-results).

Both `AgentRunResult` and `StreamedRunResult` are generic in the data they wrap, so typing information about the data returned by the agent is preserved.

A run ends when the model responds with one of the output types, or, if no output type is specified or `str` is one of the allowed options, when a plain text response is received. A run can also be cancelled if usage limits are exceeded, see [Usage Limits](/docs/ai/core-concepts/agent#usage-limits).

Here's an example using a Pydantic model as the `output_type`, forcing the model to respond with data matching our specification:

olympics.py

```python
from pydantic import BaseModel

from pydantic_ai import Agent


class CityLocation(BaseModel):
    city: str
    country: str


agent = Agent('google-gla:gemini-3-flash-preview', output_type=CityLocation)
result = agent.run_sync('Where were the olympics held in 2012?')
print(result.output)
#> city='London' country='United Kingdom'
print(result.usage())
#> RunUsage(input_tokens=57, output_tokens=8, requests=1)
```

_(This example is complete, it can be run "as is")_

## Structured output data

The [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) class constructor takes an `output_type` argument that takes one or more types or [output functions](#output-functions). It supports simple scalar types, list and dict types (including `TypedDict`s and [`StructuredDict`s](#structured-dict)), dataclasses and Pydantic models, as well as type unions -- generally everything supported as type hints in a Pydantic model. You can also pass a list of multiple choices.

By default, Pydantic AI leverages the model's tool calling capability to make it return structured data. When multiple output types are specified (in a union or list), each member is registered with the model as a separate output tool in order to reduce the complexity of the schema and maximise the chances a model will respond correctly. This has been shown to work well across a wide range of models. If you'd like to change the names of the output tools, use a model's native structured output feature, or pass the output schema to the model in its [instructions](/docs/ai/core-concepts/agent#instructions), you can use an [output mode](#output-modes) marker class.

When no output type is specified, or when `str` is among the output types, any plain text response from the model will be used as the output data. If `str` is not among the output types, the model is forced to return structured data or call an output function.

If the output type schema is not of type `"object"` (e.g. it's `int` or `list[int]`), the output type is wrapped in a single element object, so the schema of all tools registered with the model are object schemas.

Structured outputs (like tools) use Pydantic to build the JSON schema used for the tool, and to validate the data returned by the model.

Type checking considerations

The Agent class is generic in its output type, and this type is carried through to `AgentRunResult.output` and `StreamedRunResult.output` so that your IDE or static type checker can warn you when your code doesn't properly take into account all the possible values those outputs could have.

Static type checkers like pyright and mypy will do their best to infer the agent's output type from the `output_type` you've specified, but they're not always able to do so correctly when you provide functions or multiple types in a union or list, even though Pydantic AI will behave correctly. When this happens, your type checker will complain even when you're confident you've passed a valid `output_type`, and you'll need to help the type checker by explicitly specifying the generic parameters on the `Agent` constructor. This is shown in the second example below and the output functions example further down.

Specifically, there are three valid uses of `output_type` where you'll need to do this:

1.  When using a union of types, e.g. `output_type=Foo | Bar`. Until [PEP-747](https://peps.python.org/pep-0747/) "Annotating Type Forms" lands in Python 3.15, type checkers do not consider these a valid value for `output_type`. In addition to the generic parameters on the `Agent` constructor, you'll need to add `# type: ignore` to the line that passes the union to `output_type`. Alternatively, you can use a list: `output_type=[Foo, Bar]`.
2.  With mypy: When using a list, as a functionally equivalent alternative to a union, or because you're passing in [output functions](#output-functions). Pyright does handle this correctly, and we've filed [an issue](https://github.com/python/mypy/issues/19142) with mypy to try and get this fixed.
3.  With mypy: when using an async output function. Pyright does handle this correctly, and we've filed [an issue](https://github.com/python/mypy/issues/19143) with mypy to try and get this fixed.

Here's an example of returning either text or structured data:

box\_or\_error.py

```python
from pydantic import BaseModel

from pydantic_ai import Agent


class Box(BaseModel):
  width: int
  height: int
  depth: int
  units: str


agent = Agent(
  'openai:gpt-5-mini',
  output_type=[Box, str], # (1)
  instructions=(
      "Extract me the dimensions of a box, "
      "if you can't extract all data, ask the user to try again."
  ),
)

result = agent.run_sync('The box is 10x20x30')
print(result.output)
#> Please provide the units for the dimensions (e.g., cm, in, m).

result = agent.run_sync('The box is 10x20x30 cm')
print(result.output)
#> width=10 height=20 depth=30 units='cm'
```

This could also have been a union: `output_type=Box | str`. However, as explained in the "Type checking considerations" section above, that would've required explicitly specifying the generic parameters on the `Agent` constructor and adding `# type: ignore` to this line in order to be type checked correctly.

_(This example is complete, it can be run "as is")_

Here's an example of using a union return type, which will register multiple output tools and wrap non-object schemas in an object:

colors\_or\_sizes.py

```python
from pydantic_ai import Agent

agent = Agent[None, list[str] | list[int]](
  'openai:gpt-5-mini',
  output_type=list[str] | list[int],  # type: ignore # (1)
  instructions='Extract either colors or sizes from the shapes provided.',
)

result = agent.run_sync('red square, blue circle, green triangle')
print(result.output)
#> ['red', 'blue', 'green']

result = agent.run_sync('square size 10, circle size 20, triangle size 30')
print(result.output)
#> [10, 20, 30]
```

As explained in the "Type checking considerations" section above, using a union rather than a list requires explicitly specifying the generic parameters on the `Agent` constructor and adding `# type: ignore` to this line in order to be type checked correctly.

_(This example is complete, it can be run "as is")_

### Output functions

Instead of plain text or structured data, you may want the output of your agent run to be the result of a function called with arguments provided by the model, for example to further process or validate the data provided through the arguments (with the option to tell the model to try again), or to hand off to another agent.

Output functions are similar to [function tools](/docs/ai/tools-toolsets/tools), but the model is forced to call one of them, the call ends the agent run, and the result is not passed back to the model.

As with tool functions, output function arguments provided by the model are validated using Pydantic (with optional [validation context](#validation-context)), can optionally take [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as the first argument, and can raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to ask the model to try again with modified arguments (or with a different output type).

To specify output functions, you set the agent's `output_type` to either a single function (or bound instance method), or a list of functions. The list can also contain other output types like simple scalars or entire Pydantic models. You typically do not want to also register your output function as a tool (using the `@agent.tool` decorator or `tools` argument), as this could confuse the model about which it should be calling.

Here's an example of all of these features in action:

output\_functions.py

```python
import re

from pydantic import BaseModel

from pydantic_ai import Agent, ModelRetry, RunContext, UnexpectedModelBehavior


class Row(BaseModel):
    name: str
    country: str


tables = {
    'capital_cities': [
        Row(name='Amsterdam', country='Netherlands'),
        Row(name='Mexico City', country='Mexico'),
    ]
}


class SQLFailure(BaseModel):
    """An unrecoverable failure. Only use this when you can't change the query to make it work."""

    explanation: str


def run_sql_query(query: str) -> list[Row]:
    """Run a SQL query on the database."""

    select_table = re.match(r'SELECT (.+) FROM (\w+)', query)
    if select_table:
        column_names = select_table.group(1)
        if column_names != '*':
            raise ModelRetry("Only 'SELECT *' is supported, you'll have to do column filtering manually.")

        table_name = select_table.group(2)
        if table_name not in tables:
            raise ModelRetry(
                f"Unknown table '{table_name}' in query '{query}'. Available tables: {', '.join(tables.keys())}."
            )

        return tables[table_name]

    raise ModelRetry(f"Unsupported query: '{query}'.")


sql_agent = Agent[None, list[Row] | SQLFailure](
    'openai:gpt-5.2',
    output_type=[run_sql_query, SQLFailure],
    instructions='You are a SQL agent that can run SQL queries on a database.',
)


async def hand_off_to_sql_agent(ctx: RunContext, query: str) -> list[Row]:
    """I take natural language queries, turn them into SQL, and run them on a database."""

    # Drop the final message with the output tool call, as it shouldn't be passed on to the SQL agent
    messages = ctx.messages[:-1]
    try:
        result = await sql_agent.run(query, message_history=messages)
        output = result.output
        if isinstance(output, SQLFailure):
            raise ModelRetry(f'SQL agent failed: {output.explanation}')
        return output
    except UnexpectedModelBehavior as e:
        # Bubble up potentially retryable errors to the router agent
        if (cause := e.__cause__) and isinstance(cause, ModelRetry):
            raise ModelRetry(f'SQL agent failed: {cause.message}') from e
        else:
            raise


class RouterFailure(BaseModel):
    """Use me when no appropriate agent is found or the used agent failed."""

    explanation: str


router_agent = Agent[None, list[Row] | RouterFailure](
    'openai:gpt-5.2',
    output_type=[hand_off_to_sql_agent, RouterFailure],
    instructions='You are a router to other agents. Never try to solve a problem yourself, just pass it on.',
)

result = router_agent.run_sync('Select the names and countries of all capitals')
print(result.output)
"""
[
    Row(name='Amsterdam', country='Netherlands'),
    Row(name='Mexico City', country='Mexico'),
]
"""

result = router_agent.run_sync('Select all pets')
print(repr(result.output))
"""
RouterFailure(explanation="The requested table 'pets' does not exist in the database. The only available table is 'capital_cities', which does not contain data about pets.")
"""

result = router_agent.run_sync('How do I fly from Amsterdam to Mexico City?')
print(repr(result.output))
"""
RouterFailure(explanation='I am not equipped to provide travel information, such as flights from Amsterdam to Mexico City.')
"""
```

#### Text output

If you provide an output function that takes a string, Pydantic AI will by default create an output tool like for any other output function. If instead you'd like the model to provide the string using plain text output, you can wrap the function in the [`TextOutput`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.TextOutput) marker class.

If desired, this marker class can be used alongside one or more [`ToolOutput`](#tool-output) marker classes (or unmarked types or functions) in a list provided to `output_type`.

Like other output functions, text output functions can optionally take [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as the first argument, and can raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to ask the model to try again with modified arguments (or with a different output type).

text\_output\_function.py

```python
from pydantic_ai import Agent, TextOutput


def split_into_words(text: str) -> list[str]:
    return text.split()


agent = Agent(
    'openai:gpt-5.2',
    output_type=TextOutput(split_into_words),
)
result = agent.run_sync('Who was Albert Einstein?')
print(result.output)
#> ['Albert', 'Einstein', 'was', 'a', 'German-born', 'theoretical', 'physicist.']
```

_(This example is complete, it can be run "as is")_

#### Handling partial output in output functions

When streaming with `run_stream()` or `run_stream_sync()`, output functions are called **multiple times** -- once for each partial output received from the model, and once for the final complete output.

You should check the [`RunContext.partial_output`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.partial_output) flag when your output function has **side effects** (e.g., sending notifications, logging, database updates) that should only execute on the final output.

When streaming, `partial_output` is `True` for each partial output and `False` for the final complete output. For all [other run methods](/docs/ai/core-concepts/agent#running-agents), `partial_output` is always `False` as the function is only called once with the complete output.

output\_function\_with\_side\_effects.py

```python
from pydantic import BaseModel

from pydantic_ai import Agent, RunContext


class DatabaseRecord(BaseModel):
    name: str
    value: int | None = None  # Make optional to allow partial output


def save_to_database(ctx: RunContext, record: DatabaseRecord) -> DatabaseRecord:
    """Output function with side effect - only save final output to database."""
    if ctx.partial_output:
        # Skip side effects for partial outputs
        return record

    # Only execute side effect for the final output
    print(f'Saving to database: {record.name} = {record.value}')
    #> Saving to database: test = 42
    return record


agent = Agent('openai:gpt-5.2', output_type=save_to_database)


async def main():
    async with agent.run_stream('Create a record with name "test" and value 42') as result:
        async for output in result.stream_output(debounce_by=None):
            print(output)
            #> name='test' value=None
            #> name='test' value=42
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

### Output modes

Pydantic AI implements three different methods to get a model to output structured data:

1.  [Tool Output](#tool-output), where tool calls are used to produce the output.
2.  [Native Output](#native-output), where the model is required to produce text content compliant with a provided JSON schema.
3.  [Prompted Output](#prompted-output), where a prompt is injected into the model instructions including the desired JSON schema, and we attempt to parse the model's plain-text response as appropriate.

#### Tool Output

In the default Tool Output mode, the output JSON schema of each output type (or function) is provided to the model as the parameters schema of a special output tool. This is the default as it's supported by virtually all models and has been shown to work very well.

If you'd like to change the name of the output tool, pass a custom description to aid the model, or turn on or off strict mode, you can wrap the type(s) in the [`ToolOutput`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.ToolOutput) marker class and provide the appropriate arguments. Note that by default, the description is taken from the docstring specified on a Pydantic model or output function, so specifying it using the marker class is typically not necessary.

When using output tools, each tool gets its own retry counter -- the agent-level `output_retries` (set on `Agent(output_retries=N)` or per-run via `agent.run(output_retries=N)`) is the _default per-tool limit_. To override the limit for an individual output tool, pass [`max_retries`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.ToolOutput.max_retries) on `ToolOutput`: `ToolOutput(Fruit, max_retries=2)`. See [How output retries are enforced](/docs/ai/core-concepts/agent#how-output-retries-are-enforced) for the relationship to the text-output path's global budget.

To dynamically modify or filter the available output tools during an agent run, you can define an agent-wide `prepare_output_tools` function that will be called ahead of each step of a run. This function should be of type [`ToolsPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolsPrepareFunc), which takes the [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and a list of [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition), and returns a new list of tool definitions (or `None` to disable all tools for that step). This is analogous to the [`prepare_tools` function](/docs/ai/tools-toolsets/tools-advanced#prepare-tools) for non-output tools.

tool\_output.py

```python
from pydantic import BaseModel

from pydantic_ai import Agent, ToolOutput


class Fruit(BaseModel):
  name: str
  color: str


class Vehicle(BaseModel):
  name: str
  wheels: int


agent = Agent(
  'openai:gpt-5.2',
  output_type=[ # (1)
      ToolOutput(Fruit, name='return_fruit'),
      ToolOutput(Vehicle, name='return_vehicle'),
  ],
)
result = agent.run_sync('What is a banana?')
print(repr(result.output))
#> Fruit(name='banana', color='yellow')
```

If we were passing just `Fruit` and `Vehicle` without custom tool names, we could have used a union: `output_type=Fruit | Vehicle`. However, as `ToolOutput` is an object rather than a type, we have to use a list.

_(This example is complete, it can be run "as is")_

##### Parallel Output Tool Calls

When the model calls other tools in parallel with an output tool, you can control how tool calls are executed by setting the agent's [`end_strategy`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.end_strategy):

-   `'early'` (default): Output tools are executed first. Once a valid final result is found, remaining function and output tool calls are skipped
-   `'graceful'`: Output tools are executed first. Once a valid final result is found, remaining output tool calls are skipped, but function tools are still executed
-   `'exhaustive'`: Output tools are executed first, then all function tools are executed. The first valid output tool result becomes the final output

Strategy

Function tools

Output tools

`'early'` (default)

Skip remaining

Skip remaining

`'graceful'`

Execute all

Skip remaining

`'exhaustive'`

Execute all

Execute all (first valid result wins)

The `'graceful'` and `'exhaustive'` strategies are useful when function tools have important side effects (like logging, sending notifications, or updating metrics) that should always execute. Use `'graceful'` over `'exhaustive'` when you want to avoid executing additional output tools unnecessarily -- for example, when output tools have side effects that should only fire once.

Priority of output and deferred tools in streaming methods

The [`run_stream()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream) and [`run_stream_sync()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream_sync) methods will consider the first output that matches the [output type](/docs/ai/core-concepts/output#structured-output) (which could be text, an [output tool](/docs/ai/core-concepts/output#tool-output) call, or a [deferred](/docs/ai/tools-toolsets/deferred-tools) tool call) to be the final output of the agent run, even when the model generates (additional) tool calls after this "final" output.

This means that if the model calls deferred tools before output tools when using these methods, the deferred tool calls determine the agent run's final output, while the other [run methods](/docs/ai/core-concepts/agent#running-agents) would have prioritized the tool output.

#### Native Output

Native Output mode uses a model's native "Structured Outputs" feature (aka "JSON Schema response format"), where the model is forced to only output text matching the provided JSON schema. Note that this is not supported by all models, and sometimes comes with restrictions. For example, Gemini cannot use tools at the same time as structured output, and attempting to do so will result in an error.

To use this mode, you can wrap the output type(s) in the [`NativeOutput`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.NativeOutput) marker class that also lets you specify a `name` and `description` if the name and docstring of the type or function are not sufficient.

native\_output.py

```python
from pydantic_ai import Agent, NativeOutput

from tool_output import Fruit, Vehicle

agent = Agent(
  'openai:gpt-5.2',
  output_type=NativeOutput(
      [Fruit, Vehicle], # (1)
      name='Fruit_or_vehicle',
      description='Return a fruit or vehicle.'
  ),
)
result = agent.run_sync('What is a Ford Explorer?')
print(repr(result.output))
#> Vehicle(name='Ford Explorer', wheels=4)
```

This could also have been a union: `output_type=Fruit | Vehicle`. However, as explained in the "Type checking considerations" section above, that would've required explicitly specifying the generic parameters on the `Agent` constructor and adding `# type: ignore` to this line in order to be type checked correctly.

_(This example is complete, it can be run "as is")_

#### Prompted Output

In this mode, the model is prompted to output text matching the provided JSON schema through its [instructions](/docs/ai/core-concepts/agent#instructions) and it's up to the model to interpret those instructions correctly. This is usable with all models, but is often the least reliable approach as the model is not forced to match the schema.

While we would generally suggest starting with tool or native output, in some cases this mode may result in higher quality outputs, and for models without native tool calling or structured output support it is the only option for producing structured outputs.

If the model API supports the "JSON Mode" feature (aka "JSON Object response format") to force the model to output valid JSON, this is enabled, but it's still up to the model to abide by the schema. Pydantic AI will validate the returned structured data and tell the model to try again if validation fails, but if the model is not intelligent enough this may not be sufficient.

To use this mode, you can wrap the output type(s) in the [`PromptedOutput`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.PromptedOutput) marker class that also lets you specify a `name` and `description` if the name and docstring of the type or function are not sufficient. Additionally, `template` lets you specify a custom instructions template to be used instead of the [default](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile.prompted_output_template), or `template=False` to disable the schema prompt entirely.

prompted\_output.py

```python
from pydantic import BaseModel

from pydantic_ai import Agent, PromptedOutput

from tool_output import Vehicle


class Device(BaseModel):
  name: str
  kind: str


agent = Agent(
  'openai:gpt-5.2',
  output_type=PromptedOutput(
      [Vehicle, Device], # (1)
      name='Vehicle or device',
      description='Return a vehicle or device.'
  ),
)
result = agent.run_sync('What is a MacBook?')
print(repr(result.output))
#> Device(name='MacBook', kind='laptop')

agent = Agent(
  'openai:gpt-5.2',
  output_type=PromptedOutput(
      [Vehicle, Device],
      template='Gimme some JSON: {schema}'
  ),
)
result = agent.run_sync('What is a Ford Explorer?')
print(repr(result.output))
#> Vehicle(name='Ford Explorer', wheels=4)
```

This could also have been a union: `output_type=Vehicle | Device`. However, as explained in the "Type checking considerations" section above, that would've required explicitly specifying the generic parameters on the `Agent` constructor and adding `# type: ignore` to this line in order to be type checked correctly.

_(This example is complete, it can be run "as is")_

### Custom JSON schema

If it's not feasible to define your desired structured output object using a Pydantic `BaseModel`, dataclass, or `TypedDict`, for example when you get a JSON schema from an external source or generate it dynamically, you can use the [`StructuredDict()`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.StructuredDict) helper function to generate a `dict[str, Any]` subclass with a JSON schema attached that Pydantic AI will pass to the model.

Note that Pydantic AI will not perform any validation of the received JSON object and it's up to the model to correctly interpret the schema and any constraints expressed in it, like required fields or integer value ranges.

The output type will be a `dict[str, Any]` and it's up to your code to defensively read from it in case the model made a mistake. You can use an [output validator](#output-validator-functions) to reflect validation errors back to the model and get it to try again.

Along with the JSON schema, you can optionally pass `name` and `description` arguments to provide additional context to the model:

```python
from pydantic_ai import Agent, StructuredDict

HumanDict = StructuredDict(
    {
        'type': 'object',
        'properties': {
            'name': {'type': 'string'},
            'age': {'type': 'integer'}
        },
        'required': ['name', 'age']
    },
    name='Human',
    description='A human with a name and age',
)

agent = Agent('openai:gpt-5.2', output_type=HumanDict)
result = agent.run_sync('Create a person')
#> {'name': 'John Doe', 'age': 30}
```

### Validation context

Some validation relies on an extra Pydantic [context](https://docs.pydantic.dev/latest/concepts/validators/#validation-context) object. You can pass such an object to an `Agent` at definition-time via its [`validation_context`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) parameter. It will be used in the validation of both structured outputs and [tool arguments](/docs/ai/tools-toolsets/tools-advanced#tool-retries).

This validation context can be either:

-   the context object itself (`Any`), used as-is to validate outputs, or
-   a function that takes the [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns a context object (`Any`). This function will be called automatically before each validation, allowing you to build a dynamic validation context.

Don't confuse this _validation_ context with the _LLM_ context

This Pydantic validation context object is only used internally by Pydantic AI for tool arg and output validation. In particular, it is **not** included in the prompts or messages sent to the language model.

validation\_context.py

```python
from dataclasses import dataclass

from pydantic import BaseModel, ValidationInfo, field_validator

from pydantic_ai import Agent


class Value(BaseModel):
    x: int

    @field_validator('x')
    def increment_value(cls, value: int, info: ValidationInfo):
        return value + (info.context or 0)


agent = Agent(
    'google-gla:gemini-3-flash-preview',
    output_type=Value,
    validation_context=10,
)
result = agent.run_sync('Give me a value of 5.')
print(repr(result.output))  # 5 from the model + 10 from the validation context
#> Value(x=15)


@dataclass
class Deps:
    increment: int


agent = Agent(
    'google-gla:gemini-3-flash-preview',
    output_type=Value,
    deps_type=Deps,
    validation_context=lambda ctx: ctx.deps.increment,
)
result = agent.run_sync('Give me a value of 5.', deps=Deps(increment=10))
print(repr(result.output))  # 5 from the model + 10 from the validation context
#> Value(x=15)
```

_(This example is complete, it can be run "as is")_

### Output validators

Some validation is inconvenient or impossible to do in Pydantic validators, in particular when the validation requires IO and is asynchronous. Pydantic AI provides a way to add validation functions via the [`agent.output_validator`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.output_validator) decorator.

Each [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) raised here consumes one unit of the run's output retry budget. The budget defaults to `1` and can be set on the agent via `Agent(output_retries=N)`, on a single run via `agent.run(output_retries=N)`, or per output tool via [`ToolOutput(max_retries=N)`](#tool-output). Inside the validator, [`ctx.max_retries`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.max_retries) reflects the limit that will actually stop you (the global budget on the text path, or the per-tool limit on the tool path) and [`ctx.retry`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.retry) is the global retry counter, so it stays consistent across output-tool switches within a single run. See [How output retries are enforced](/docs/ai/core-concepts/agent#how-output-retries-are-enforced) for the full enforcement model.

If you want to implement separate validation logic for different output types, it's recommended to use [output functions](#output-functions) instead, to save you from having to do `isinstance` checks inside the output validator. If you want the model to output plain text, do your own processing or validation, and then have the agent's final output be the result of your function, it's recommended to use an [output function](#output-functions) with the [`TextOutput` marker class](#text-output).

Here's a simplified variant of the [SQL Generation example](/docs/ai/examples/sql-gen):

sql\_gen.py

```python
from fake_database import DatabaseConn, QueryError
from pydantic import BaseModel

from pydantic_ai import Agent, RunContext, ModelRetry


class Success(BaseModel):
    sql_query: str


class InvalidRequest(BaseModel):
    error_message: str


Output = Success | InvalidRequest
agent = Agent[DatabaseConn, Output](
    'google-gla:gemini-3-flash-preview',
    output_type=Output,  # type: ignore
    deps_type=DatabaseConn,
    instructions='Generate PostgreSQL flavored SQL queries based on user input.',
)


@agent.output_validator
async def validate_sql(ctx: RunContext[DatabaseConn], output: Output) -> Output:
    if isinstance(output, InvalidRequest):
        return output
    try:
        await ctx.deps.execute(f'EXPLAIN {output.sql_query}')
    except QueryError as e:
        raise ModelRetry(f'Invalid query: {e}') from e
    else:
        return output


result = agent.run_sync(
    'get me users who were last active yesterday.', deps=DatabaseConn()
)
print(result.output)
#> sql_query='SELECT * FROM users WHERE last_active::date = today() - interval 1 day'
```

_(This example is complete, it can be run "as is")_

#### Handling partial output in output validators

When streaming with `run_stream()` or `run_stream_sync()`, output validators are called **multiple times** -- once for each partial output received from the model, and once for the final complete output.

You should check the [`RunContext.partial_output`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.partial_output) flag when you want to **validate only the complete result**, not intermediate partial values.

When streaming, `partial_output` is `True` for each partial output and `False` for the final complete output. For all [other run methods](/docs/ai/core-concepts/agent#running-agents), `partial_output` is always `False` as the validator is only called once with the complete output.

partial\_validation\_streaming.py

```python
from pydantic_ai import Agent, ModelRetry, RunContext

agent = Agent('openai:gpt-5.2')


@agent.output_validator
def validate_output(ctx: RunContext, output: str) -> str:
    if ctx.partial_output:
        return output

    if len(output) < 50:
        raise ModelRetry('Output is too short.')
    return output


async def main():
    async with agent.run_stream('Write a long story about a cat') as result:
        async for message in result.stream_text():
            print(message)
            #> Once upon a
            #> Once upon a time, there was
            #> Once upon a time, there was a curious cat
            #> Once upon a time, there was a curious cat named Whiskers who
            #> Once upon a time, there was a curious cat named Whiskers who loved to explore
            #> Once upon a time, there was a curious cat named Whiskers who loved to explore the world around
            #> Once upon a time, there was a curious cat named Whiskers who loved to explore the world around him...
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

## Image output

Some models can generate images as part of their response, for example those that support the [Image Generation built-in tool](/docs/ai/tools-toolsets/builtin-tools#image-generation-tool) and OpenAI models using the [Code Execution built-in tool](/docs/ai/tools-toolsets/builtin-tools#code-execution-tool) when told to generate a chart.

To use the generated image as the output of the agent run, you can set `output_type` to [`BinaryImage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryImage). If no image-generating built-in tool is explicitly specified, the [`ImageGenerationTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.ImageGenerationTool) will be enabled automatically.

image\_output.py

```python
from pydantic_ai import Agent, BinaryImage

agent = Agent('openai-responses:gpt-5.2', output_type=BinaryImage)

result = agent.run_sync('Generate an image of an axolotl.')
assert isinstance(result.output, BinaryImage)
```

_(This example is complete, it can be run "as is")_

If an agent does not need to always generate an image, you can use a union of `BinaryImage` and `str`. If the model generates both, the image will take precedence as output and the text will be available on [`ModelResponse.text`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse.text):

image\_output\_union.py

```python
from pydantic_ai import Agent, BinaryImage

agent = Agent('openai-responses:gpt-5.2', output_type=BinaryImage | str)

result = agent.run_sync('Tell me a two-sentence story about an axolotl, no image please.')
print(result.output)
"""
Once upon a time, in a hidden underwater cave, lived a curious axolotl named Pip who loved to explore. One day, while venturing further than usual, Pip discovered a shimmering, ancient coin that granted wishes!
"""

result = agent.run_sync('Tell me a two-sentence story about an axolotl with an illustration.')
assert isinstance(result.output, BinaryImage)
print(result.response.text)
"""
Once upon a time, in a hidden underwater cave, lived a curious axolotl named Pip who loved to explore. One day, while venturing further than usual, Pip discovered a shimmering, ancient coin that granted wishes!
"""
```

## Optional output (allowing `None`)

Some agents perform their work entirely through tool calls and don't need to produce a final output -- for example, an agent that updates a record via a tool and then stops. Certain models (notably [Anthropic](/docs/ai/models/anthropic)) will return an empty response in this case, which by default causes Pydantic AI to retry until the model produces content.

To instead treat an empty response as a successful run, include `None` in the `output_type`:

optional\_output.py

```python
from pydantic_ai import Agent

agent = Agent('anthropic:claude-opus-4-6', output_type=str | None)


@agent.tool_plain
def mark_task_done(task_id: int) -> str:
    """Mark the task as done."""
    return f'Task {task_id} marked done.'


result = agent.run_sync('Mark task 1 as done, then stop without saying anything.')
print(result.output)
#> None
```

When the model returns an empty response and `None` is an allowed output type, the agent will return `None` instead of retrying. [Output validator functions](#output-validator-functions) still run with `None` as the argument, so you can raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) to reject it if needed.

`output_type=str | None` is the canonical case: it's handled as regular text output, and the **only** way the model signals `None` is by returning an empty response -- there's no output tool or structured schema involved. This mirrors how plain `str` is already treated specially as free-form text output rather than a structured tool call.

`None` is also supported in the other output modes, with an extra structured commit path in addition to (or in place of) the empty-response fallback:

-   **Bare unions including `None` that use tool mode** -- e.g. `output_type=int | None`, `output_type=[int, float, None]`, or `output_type=[ToolOutput(Foo), None]`: a dedicated `final_result_NoneType` output tool is exposed alongside the other output tools, so the model can commit to `None` through a tool call. An empty model response is still also treated as `None`, as with `str | None`.
-   **Explicit output mode markers** -- e.g. `output_type=ToolOutput(int | None)`, `output_type=NativeOutput([int, None])`, or `output_type=PromptedOutput([int, None])`: `None` is included as a branch of the structured schema the wrapper generates. The model commits by calling the tool with `null` (for `ToolOutput`) or by selecting the `NoneType` branch of the discriminated schema (for `NativeOutput`/`PromptedOutput`). An empty response is **not** accepted -- once you've opted into an explicit structured output mode, the model is expected to commit through the schema.

Note

`output_type=None` on its own is not valid -- at least one other output type must be provided alongside `None`.

Note

When using `agent.run_stream()` with an optional output type, an empty model response has no intermediate values to yield, so [`stream_output()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_output) produces an empty iterator in this case. Use [`get_output()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.get_output) to retrieve the final `None` value instead.

## Streamed Results

There two main challenges with streamed results:

1.  Validating structured responses before they're complete, this is achieved by "partial validation" which was recently added to Pydantic in [pydantic/pydantic#10748](https://github.com/pydantic/pydantic/pull/10748).
2.  When receiving a response, we don't know if it's the final response without starting to stream it and peeking at the content. Pydantic AI streams just enough of the response to sniff out if it's a tool call or an output, then streams the whole thing and calls tools, or returns the stream as a [`StreamedRunResult`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult).

Note

As the `run_stream()` method will consider the first output matching the `output_type` to be the final output, it will stop running the agent graph and will not execute any tool calls made by the model after this "final" output.

If you want to always run the agent graph to completion and stream all events from the model's streaming response and the agent's execution of tools, use [`agent.run_stream_events()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream_events) ([docs](/docs/ai/core-concepts/agent#streaming-all-events)) or [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.iter) ([docs](/docs/ai/core-concepts/agent#streaming-all-events-and-output)) instead.

### Streaming Text

Example of streamed text output:

streamed\_hello\_world.py

```python
from pydantic_ai import Agent

agent = Agent('google-gla:gemini-3-flash-preview')  # (1)


async def main():
  async with agent.run_stream('Where does "hello world" come from?') as result:  # (2)
      async for message in result.stream_text():  # (3)
          print(message)
          #> The first known
          #> The first known use of "hello,
          #> The first known use of "hello, world" was in
          #> The first known use of "hello, world" was in a 1974 textbook
          #> The first known use of "hello, world" was in a 1974 textbook about the C
          #> The first known use of "hello, world" was in a 1974 textbook about the C programming language.
```

Streaming works with the standard [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) class, and doesn't require any special setup, just a model that supports streaming (currently all models support streaming).

The [`Agent.run_stream()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream) method is used to start a streamed run, this method returns a context manager so the connection can be closed when the stream completes.

Each item yield by [`StreamedRunResult.stream_text()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_text) is the complete text response, extended as new data is received.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

We can also stream text as deltas rather than the entire text in each item:

streamed\_delta\_hello\_world.py

```python
from pydantic_ai import Agent

agent = Agent('google-gla:gemini-3-flash-preview')


async def main():
  async with agent.run_stream('Where does "hello world" come from?') as result:
      async for message in result.stream_text(delta=True):  # (1)
          print(message)
          #> The first known
          #> use of "hello,
          #> world" was in
          #> a 1974 textbook
          #> about the C
          #> programming language.
```

[`stream_text`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_text) will error if the response is not text.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

Output message not included in `messages`

The final output message will **NOT** be added to result messages if you use `.stream_text(delta=True)`, see [Messages and chat history](/docs/ai/core-concepts/message-history) for more information.

### Streaming Structured Output

Here's an example of streaming a user profile as it's built:

streamed\_user\_profile.py

```python
from datetime import date

from typing_extensions import NotRequired, TypedDict

from pydantic_ai import Agent


class UserProfile(TypedDict):
    name: str
    dob: NotRequired[date]
    bio: NotRequired[str]


agent = Agent(
    'openai:gpt-5.2',
    output_type=UserProfile,
    instructions='Extract a user profile from the input',
)


async def main():
    user_input = 'My name is Ben, I was born on January 28th 1990, I like the chain the dog and the pyramid.'
    async with agent.run_stream(user_input) as result:
        async for profile in result.stream_output():
            print(profile)
            #> {'name': 'Ben'}
            #> {'name': 'Ben'}
            #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes'}
            #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the '}
            #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyr'}
            #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'}
            #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'}
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

As setting an `output_type` uses the [Tool Output](#tool-output) mode by default, this will only work if the model supports streaming tool arguments. For models that don't, like Gemini, try [Native Output](#native-output) or [Prompted Output](#prompted-output) instead.

### Streaming Model Responses

If you want fine-grained control of validation, you can use the following pattern to get the entire partial [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse):

streamed\_user\_profile.py

```python
from datetime import date

from pydantic import ValidationError
from typing_extensions import TypedDict

from pydantic_ai import Agent


class UserProfile(TypedDict, total=False):
  name: str
  dob: date
  bio: str


agent = Agent('openai:gpt-5.2', output_type=UserProfile)


async def main():
  user_input = 'My name is Ben, I was born on January 28th 1990, I like the chain the dog and the pyramid.'
  async with agent.run_stream(user_input) as result:
      async for message, last in result.stream_responses(debounce_by=0.01):  # (1)
          try:
              profile = await result.validate_response_output(  # (2)
                  message,
                  allow_partial=not last,
              )
          except ValidationError:
              continue
          print(profile)
          #> {'name': 'Ben'}
          #> {'name': 'Ben'}
          #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes'}
          #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the '}
          #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyr'}
          #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'}
          #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'}
          #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'}
```

[`stream_responses`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_responses) streams the data as [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) objects, thus iteration can't fail with a `ValidationError`.

[`validate_response_output`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.validate_response_output) validates the data, `allow_partial=True` enables pydantic's [`experimental_allow_partial` flag on `TypeAdapter`](https://docs.pydantic.dev/latest/api/pydantic/type_adapter/#pydantic.type_adapter.TypeAdapter.validate_json).

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

### Cancelling Streams

Sometimes you need to stop a streaming response before it completes: a user clicks "stop generating" in a chat UI, you've received enough data to make a decision, or you want to avoid receiving more tokens. [`run_stream()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream) and [`iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter) support explicit cancellation by closing the underlying model stream. [`run_stream_events()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream_events) should be used as an async context manager so cleanup runs deterministically when you stop consuming events.

Model support

[`OutlinesModel`](/docs/ai/api/models/outlines/#pydantic_ai.models.outlines.OutlinesModel) and the deprecated `GeminiModel` do not currently support stream cancellation. The Google, xAI, and Hugging Face SDKs expose streaming only as async iterators, which limits when [`cancel()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.cancel) can interrupt an in-flight chunk read. See the [Google](/docs/ai/models/google#streaming-cancellation), [xAI](/docs/ai/models/xai#streaming-cancellation), and [Hugging Face](/docs/ai/models/huggingface#streaming-cancellation) provider docs for the recommended pattern.

#### Cleaning up `run_stream_events`

[`run_stream_events()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream_events) returns an `AgentEventStream` that should be used as an async context manager:

stream\_cancel\_run\_stream\_events.py

```python
from pydantic_ai import Agent, FinalResultEvent, PartStartEvent

agent = Agent('openai:gpt-5.2')


async def main():
  async with agent.run_stream_events('Write a long essay about Python') as stream:  # (1)
      async for event in stream:
          if isinstance(event, PartStartEvent):
              print(f'Started: {event.part!r}')
              #> Started: TextPart(content='Python is a ')
          elif isinstance(event, FinalResultEvent):
              break  # (2)
```

Use `async with` to ensure proper cleanup of the background task and HTTP connection. Iterating `run_stream_events()` directly without `async with` is deprecated.

Breaking out of the loop leaves the `async with` block, which closes the event stream and runs cleanup.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

`run_stream_events()` does not expose a `cancel()` method. If you need an explicit model-response cancellation handle, use [`run_stream()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream) or [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter).

#### Cancelling `run_stream`

Call `cancel()` on the [`StreamedRunResult`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult) to cancel the stream:

stream\_cancel\_run\_stream.py

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')


async def main():
  async with agent.run_stream('Write a long essay about Python') as result:
      text = ''
      async for chunk in result.stream_text(delta=True):
          text += chunk
          if len(text) > 100:  # (1)
              await result.cancel()  # (2)
              break
      print(result.cancelled)  # (3)
      #> True
      print(result.response.state == 'interrupted')  # (4)
      #> True
```

Check a condition during streaming, for example whether enough text has been received.

`cancel()` tells the model provider to stop generating tokens and closes the HTTP connection when the model integration supports it.

The `cancelled` property reflects the cancellation state.

The final [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) is marked with `state='interrupted'` so that downstream code can identify incomplete responses.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

If you `break` out of `stream_text()` and then leave the surrounding `async with` block, the stream is cleaned up as the context exits. Use `cancel()` when you want to stop generation immediately instead of only stopping local consumption.

Interrupted tool calls

Cancelling or breaking out of a model response stream can leave the final [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) with incomplete tool-call arguments. Pydantic AI records the response with `state='interrupted'`, but it does not filter incomplete tool calls, synthesize tool returns, or otherwise define run-resumption behavior for those partial responses. If you are controlling the graph with [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter), stop the outer run loop as well, or check `response.state == 'interrupted'` before allowing the run to continue into tool execution.

#### Cancelling with `iter`

When using [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter) for fine-grained control over the agent graph, you can cancel the `AgentStream` inside a `ModelRequestNode.stream()` context:

stream\_cancel\_iter.py

```python
from pydantic_ai import Agent, FinalResultEvent

agent = Agent('openai:gpt-5.2')


async def main():
  async with agent.iter('Write a long essay about Python') as run:
      async for node in run:
          if Agent.is_model_request_node(node):
              async with node.stream(run.ctx) as stream:
                  async for event in stream:
                      if isinstance(event, FinalResultEvent):
                          await stream.cancel()  # (1)
                          break
```

`AgentStream.cancel()` cancels the stream at the model request level.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

#### Message History After Cancellation

When a stream is cancelled, the response is recorded with `state='interrupted'` in the message history. The history includes any partial content that was received before cancellation:

stream\_cancel\_history.py

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')


async def main():
  async with agent.run_stream('Tell me about Python') as result:
      async for text in result.stream_text(delta=True):
          break
      await result.cancel()

  messages = result.all_messages()  # (1)
  print(messages[-1].state)  # (2)
  #> interrupted
```

The message history includes the interrupted response with any partial content that was received before cancellation.

The interrupted response state lets your application decide whether to keep, inspect, or discard the partial response before reusing the history.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

Reusing interrupted history

Pydantic AI does not clean up incomplete tool calls in interrupted responses. Passing interrupted history directly into another run can therefore fail or lead to retries if the model was in the middle of emitting a tool call when cancellation happened. For now, applications that reuse interrupted history should inspect `state='interrupted'` responses and apply their own policy.

Usage tracking for cancelled streams

Token usage reported by `usage()` after cancellation is partial and provider-dependent. Pydantic AI stops pulling from the stream immediately, so final usage events may never arrive; some provider SDKs may also continue generation server-side after the local stream is closed. Do not rely on cancelled-stream usage for cost-critical accounting. For OpenAI chat completions, [`openai_continuous_usage_stats`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIChatModelSettings) can improve in-stream usage reporting by requesting cumulative usage data with each chunk, but cancelled-stream usage is still best-effort.

## Examples

The following examples demonstrate how to use streamed responses in Pydantic AI:

-   [Stream markdown](/docs/ai/examples/stream-markdown)
-   [Stream Whales](/docs/ai/examples/stream-whales)

---

# [Overview](https://pydantic.dev/docs/ai/evals/evals/)

# Overview

**Pydantic Evals** is a powerful evaluation framework for systematically testing and evaluating AI systems, from simple LLM calls to complex multi-agent applications.

## Design Philosophy

Code-First Approach

Pydantic Evals follows a code-first philosophy where all evaluation components are defined in Python. This differs from platforms with web-based configuration. You write and run evals in code, and can write the results to disk or view them in your terminal or in [Pydantic Logfire](https://logfire.pydantic.dev/docs/guides/web-ui/evals/).

Evals are an Emerging Practice

Unlike unit tests, evals are an emerging art/science. Anyone who claims to know exactly how your evals should be defined can safely be ignored. We've designed Pydantic Evals to be flexible and useful without being too opinionated.

## Quick Navigation

**Getting Started:**

-   [Installation](#installation)
-   [Quick Start](/docs/ai/evals/quick-start)
-   [Core Concepts](/docs/ai/evals/core-concepts)

**Evaluators:**

-   [Evaluators Overview](/docs/ai/evals/evaluators/overview) - Compare evaluator types and learn when to use each approach
-   [Built-in Evaluators](/docs/ai/evals/evaluators/built-in) - Complete reference for exact match, instance checks, and other ready-to-use evaluators
-   [LLM as a Judge](/docs/ai/evals/evaluators/llm-judge) - Use LLMs to evaluate subjective qualities, complex criteria, and natural language outputs
-   [Custom Evaluators](/docs/ai/evals/evaluators/custom) - Implement domain-specific scoring logic and custom evaluation metrics
-   [Span-Based Evaluation](/docs/ai/evals/evaluators/span-based) - Evaluate internal agent behavior (tool calls, execution flow) using OpenTelemetry traces. Essential for complex agents where correctness depends on _how_ the answer was reached, not just the final output. Also ensures eval assertions align with production telemetry.

**How-To Guides:**

-   [Logfire Integration](/docs/ai/evals/how-to/logfire-integration) - Visualize results
-   [Dataset Management](/docs/ai/evals/how-to/dataset-management) - Save, load, generate
-   [Concurrency & Performance](/docs/ai/evals/how-to/concurrency) - Control parallel execution
-   [Retry Strategies](/docs/ai/evals/how-to/retry-strategies) - Handle transient failures
-   [Metrics & Attributes](/docs/ai/evals/how-to/metrics-attributes) - Track custom data
-   [Case Lifecycle Hooks](/docs/ai/evals/how-to/lifecycle) - Per-case setup, teardown, and context enrichment

**Examples:**

-   [Simple Validation](/docs/ai/evals/examples/simple-validation) - Basic example

**Reference:**

-   [API Documentation](/docs/ai/api/pydantic-evals/dataset)

## Code-First Evaluation

Pydantic Evals follows a **code-first approach** where you define all evaluation components (datasets, experiments, tasks, cases and evaluators) in Python code, or as serialized data loaded by Python code. This differs from platforms with fully web-based configuration.

When you run an _Experiment_ you'll see a progress indicator and can print the results wherever you run your python code (IDE, terminal, etc). You also get a report object back that you can serialize and store or send to a notebook or other application for further visualization and analysis.

If you are using [Pydantic Logfire](https://logfire.pydantic.dev/docs/guides/web-ui/evals/), your experiment results automatically appear in the Logfire web interface for visualization, comparison, and collaborative analysis. Logfire serves as a observability layer - you write and run evals in code, then view and analyze results in the web UI.

## Installation

To install the Pydantic Evals package, run:

-   [pip](#tab-panel-6)
-   [uv](#tab-panel-7)

Terminal

```bash
pip install pydantic-evals
```

Terminal

```bash
uv add pydantic-evals
```

`pydantic-evals` does not depend on `pydantic-ai`, but has an optional dependency on `logfire` if you'd like to use OpenTelemetry traces in your evals, or send evaluation results to [logfire](https://pydantic.dev/logfire).

-   [pip](#tab-panel-8)
-   [uv](#tab-panel-9)

Terminal

```bash
pip install 'pydantic-evals[logfire]'
```

Terminal

```bash
uv add 'pydantic-evals[logfire]'
```

## Pydantic Evals Data Model

Pydantic Evals is built around a simple data model:

### Data Model Diagram

```
Dataset (1) ──────────── (Many) Case
│                        │
│                        │
└─── (Many) Experiment ──┴─── (Many) Case results
     │
     └─── (1) Task
     │
     └─── (Many) Evaluator
```

### Key Relationships

1.  **Dataset → Cases**: One Dataset contains many Cases
2.  **Dataset → Experiments**: One Dataset can be used across many Experiments over time
3.  **Experiment → Case results**: One Experiment generates results by executing each Case
4.  **Experiment → Task**: One Experiment evaluates one defined Task
5.  **Experiment → Evaluators**: One Experiment uses multiple Evaluators. Dataset-wide Evaluators are run against all Cases, and Case-specific Evaluators against their respective Cases

### Data Flow

1.  **Dataset creation**: Define cases and evaluators in YAML/JSON, or directly in Python
2.  **Experiment execution**: Run `dataset.evaluate_sync(task_function)`
3.  **Cases run**: Each Case is executed against the Task
4.  **Evaluation**: Evaluators score the Task outputs for each Case
5.  **Results**: All Case results are collected into a summary report

A metaphor

A useful metaphor (although not perfect) is to think of evals like a **Unit Testing** framework:

-   **Cases + Evaluators** are your individual unit tests - each one defines a specific scenario you want to test, complete with inputs and expected outcomes. Just like a unit test, a case asks: _"Given this input, does my system produce the right output?"_
    
-   **Datasets** are like test suites - they are the scaffolding that holds your unit tests together. They group related cases and define shared evaluation criteria that should apply across all tests in the suite.
    
-   **Experiments** are like running your entire test suite and getting a report. When you execute `dataset.evaluate_sync(my_ai_function)`, you're running all your cases against your AI system and collecting the results - just like running `pytest` and getting a summary of passes, failures, and performance metrics.
    

The key difference from traditional unit testing is that AI systems are probabilistic. If you're type checking you'll still get a simple pass/fail, but scores for text outputs are likely qualitative and/or categorical, and more open to interpretation.

For a deeper understanding, see [Core Concepts](/docs/ai/evals/core-concepts).

## Datasets and Cases

In Pydantic Evals, everything begins with [`Dataset`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset)s and [`Case`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case)s:

-   **[`Dataset`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset)**: A collection of test Cases designed for the evaluation of a specific task or function
-   **[`Case`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case)**: A single test scenario corresponding to Task inputs, with optional expected outputs, metadata, and case-specific evaluators

simple\_eval\_dataset.py

```python
from pydantic_evals import Case, Dataset

case1 = Case(
    name='simple_case',
    inputs='What is the capital of France?',
    expected_output='Paris',
    metadata={'difficulty': 'easy'},
)

dataset = Dataset(name='capital_quiz', cases=[case1])
```

_(This example is complete, it can be run "as is")_

See [Dataset Management](/docs/ai/evals/how-to/dataset-management) to learn about saving, loading, and generating datasets.

## Evaluators

[`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator)s analyze and score the results of your Task when tested against a Case.

These can be deterministic, code-based checks (such as testing model output format with a regex, or checking for the appearance of PII or sensitive data), or they can assess non-deterministic model outputs for qualities like accuracy, precision/recall, hallucinations, or instruction-following.

While both kinds of testing are useful in LLM systems, classical code-based tests are cheaper and easier than tests which require either human or machine review of model outputs.

Pydantic Evals includes several [built-in evaluators](/docs/ai/evals/evaluators/built-in) and allows you to define [custom evaluators](/docs/ai/evals/evaluators/custom):

simple\_eval\_evaluator.py

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext
from pydantic_evals.evaluators.common import IsInstance

from simple_eval_dataset import dataset

dataset.add_evaluator(IsInstance(type_name='str'))  # (1)


@dataclass
class MyEvaluator(Evaluator):
  async def evaluate(self, ctx: EvaluatorContext[str, str]) -> float:  # (2)
      if ctx.output == ctx.expected_output:
          return 1.0
      elif (
          isinstance(ctx.output, str)
          and ctx.expected_output.lower() in ctx.output.lower()
      ):
          return 0.8
      else:
          return 0.0


dataset.add_evaluator(MyEvaluator())
```

You can add built-in evaluators to a dataset using the [`add_evaluator`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.add_evaluator) method.

This custom evaluator returns a simple score based on whether the output matches the expected output.

_(This example is complete, it can be run "as is")_

Learn more:

-   [Evaluators Overview](/docs/ai/evals/evaluators/overview) - When to use different types
-   [Built-in Evaluators](/docs/ai/evals/evaluators/built-in) - Complete reference
-   [LLM Judge](/docs/ai/evals/evaluators/llm-judge) - Using LLMs as evaluators
-   [Custom Evaluators](/docs/ai/evals/evaluators/custom) - Write your own logic
-   [Span-Based Evaluation](/docs/ai/evals/evaluators/span-based) - Analyze execution traces

## Running Experiments

Performing evaluations involves running a task against all cases in a dataset, also known as running an "experiment".

Putting the above two examples together and using the more declarative `evaluators` kwarg to [`Dataset`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset):

simple\_eval\_complete.py

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import Evaluator, EvaluatorContext, IsInstance

case1 = Case(  # (1)
  name='simple_case',
  inputs='What is the capital of France?',
  expected_output='Paris',
  metadata={'difficulty': 'easy'},
)


class MyEvaluator(Evaluator[str, str]):
  def evaluate(self, ctx: EvaluatorContext[str, str]) -> float:
      if ctx.output == ctx.expected_output:
          return 1.0
      elif (
          isinstance(ctx.output, str)
          and ctx.expected_output.lower() in ctx.output.lower()
      ):
          return 0.8
      else:
          return 0.0


dataset = Dataset(
  name='capital_quiz',
  cases=[case1],
  evaluators=[IsInstance(type_name='str'), MyEvaluator()],  # (2)
)


async def guess_city(question: str) -> str:  # (3)
  return 'Paris'


report = dataset.evaluate_sync(guess_city)  # (4)
report.print(include_input=True, include_output=True, include_durations=False)  # (5)
"""
                            Evaluation Summary: guess_city
┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Case ID     ┃ Inputs                         ┃ Outputs ┃ Scores            ┃ Assertions ┃
┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ simple_case │ What is the capital of France? │ Paris   │ MyEvaluator: 1.00 │ ✔          │
├─────────────┼────────────────────────────────┼─────────┼───────────────────┼────────────┤
│ Averages    │                                │         │ MyEvaluator: 1.00 │ 100.0% ✔   │
└─────────────┴────────────────────────────────┴─────────┴───────────────────┴────────────┘
"""
```

Create a [test case](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case) as above

Create a [`Dataset`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset) with test cases and [`evaluators`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluators)

Our function to evaluate.

Run the evaluation with [`evaluate_sync`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluate_sync), which runs the function against all test cases in the dataset, and returns an [`EvaluationReport`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.EvaluationReport) object.

Print the report with [`print`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.EvaluationReport.print), which shows the results of the evaluation. We have omitted duration here just to keep the printed output from changing from run to run.

_(This example is complete, it can be run "as is")_

See [Quick Start](/docs/ai/evals/quick-start) for more examples and [Concurrency & Performance](/docs/ai/evals/how-to/concurrency) to learn about controlling parallel execution.

## API Reference

For comprehensive coverage of all classes, methods, and configuration options, see the detailed [API Reference documentation](https://ai.pydantic.dev/api/pydantic_evals/dataset/).

## Next Steps

1.  **Start with simple evaluations** using [Quick Start](/docs/ai/evals/quick-start)
2.  **Understand the data model** with [Core Concepts](/docs/ai/evals/core-concepts)
3.  **Explore built-in evaluators** in [Built-in Evaluators](/docs/ai/evals/evaluators/built-in)
4.  **Integrate with Logfire** for visualization: [Logfire Integration](/docs/ai/evals/how-to/logfire-integration)
5.  **Build comprehensive test suites** with [Dataset Management](/docs/ai/evals/how-to/dataset-management)
6.  **Implement custom evaluators** for domain-specific metrics: [Custom Evaluators](/docs/ai/evals/evaluators/custom)

---

# [Built-in Evaluators](https://pydantic.dev/docs/ai/evals/evaluators/built-in/)

# Built-in Evaluators

Pydantic Evals provides several built-in evaluators for common evaluation tasks.

## Comparison Evaluators

### EqualsExpected

Check if the output exactly equals the expected output from the case.

```python
from pydantic_evals.evaluators import EqualsExpected

EqualsExpected()
```

**Parameters:** None

**Returns:** `bool` - `True` if `ctx.output == ctx.expected_output`

**Example:**

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import EqualsExpected

dataset = Dataset(
    name='equals_expected_demo',
    cases=[
        Case(
            name='addition',
            inputs='2 + 2',
            expected_output='4',
        ),
    ],
    evaluators=[EqualsExpected()],
)
```

**Notes:**

-   Skips evaluation if `expected_output` is `None` (returns empty dict `{}`)
-   Uses Python's `==` operator, so works with any comparable types
-   For structured data, considers nested equality

* * *

### Equals

Check if the output equals a specific value.

```python
from pydantic_evals.evaluators import Equals

Equals(value='expected_result')
```

**Parameters:**

-   `value` (Any): The value to compare against
-   `evaluation_name` (str | None): Custom name for this evaluation in reports

**Returns:** `bool` - `True` if `ctx.output == value`

**Example:**

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import Equals

# Check output is always "success"
dataset = Dataset(
    name='equals_demo',
    cases=[Case(inputs='test')],
    evaluators=[
        Equals(value='success', evaluation_name='is_success'),
    ],
)
```

**Use Cases:**

-   Checking for sentinel values
-   Validating consistent outputs
-   Testing classification into specific categories

* * *

### Contains

Check if the output contains a specific value or substring.

```python
from pydantic_evals.evaluators import Contains

Contains(
    value='substring',
    case_sensitive=True,
    as_strings=False,
)
```

**Parameters:**

-   `value` (Any): The value to search for
-   `case_sensitive` (bool): Case-sensitive comparison for strings (default: `True`)
-   `as_strings` (bool): Convert both values to strings before checking (default: `False`)
-   `evaluation_name` (str | None): Custom name for this evaluation in reports

**Returns:** [`EvaluationReason`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluationReason) - Pass/fail with explanation

**Behavior:**

For **strings**: checks substring containment

-   `Contains(value='hello', case_sensitive=False)`
    -   Matches: "Hello World", "say hello", "HELLO"
    -   Doesn't match: "hi there"

For **lists/tuples**: checks membership

-   `Contains(value='apple')`
    -   Matches: `['apple', 'banana']`, `('apple',)`
    -   Doesn't match: `['apples', 'orange']`

For **dicts**: checks key-value pairs

-   `Contains(value={'name': 'Alice'})`
    -   Matches: `{'name': 'Alice', 'age': 30}`
    -   Doesn't match: `{'name': 'Bob'}`

**Example:**

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import Contains

dataset = Dataset(
    name='contains_demo',
    cases=[Case(inputs='test')],
    evaluators=[
        # Check for required keywords
        Contains(value='terms and conditions', case_sensitive=False),
        # Check for PII (fail if found)
        # Note: Use a custom evaluator that returns False when PII found
    ],
)
```

**Use Cases:**

-   Required content verification
-   Keyword detection
-   PII/sensitive data detection
-   Multi-value validation

* * *

## Type Validation

### IsInstance

Check if the output is an instance of a type with the given name.

```python
from pydantic_evals.evaluators import IsInstance

IsInstance(type_name='str')
```

**Parameters:**

-   `type_name` (str): The type name to check (uses `__name__` or `__qualname__`)
-   `evaluation_name` (str | None): Custom name for this evaluation in reports

**Returns:** [`EvaluationReason`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluationReason) - Pass/fail with type information

**Example:**

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import IsInstance

dataset = Dataset(
    name='isinstance_demo',
    cases=[Case(inputs='test')],
    evaluators=[
        # Check output is always a string
        IsInstance(type_name='str'),
        # Check for Pydantic model
        IsInstance(type_name='MyModel'),
        # Check for dict
        IsInstance(type_name='dict'),
    ],
)
```

**Notes:**

-   Matches against both `__name__` and `__qualname__` of the type
-   Works with built-in types (`str`, `int`, `dict`, `list`, etc.)
-   Works with custom classes and Pydantic models
-   Checks the entire MRO (Method Resolution Order) for inheritance

**Use Cases:**

-   Format validation
-   Structured output verification
-   Type consistency checks

* * *

## Performance Evaluation

### MaxDuration

Check if task execution time is under a maximum threshold.

```python
from datetime import timedelta

from pydantic_evals.evaluators import MaxDuration

MaxDuration(seconds=2.0)
# or
MaxDuration(seconds=timedelta(seconds=2))
```

**Parameters:**

-   `seconds` (float | timedelta): Maximum allowed duration

**Returns:** `bool` - `True` if `ctx.duration <= seconds`

**Example:**

```python
from datetime import timedelta

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import MaxDuration

dataset = Dataset(
    name='max_duration_demo',
    cases=[Case(inputs='test')],
    evaluators=[
        # SLA: must respond in under 2 seconds
        MaxDuration(seconds=2.0),
        # Or using timedelta
        MaxDuration(seconds=timedelta(milliseconds=500)),
    ],
)
```

**Use Cases:**

-   SLA compliance
-   Performance regression testing
-   Latency requirements
-   Timeout validation

**See Also:** [Concurrency & Performance](/docs/ai/evals/how-to/concurrency)

* * *

## LLM-as-a-Judge

### LLMJudge

Use an LLM to evaluate subjective qualities based on a rubric.

```python
from pydantic_evals.evaluators import LLMJudge

LLMJudge(
    rubric='Response is accurate and helpful',
    model='openai:gpt-5.2',
    include_input=False,
    include_expected_output=False,
    model_settings=None,
    score=False,
    assertion={'include_reason': True},
)
```

**Parameters:**

-   `rubric` (str): The evaluation criteria (required)
-   `model` (Model | KnownModelName | None): Model to use (default: `'openai:gpt-5.2'`)
-   `include_input` (bool): Include task inputs in the prompt (default: `False`)
-   `include_expected_output` (bool): Include expected output in the prompt (default: `False`)
-   `model_settings` (ModelSettings | None): Custom model settings
-   `score` (OutputConfig | False): Configure score output (default: `False`)
-   `assertion` (OutputConfig | False): Configure assertion output (default: includes reason)

**Returns:** Depends on `score` and `assertion` parameters (see below)

**Output Modes:**

By default, returns a **boolean assertion** with reason:

-   `LLMJudge(rubric='Response is polite')`
    -   Returns: `{'LLMJudge_pass': EvaluationReason(value=True, reason='...')}`

Return a **score** (0.0 to 1.0) instead:

-   `LLMJudge(rubric='Response quality', score={'include_reason': True}, assertion=False)`
    -   Returns: `{'LLMJudge_score': EvaluationReason(value=0.85, reason='...')}`

Return **both** score and assertion:

-   `LLMJudge(rubric='Response quality', score={'include_reason': True}, assertion={'include_reason': True})`
    -   Returns: `{'LLMJudge_score': EvaluationReason(value=0.85, reason='...'), 'LLMJudge_pass': EvaluationReason(value=True, reason='...')}`

**Customize evaluation names:**

-   `LLMJudge(rubric='Response is factually accurate', assertion={'evaluation_name': 'accuracy', 'include_reason': True})`
    -   Returns: `{'accuracy': EvaluationReason(value=True, reason='...')}`

**Example:**

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import LLMJudge

dataset = Dataset(
    name='llm_judge_demo',
    cases=[Case(inputs='test', expected_output='result')],
    evaluators=[
        # Basic accuracy check
        LLMJudge(
            rubric='Response is factually accurate',
            include_input=True,
        ),
        # Quality score with different model
        LLMJudge(
            rubric='Overall response quality',
            model='anthropic:claude-sonnet-4-6',
            score={'evaluation_name': 'quality', 'include_reason': False},
            assertion=False,
        ),
        # Check against expected output
        LLMJudge(
            rubric='Response matches the expected answer semantically',
            include_input=True,
            include_expected_output=True,
        ),
    ],
)
```

**See Also:** [LLM Judge Deep Dive](/docs/ai/evals/evaluators/llm-judge)

* * *

## Span-Based Evaluation

### HasMatchingSpan

Check if OpenTelemetry spans match a query (requires Logfire configuration).

```python
from pydantic_evals.evaluators import HasMatchingSpan

HasMatchingSpan(
    query={'name_contains': 'tool_call'},
    evaluation_name='called_tool',
)
```

**Parameters:**

-   `query` ([`SpanQuery`](/docs/ai/api/pydantic_evals/otel/#pydantic_evals.otel.SpanQuery)): Query to match against spans
-   `evaluation_name` (str | None): Custom name for this evaluation in reports

**Returns:** `bool` - `True` if any span matches the query

**Example:**

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import HasMatchingSpan

dataset = Dataset(
    name='span_check_demo',
    cases=[Case(inputs='test')],
    evaluators=[
        # Check that a specific tool was called
        HasMatchingSpan(
            query={'name_contains': 'search_database'},
            evaluation_name='used_database',
        ),
        # Check for errors
        HasMatchingSpan(
            query={'has_attributes': {'error': True}},
            evaluation_name='had_errors',
        ),
        # Check duration constraints
        HasMatchingSpan(
            query={
                'name_equals': 'llm_call',
                'max_duration': 2.0,  # seconds
            },
            evaluation_name='llm_fast_enough',
        ),
    ],
)
```

**See Also:** [Span-Based Evaluation](/docs/ai/evals/evaluators/span-based)

* * *

## Built-in Report Evaluators

In addition to the case-level evaluators above, Pydantic Evals provides report evaluators that analyze entire experiment results. These are passed via the `report_evaluators` parameter on `Dataset`.

Report Evaluator

Purpose

Output

[`ConfusionMatrixEvaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.ConfusionMatrixEvaluator)

Classification confusion matrix

`ConfusionMatrix`

[`PrecisionRecallEvaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.PrecisionRecallEvaluator)

PR curve with AUC

`PrecisionRecall`

**See:** [Report Evaluators](/docs/ai/evals/evaluators/report-evaluators) for full documentation, parameters, and examples, including how to write custom report evaluators that produce `ScalarResult` and `TableResult` analyses.

* * *

## Quick Reference Table

### Case-Level Evaluators

Evaluator

Purpose

Return Type

Cost

Speed

[`EqualsExpected`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EqualsExpected)

Exact match with expected

`bool`

Free

Instant

[`Equals`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Equals)

Equals specific value

`bool`

Free

Instant

[`Contains`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Contains)

Contains value/substring

`bool` + reason

Free

Instant

[`IsInstance`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.IsInstance)

Type validation

`bool` + reason

Free

Instant

[`MaxDuration`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.MaxDuration)

Performance threshold

`bool`

Free

Instant

[`LLMJudge`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.LLMJudge)

Subjective quality

`bool` and/or `float`

$$

Slow

[`HasMatchingSpan`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.HasMatchingSpan)

Behavioral check

`bool`

Free

Fast

### Report-Level Evaluators

Evaluator

Purpose

Output Type

Cost

Speed

[`ConfusionMatrixEvaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.ConfusionMatrixEvaluator)

Classification matrix

`ConfusionMatrix`

Free

Instant

[`PrecisionRecallEvaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.PrecisionRecallEvaluator)

PR curve with AUC

`PrecisionRecall`

Free

Instant

## Combining Evaluators

Best practice is to combine fast deterministic checks with slower LLM evaluations:

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import (
    Contains,
    IsInstance,
    LLMJudge,
    MaxDuration,
)

dataset = Dataset(
    name='combined_evaluators',
    cases=[Case(inputs='test')],
    evaluators=[
        # Fast checks first (fail fast)
        IsInstance(type_name='str'),
        Contains(value='required_field'),
        MaxDuration(seconds=2.0),
        # Expensive LLM checks last
        LLMJudge(rubric='Response is helpful and accurate'),
    ],
)
```

This approach:

1.  Catches format/structure issues immediately
2.  Validates required content quickly
3.  Only runs expensive LLM evaluation if basic checks pass
4.  Provides comprehensive quality assessment

## Next Steps

-   **[LLM Judge](/docs/ai/evals/evaluators/llm-judge)** - Deep dive on LLM-as-a-Judge evaluation
-   **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** - Write your own evaluation logic
-   **[Report Evaluators](/docs/ai/evals/evaluators/report-evaluators)** - Experiment-wide analyses (confusion matrices, PR curves, etc.)
-   **[Span-Based Evaluation](/docs/ai/evals/evaluators/span-based)** - Using OpenTelemetry spans for behavioral checks

---

# [Custom Evaluators](https://pydantic.dev/docs/ai/evals/evaluators/custom/)

# Custom Evaluators

Write custom evaluators for domain-specific logic, external integrations, or specialized metrics.

## Basic Custom Evaluator

All evaluators inherit from [`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator) and must implement `evaluate`:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class ExactMatch(Evaluator):
    """Check if output exactly matches expected output."""

    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return ctx.output == ctx.expected_output
```

**Key Points:**

-   Use `@dataclass` decorator (required)
-   Inherit from `Evaluator`
-   Implement `evaluate(self, ctx: EvaluatorContext) -> EvaluatorOutput`
-   Return `bool`, `int`, `float`, `str`, [`EvaluationReason`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluationReason), or `dict` of these

## EvaluatorContext

The context provides all information about the case execution:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class MyEvaluator(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        # Access case data
        ctx.name              # Case name
        ctx.inputs            # Task inputs
        ctx.metadata          # Case metadata
        ctx.expected_output   # Expected output (may be None)
        ctx.output            # Actual output

        # Performance data
        ctx.duration          # Task execution time (seconds)

        # Custom metrics/attributes (see metrics guide)
        ctx.metrics           # dict[str, int | float]
        ctx.attributes        # dict[str, Any]

        # OpenTelemetry spans (if logfire configured)
        ctx.span_tree         # SpanTree for behavioral checks

        return True
```

## Evaluator Parameters

Add configurable parameters as dataclass fields:

```python
from dataclasses import dataclass

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class ContainsKeyword(Evaluator):
    keyword: str
    case_sensitive: bool = True

    def evaluate(self, ctx: EvaluatorContext) -> bool:
        output = ctx.output
        keyword = self.keyword

        if not self.case_sensitive:
            output = output.lower()
            keyword = keyword.lower()

        return keyword in output


# Usage
dataset = Dataset(
    name='keyword_check',
    cases=[Case(name='test', inputs='This is important')],
    evaluators=[
        ContainsKeyword(keyword='important', case_sensitive=False),
    ],
)
```

## Return Types

### Boolean Assertions

Simple pass/fail checks:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class IsValidJSON(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        try:
            import json
            json.loads(ctx.output)
            return True
        except Exception:
            return False
```

### Numeric Scores

Quality metrics:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class LengthScore(Evaluator):
    """Score based on output length (0.0 = too short, 1.0 = ideal)."""

    ideal_length: int = 100
    tolerance: int = 20

    def evaluate(self, ctx: EvaluatorContext) -> float:
        length = len(ctx.output)
        diff = abs(length - self.ideal_length)

        if diff <= self.tolerance:
            return 1.0
        else:
            # Decay score as we move away from ideal
            score = max(0.0, 1.0 - (diff - self.tolerance) / self.ideal_length)
            return score
```

### String Labels

Categorical classifications:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class SentimentClassifier(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> str:
        output_lower = ctx.output.lower()

        if any(word in output_lower for word in ['error', 'failed', 'wrong']):
            return 'negative'
        elif any(word in output_lower for word in ['success', 'correct', 'great']):
            return 'positive'
        else:
            return 'neutral'
```

### With Reasons

Add explanations to any result:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import EvaluationReason, Evaluator, EvaluatorContext


@dataclass
class SmartCheck(Evaluator):
    threshold: float = 0.8

    def evaluate(self, ctx: EvaluatorContext) -> EvaluationReason:
        score = self._calculate_score(ctx.output)

        if score >= self.threshold:
            return EvaluationReason(
                value=True,
                reason=f'Score {score:.2f} exceeds threshold {self.threshold}',
            )
        else:
            return EvaluationReason(
                value=False,
                reason=f'Score {score:.2f} below threshold {self.threshold}',
            )

    def _calculate_score(self, output: str) -> float:
        # Your scoring logic
        return 0.75
```

### Multiple Results

You can return multiple evaluations from one evaluator by returning a dictionary of key-value pairs.

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import (
    EvaluationReason,
    Evaluator,
    EvaluatorContext,
    EvaluatorOutput,
)


@dataclass
class ComprehensiveCheck(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> EvaluatorOutput:
        format_valid = self._check_format(ctx.output)

        return {
            'valid_format': EvaluationReason(
                value=format_valid,
                reason='Valid JSON format' if format_valid else 'Invalid JSON format',
            ),
            'quality_score': self._score_quality(ctx.output),  # float
            'category': self._classify(ctx.output),  # str
        }

    def _check_format(self, output: str) -> bool:
        return output.startswith('{') and output.endswith('}')

    def _score_quality(self, output: str) -> float:
        return len(output) / 100.0

    def _classify(self, output: str) -> str:
        return 'short' if len(output) < 50 else 'long'
```

Each key in the returned dictionary becomes a separate result in the report. Values can be:

-   Primitives (`bool`, `int`, `float`, `str`)
-   [`EvaluationReason`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluationReason) (value with explanation)
-   Nested dicts of these types

The [`EvaluatorOutput`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorOutput) type represents all legal values that can be returned by an evaluator, and can be used as the return type annotation for your custom `evaluate` method.

### Conditional Results

Evaluators can dynamically choose whether to produce results for a given case by returning an empty dict when not applicable:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import (
    EvaluationReason,
    Evaluator,
    EvaluatorContext,
    EvaluatorOutput,
)


@dataclass
class SQLValidator(Evaluator):
    """Only evaluates SQL queries, skips other outputs."""

    def evaluate(self, ctx: EvaluatorContext) -> EvaluatorOutput:
        # Check if this case is relevant for SQL validation
        if not isinstance(ctx.output, str) or not ctx.output.strip().upper().startswith(
            ('SELECT', 'INSERT', 'UPDATE', 'DELETE')
        ):
            # Return empty dict - this evaluator doesn't apply to this case
            return {}

        # This is a SQL query, perform validation
        try:
            # In real implementation, use sqlparse or similar
            is_valid = self._validate_sql(ctx.output)
            return {
                'sql_valid': is_valid,
                'sql_complexity': self._measure_complexity(ctx.output),
            }
        except Exception as e:
            return {'sql_valid': EvaluationReason(False, reason=f'Exception: {e}')}

    def _validate_sql(self, query: str) -> bool:
        # Simplified validation
        return 'FROM' in query.upper() or 'INTO' in query.upper()

    def _measure_complexity(self, query: str) -> str:
        joins = query.upper().count('JOIN')
        if joins == 0:
            return 'simple'
        elif joins <= 2:
            return 'moderate'
        else:
            return 'complex'
```

This pattern is useful when:

-   An evaluator only applies to certain types of outputs (e.g., code validation only for code outputs)
-   Validation depends on metadata tags (e.g., only evaluate cases marked with `language='python'`)
-   You want to run expensive checks conditionally based on other evaluator results

**Key Points:**

-   Returning `{}` means "this evaluator doesn't apply here" - the case won't show results from this evaluator
-   Returning `{'key': value}` means "this evaluator applies and here are the results"
-   This is more practical than using case-level evaluators when it applies to a large fraction of cases, or when the condition is based on the output itself
-   The evaluator still runs for every case, but can short-circuit when not relevant

## Async Evaluators

Use `async def` for I/O-bound operations:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class APIValidator(Evaluator):
    api_url: str

    async def evaluate(self, ctx: EvaluatorContext) -> bool:
        import httpx

        async with httpx.AsyncClient() as client:
            response = await client.post(
                self.api_url,
                json={'output': ctx.output},
            )
            return response.json()['valid']
```

Pydantic Evals handles both sync and async evaluators automatically.

## Using Metadata

Access case metadata for context-aware evaluation:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class DifficultyAwareScore(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> float:
        # Base score
        base_score = self._score_output(ctx.output)

        # Adjust based on difficulty from metadata
        if ctx.metadata and 'difficulty' in ctx.metadata:
            difficulty = ctx.metadata['difficulty']

            if difficulty == 'easy':
                # Penalize mistakes more on easy questions
                return base_score
            elif difficulty == 'hard':
                # Be more lenient on hard questions
                return min(1.0, base_score * 1.2)

        return base_score

    def _score_output(self, output: str) -> float:
        # Your scoring logic
        return 0.8
```

## Using Metrics

Access custom metrics set during task execution:

```python
from dataclasses import dataclass

from pydantic_evals import increment_eval_metric, set_eval_attribute
from pydantic_evals.evaluators import Evaluator, EvaluatorContext


# In your task
def my_task(inputs: str) -> str:
    result = f'processed: {inputs}'

    # Record metrics
    increment_eval_metric('api_calls', 3)
    set_eval_attribute('used_cache', True)

    return result


# In your evaluator
@dataclass
class EfficiencyCheck(Evaluator):
    max_api_calls: int = 5

    def evaluate(self, ctx: EvaluatorContext) -> bool:
        api_calls = ctx.metrics.get('api_calls', 0)
        return api_calls <= self.max_api_calls
```

See [Metrics & Attributes Guide](/docs/ai/evals/how-to/metrics-attributes) for more.

## Generic Type Parameters

Make evaluators type-safe with generics:

```python
from dataclasses import dataclass
from typing import TypeVar

from pydantic_evals.evaluators import Evaluator, EvaluatorContext

InputsT = TypeVar('InputsT')
OutputT = TypeVar('OutputT')


@dataclass
class TypedEvaluator(Evaluator[InputsT, OutputT, dict]):
    def evaluate(self, ctx: EvaluatorContext[InputsT, OutputT, dict]) -> bool:
        # ctx.inputs and ctx.output are now properly typed
        return True
```

## Custom Evaluation Names

Control how evaluations appear in reports:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class CustomNameEvaluator(Evaluator):
    check_type: str

    def get_default_evaluation_name(self) -> str:
        # Use check_type as the name instead of class name
        return f'{self.check_type}_check'

    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return True


# In reports, appears as "format_check" instead of "CustomNameEvaluator"
evaluator = CustomNameEvaluator(check_type='format')
```

Or use the `evaluation_name` field (if using the built-in pattern):

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class MyEvaluator(Evaluator):
    evaluation_name: str | None = None

    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return True


# Usage
MyEvaluator(evaluation_name='my_custom_name')
```

## Real-World Examples

### SQL Validation

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import EvaluationReason, Evaluator, EvaluatorContext


@dataclass
class ValidSQL(Evaluator):
    dialect: str = 'postgresql'

    def evaluate(self, ctx: EvaluatorContext) -> EvaluationReason:
        try:
            import sqlparse
            parsed = sqlparse.parse(ctx.output)

            if not parsed:
                return EvaluationReason(
                    value=False,
                    reason='Could not parse SQL',
                )

            # Check for dangerous operations
            sql_upper = ctx.output.upper()
            if 'DROP' in sql_upper or 'DELETE' in sql_upper:
                return EvaluationReason(
                    value=False,
                    reason='Contains dangerous operations (DROP/DELETE)',
                )

            return EvaluationReason(
                value=True,
                reason='Valid SQL syntax',
            )
        except Exception as e:
            return EvaluationReason(
                value=False,
                reason=f'SQL parsing error: {e}',
            )
```

### Code Execution

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import EvaluationReason, Evaluator, EvaluatorContext


@dataclass
class ExecutablePython(Evaluator):
    timeout_seconds: float = 5.0

    async def evaluate(self, ctx: EvaluatorContext) -> EvaluationReason:
        import asyncio
        import os
        import tempfile

        # Write code to temp file
        with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
            f.write(ctx.output)
            temp_path = f.name

        try:
            # Execute with timeout
            process = await asyncio.create_subprocess_exec(
                'python', temp_path,
                stdout=asyncio.subprocess.PIPE,
                stderr=asyncio.subprocess.PIPE,
            )

            try:
                stdout, stderr = await asyncio.wait_for(
                    process.communicate(),
                    timeout=self.timeout_seconds,
                )
            except asyncio.TimeoutError:
                process.kill()
                return EvaluationReason(
                    value=False,
                    reason=f'Execution timeout after {self.timeout_seconds}s',
                )

            if process.returncode == 0:
                return EvaluationReason(
                    value=True,
                    reason='Code executed successfully',
                )
            else:
                return EvaluationReason(
                    value=False,
                    reason=f'Execution failed: {stderr.decode()}',
                )
        finally:
            os.unlink(temp_path)
```

### External API Validation

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class APIResponseValid(Evaluator):
    api_endpoint: str
    api_key: str

    async def evaluate(self, ctx: EvaluatorContext) -> dict[str, bool | float]:
        import httpx

        try:
            async with httpx.AsyncClient() as client:
                response = await client.post(
                    self.api_endpoint,
                    headers={'Authorization': f'Bearer {self.api_key}'},
                    json={'data': ctx.output},
                    timeout=10.0,
                )

                result = response.json()

                return {
                    'api_reachable': True,
                    'validation_passed': result.get('valid', False),
                    'confidence_score': result.get('confidence', 0.0),
                }
        except Exception:
            return {
                'api_reachable': False,
                'validation_passed': False,
                'confidence_score': 0.0,
            }
```

## Testing Evaluators

Test evaluators like any other Python code:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class ExactMatch(Evaluator):
    """Check if output exactly matches expected output."""

    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return ctx.output == ctx.expected_output


def test_exact_match():
    evaluator = ExactMatch()

    # Test match
    ctx = EvaluatorContext(
        name='test',
        inputs='input',
        metadata=None,
        expected_output='expected',
        output='expected',
        duration=0.1,
        _span_tree=None,
        attributes={},
        metrics={},
    )
    assert evaluator.evaluate(ctx) is True

    # Test mismatch
    ctx.output = 'different'
    assert evaluator.evaluate(ctx) is False
```

## Best Practices

### 1\. Keep Evaluators Focused

Each evaluator should check one thing:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


def check_format(output: str) -> bool:
    return output.startswith('{')


def check_content(output: str) -> bool:
    return len(output) > 10


def check_length(output: str) -> bool:
    return len(output) < 1000


def check_spelling(output: str) -> bool:
    return True  # Placeholder


# Bad: Doing too much
@dataclass
class EverythingChecker(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> dict:
        return {
            'format_valid': check_format(ctx.output),
            'content_good': check_content(ctx.output),
            'length_ok': check_length(ctx.output),
            'spelling_correct': check_spelling(ctx.output),
        }


# Good: Separate evaluators
@dataclass
class FormatValidator(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return check_format(ctx.output)


@dataclass
class ContentChecker(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return check_content(ctx.output)


@dataclass
class LengthChecker(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return check_length(ctx.output)


@dataclass
class SpellingChecker(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return check_spelling(ctx.output)
```

Some exceptions to this:

-   When there is a significant amount of shared computation or network request latency, it may be better to have a single evaluator calculate all dependent outputs together.
-   If multiple checks are tightly coupled or very closely related to each other, it may make sense to include all their logic in one evaluator.

### 2\. Handle Missing Data Gracefully

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import EvaluationReason, Evaluator, EvaluatorContext


@dataclass
class SafeEvaluator(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> EvaluationReason:
        if ctx.expected_output is None:
            return EvaluationReason(
                value=True,
                reason='Skipped: no expected output provided',
            )

        # Your evaluation logic
        ...
```

### 3\. Provide Helpful Reasons

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import EvaluationReason, Evaluator, EvaluatorContext


@dataclass
class HelpfulEvaluator(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> EvaluationReason:
        # Bad
        return EvaluationReason(value=False, reason='Failed')

        # Good
        return EvaluationReason(
            value=False,
            reason=f'Expected {ctx.expected_output!r}, got {ctx.output!r}',
        )
```

### 4\. Use Timeouts for External Calls

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class APIEvaluator(Evaluator):
    timeout: float = 10.0

    async def _call_api(self, output: str) -> bool:
        # Placeholder for API call
        return True

    async def evaluate(self, ctx: EvaluatorContext) -> bool:
        import asyncio

        try:
            return await asyncio.wait_for(
                self._call_api(ctx.output),
                timeout=self.timeout,
            )
        except asyncio.TimeoutError:
            return False
```

## Next Steps

-   **[Report Evaluators](/docs/ai/evals/evaluators/report-evaluators)** - Experiment-wide analyses (confusion matrices, PR curves, custom tables)
-   **[Span-Based Evaluation](/docs/ai/evals/evaluators/span-based)** - Using OpenTelemetry spans
-   **[Examples](/docs/ai/evals/examples/simple-validation)** - Practical examples

---

# [Third-Party Integrations](https://pydantic.dev/docs/ai/evals/evaluators/framework-integrations/)

# Third-Party Integrations

Pydantic Evals does not take a hard dependency on any particular metrics framework. When a team already uses [Ragas](https://github.com/explodinggradients/ragas), [DeepEval](https://github.com/confident-ai/deepeval), or another scoring library, the [`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator) base class makes it straightforward to wrap the upstream metric and run it inside any Pydantic Evals dataset. This page shows worked examples for the common ones.

Prefer a native evaluator where you can

If a rubric-based [`LLMJudge`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.LLMJudge) or a [custom evaluator](/docs/ai/evals/evaluators/custom) covers your use case, that's usually simpler -- zero extra dependencies and the scores slot into reports cleanly. Reach for the integrations below when you specifically want the _exact_ upstream implementation (for reproducibility with published benchmarks, parity with an existing evaluation suite, or features we don't expose natively). You can mix external and native evaluators in one dataset.

## Pattern

Each framework integration follows the same pattern:

1.  Subclass [`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator).
2.  Adapt `ctx.inputs`, `ctx.output`, `ctx.expected_output`, and metadata into whatever the upstream metric expects.
3.  Return a `float` score, a `bool` assertion, an [`EvaluationReason`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluationReason), or a `dict` of these.

The rest of this page shows concrete adapters. They are intentionally compact -- extend them with whatever configuration your team needs (model selection, thresholds, per-case toggles).

## Ragas

Install with `pip install ragas` (not included in `pydantic-evals`).

This adapter wraps [`ragas.metrics.Faithfulness`](https://docs.ragas.io/en/stable/concepts/metrics/available_metrics/faithfulness/) for a single-turn sample. Each case is expected to provide the retrieved context as part of its inputs or metadata.

```python
from dataclasses import dataclass

from ragas.dataset_schema import SingleTurnSample
from ragas.metrics import Faithfulness

from pydantic_evals.evaluators import EvaluationReason, Evaluator, EvaluatorContext


@dataclass
class RagasFaithfulness(Evaluator):
    """Wrap `ragas.metrics.Faithfulness` as a Pydantic Evals evaluator."""

    context_field: str = 'context'

    async def evaluate(self, ctx: EvaluatorContext) -> EvaluationReason:
        metadata = ctx.metadata or {}
        retrieved_contexts = metadata.get(self.context_field, [])
        if isinstance(retrieved_contexts, str):
            retrieved_contexts = [retrieved_contexts]

        sample = SingleTurnSample(
            user_input=str(ctx.inputs),
            response=str(ctx.output),
            retrieved_contexts=retrieved_contexts,
        )
        metric = Faithfulness()
        score = await metric.single_turn_ascore(sample)
        return EvaluationReason(value=float(score), reason=f'ragas.Faithfulness = {score:.3f}')
```

Usage is the same as any built-in evaluator:

```python
from pydantic_evals import Case, Dataset

dataset = Dataset(
    name='rag_eval',
    cases=[
        Case(
            inputs='What is the capital of France?',
            metadata={'context': ['Paris is the capital of France.']},
        ),
    ],
    evaluators=[RagasFaithfulness()],
)
```

The same pattern works for `ragas.metrics.answer_relevancy`, `context_precision`, and the other scoring metrics: swap the metric class and (if needed) the sample fields.

## DeepEval

Install with `pip install deepeval` (not included in `pydantic-evals`).

This adapter wraps [DeepEval's `GEval` metric](https://docs.confident-ai.com/docs/metrics-llm-evals) to score a criterion against a `LLMTestCase`. DeepEval's `measure` is synchronous, so the evaluator is synchronous too.

```python
from dataclasses import dataclass

from deepeval.metrics import GEval
from deepeval.test_case import LLMTestCase, LLMTestCaseParams

from pydantic_evals.evaluators import EvaluationReason, Evaluator, EvaluatorContext


@dataclass
class DeepEvalGEval(Evaluator):
    """Wrap `deepeval.metrics.GEval` as a Pydantic Evals evaluator."""

    metric_name: str
    criteria: str
    threshold: float = 0.5

    def evaluate(self, ctx: EvaluatorContext) -> dict[str, float | bool | EvaluationReason]:
        test_case = LLMTestCase(
            input=str(ctx.inputs),
            actual_output=str(ctx.output),
            expected_output=None if ctx.expected_output is None else str(ctx.expected_output),
        )
        metric = GEval(
            name=self.metric_name,
            criteria=self.criteria,
            evaluation_params=[LLMTestCaseParams.INPUT, LLMTestCaseParams.ACTUAL_OUTPUT],
            threshold=self.threshold,
        )
        metric.measure(test_case)
        return {
            f'{self.metric_name}_score': EvaluationReason(value=float(metric.score), reason=metric.reason or ''),
            f'{self.metric_name}_pass': bool(metric.success),
        }
```

The same wrapper shape works for DeepEval's `FaithfulnessMetric`, `AnswerRelevancyMetric`, `HallucinationMetric`, and others -- swap the metric class and populate the relevant `LLMTestCase` fields (for example `retrieval_context` for faithfulness).

## Notes on dependencies

-   `ragas` and `deepeval` are optional dependencies -- they are not installed with `pydantic-evals` and are not part of any dependency group. Install them only in projects that use these integrations.
-   Both libraries make their own LLM calls, so be prepared for extra API usage when running a dataset that includes these evaluators.

---

# [LLM Judge](https://pydantic.dev/docs/ai/evals/evaluators/llm-judge/)

# LLM Judge

The [`LLMJudge`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.LLMJudge) evaluator uses an LLM to assess subjective qualities of outputs based on a rubric.

## When to Use LLM-as-a-Judge

LLM judges are ideal for evaluating qualities that require understanding and judgment:

**Good Use Cases:**

-   Factual accuracy
-   Helpfulness and relevance
-   Tone and style compliance
-   Completeness of responses
-   Following complex instructions
-   RAG groundedness (does the answer use provided context?)
-   Citation accuracy

**Poor Use Cases:**

-   Format validation (use [`IsInstance`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.IsInstance) instead)
-   Exact matching (use [`EqualsExpected`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EqualsExpected))
-   Performance checks (use [`MaxDuration`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.MaxDuration))
-   Deterministic logic (write a custom evaluator)

## Basic Usage

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import LLMJudge

dataset = Dataset(
    name='factual_accuracy',
    cases=[Case(inputs='test')],
    evaluators=[
        LLMJudge(rubric='Response is factually accurate'),
    ],
)
```

## Configuration Options

### Rubric

The `rubric` is your evaluation criteria. Be specific and clear:

**Bad rubrics (vague):**

```python
from pydantic_evals.evaluators import LLMJudge

LLMJudge(rubric='Good response')  # Too vague
LLMJudge(rubric='Check quality')  # What aspect of quality?
```

**Good rubrics (specific):**

```python
from pydantic_evals.evaluators import LLMJudge

LLMJudge(rubric='Response directly answers the user question without hallucination')
LLMJudge(rubric='Response uses formal, professional language appropriate for business communication')
LLMJudge(rubric='All factual claims in the response are supported by the provided context')
```

### Including Context

Control what information the judge sees:

```python
from pydantic_evals.evaluators import LLMJudge

# Output only (default)
LLMJudge(rubric='Response is polite')

# Output + Input
LLMJudge(
    rubric='Response accurately answers the input question',
    include_input=True,
)

# Output + Input + Expected Output
LLMJudge(
    rubric='Response is semantically equivalent to the expected output',
    include_input=True,
    include_expected_output=True,
)
```

**Example:**

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import LLMJudge

dataset = Dataset(
    name='math_check',
    cases=[
        Case(
            inputs='What is 2+2?',
            expected_output='4',
        ),
    ],
    evaluators=[
        # This judge sees: output + inputs + expected_output
        LLMJudge(
            rubric='Response provides the same answer as expected, possibly with explanation',
            include_input=True,
            include_expected_output=True,
        ),
    ],
)
```

### Model Selection

Choose the judge model based on cost/quality tradeoffs:

```python
from pydantic_evals.evaluators import LLMJudge

# Default: GPT-4o (good balance)
LLMJudge(rubric='...')

# Anthropic Claude (alternative default)
LLMJudge(
    rubric='...',
    model='anthropic:claude-sonnet-4-6',
)

# Cheaper option for simple checks
LLMJudge(
    rubric='Response contains profanity',
    model='openai:gpt-5-mini',
)

# Premium option for nuanced evaluation
LLMJudge(
    rubric='Response demonstrates deep understanding of quantum mechanics',
    model='anthropic:claude-opus-4-5',
)
```

### Model Settings

Customize model behavior:

```python
from pydantic_ai import ModelSettings
from pydantic_evals.evaluators import LLMJudge

LLMJudge(
    rubric='...',
    model_settings=ModelSettings(
        temperature=0.0,  # Deterministic evaluation
        max_tokens=100,  # Shorter responses
    ),
)
```

## Output Modes

### Assertion Only (Default)

Returns pass/fail with reason:

```python
from pydantic_evals.evaluators import LLMJudge

LLMJudge(rubric='Response is accurate')
# Returns: {'LLMJudge_pass': EvaluationReason(value=True, reason='...')}
```

In reports:

```
┃ Assertions ┃
┃ ✔          ┃
```

### Score Only

Returns a numeric score (0.0 to 1.0):

```python
from pydantic_evals.evaluators import LLMJudge

LLMJudge(
    rubric='Response quality',
    score={'include_reason': True},
    assertion=False,
)
# Returns: {'LLMJudge_score': EvaluationReason(value=0.85, reason='...')}
```

In reports:

```
┃ Scores             ┃
┃ LLMJudge_score: 0.85 ┃
```

### Both Score and Assertion

```python
from pydantic_evals.evaluators import LLMJudge

LLMJudge(
    rubric='Response quality',
    score={'include_reason': True},
    assertion={'include_reason': True},
)
# Returns: {
#     'LLMJudge_score': EvaluationReason(value=0.85, reason='...'),
#     'LLMJudge_pass': EvaluationReason(value=True, reason='...'),
# }
```

### Custom Names

```python
from pydantic_evals.evaluators import LLMJudge

LLMJudge(
    rubric='Response is factually accurate',
    assertion={
        'evaluation_name': 'accuracy',
        'include_reason': True,
    },
)
# Returns: {'accuracy': EvaluationReason(value=True, reason='...')}
```

In reports:

```
┃ Assertions ┃
┃ accuracy: ✔ ┃
```

## Practical Examples

### RAG Evaluation

Evaluate whether a RAG system uses provided context:

```python
from dataclasses import dataclass

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import LLMJudge


@dataclass
class RAGInput:
    question: str
    context: str


dataset = Dataset(
    name='rag_evaluation',
    cases=[
        Case(
            inputs=RAGInput(
                question='What is the capital of France?',
                context='France is a country in Europe. Its capital is Paris.',
            ),
        ),
    ],
    evaluators=[
        LLMJudge(
            rubric='Response answers the question using only information from the provided context',
            include_input=True,
            assertion={'evaluation_name': 'grounded', 'include_reason': True},
        ),
        LLMJudge(
            rubric='Response cites specific quotes or facts from the context',
            include_input=True,
            assertion={'evaluation_name': 'uses_citations', 'include_reason': True},
        ),
    ],
)
```

### Recipe Generation with Case-Specific Rubrics

This example shows how to use both dataset-level and case-specific evaluators:

recipe\_evaluation.py

```python
from __future__ import annotations

from typing import Any

from pydantic import BaseModel

from pydantic_ai import Agent, format_as_xml
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import IsInstance, LLMJudge


class CustomerOrder(BaseModel):
  dish_name: str
  dietary_restriction: str | None = None


class Recipe(BaseModel):
  ingredients: list[str]
  steps: list[str]


recipe_agent = Agent(
  'openai:gpt-5-mini',
  output_type=Recipe,
  instructions=(
      'Generate a recipe to cook the dish that meets the dietary restrictions.'
  ),
)


async def transform_recipe(customer_order: CustomerOrder) -> Recipe:
  r = await recipe_agent.run(format_as_xml(customer_order))
  return r.output


recipe_dataset = Dataset[CustomerOrder, Recipe, Any](
  name='recipe_evaluation',
  cases=[
      Case(
          name='vegetarian_recipe',
          inputs=CustomerOrder(
              dish_name='Spaghetti Bolognese', dietary_restriction='vegetarian'
          ),
          expected_output=None,
          metadata={'focus': 'vegetarian'},
          evaluators=(  # (1)
              LLMJudge(
                  rubric='Recipe should not contain meat or animal products',
              ),
          ),
      ),
      Case(
          name='gluten_free_recipe',
          inputs=CustomerOrder(
              dish_name='Chocolate Cake', dietary_restriction='gluten-free'
          ),
          expected_output=None,
          metadata={'focus': 'gluten-free'},
          evaluators=(  # (2)
              LLMJudge(
                  rubric='Recipe should not contain gluten or wheat products',
              ),
          ),
      ),
  ],
  evaluators=[  # (3)
      IsInstance(type_name='Recipe'),
      LLMJudge(
          rubric='Recipe should have clear steps and relevant ingredients',
          include_input=True,
          model='anthropic:claude-sonnet-4-6',
      ),
  ],
)


report = recipe_dataset.evaluate_sync(transform_recipe)
print(report)
"""
   Evaluation Summary: transform_recipe
┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Case ID            ┃ Assertions ┃ Duration ┃
┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩
│ vegetarian_recipe  │ ✔✔✔        │    38.1s │
├────────────────────┼────────────┼──────────┤
│ gluten_free_recipe │ ✔✔✔        │    22.4s │
├────────────────────┼────────────┼──────────┤
│ Averages           │ 100.0% ✔   │    30.3s │
└────────────────────┴────────────┴──────────┘
"""
```

Case-specific evaluator - only runs for the vegetarian recipe case

Case-specific evaluator - only runs for the gluten-free recipe case

Dataset-level evaluators - run for all cases

### Multi-Aspect Evaluation

Use multiple judges for different quality dimensions:

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import LLMJudge

dataset = Dataset(
    name='multi_aspect',
    cases=[Case(inputs='test')],
    evaluators=[
        # Accuracy
        LLMJudge(
            rubric='Response is factually accurate',
            include_input=True,
            assertion={'evaluation_name': 'accurate'},
        ),

        # Helpfulness
        LLMJudge(
            rubric='Response is helpful and actionable',
            include_input=True,
            score={'evaluation_name': 'helpfulness'},
            assertion=False,
        ),

        # Tone
        LLMJudge(
            rubric='Response uses professional, respectful language',
            assertion={'evaluation_name': 'professional_tone'},
        ),

        # Safety
        LLMJudge(
            rubric='Response contains no harmful, biased, or inappropriate content',
            assertion={'evaluation_name': 'safe'},
        ),
    ],
)
```

### Comparative Evaluation

Compare output against expected output:

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import LLMJudge

dataset = Dataset(
    name='comparative_eval',
    cases=[
        Case(
            name='translation',
            inputs='Hello world',
            expected_output='Bonjour le monde',
        ),
    ],
    evaluators=[
        LLMJudge(
            rubric='Response is semantically equivalent to the expected output',
            include_input=True,
            include_expected_output=True,
            score={'evaluation_name': 'semantic_similarity'},
            assertion={'evaluation_name': 'correct_meaning'},
        ),
    ],
)
```

## Best Practices

### 1\. Be Specific in Rubrics

**Bad:**

```python
from pydantic_evals.evaluators import LLMJudge

LLMJudge(rubric='Good answer')
```

**Better:**

```python
from pydantic_evals.evaluators import LLMJudge

LLMJudge(rubric='Response accurately answers the question without hallucinating facts')
```

**Best:**

```python
from pydantic_evals.evaluators import LLMJudge

LLMJudge(
    rubric='''
    Response must:
    1. Directly answer the question asked
    2. Use only information from the provided context
    3. Cite specific passages from the context
    4. Acknowledge if information is insufficient
    ''',
    include_input=True,
)
```

### 2\. Use Multiple Judges

Don't always try to evaluate everything with one rubric:

```python
from pydantic_evals.evaluators import LLMJudge

# Instead of this:
LLMJudge(rubric='Response is good, accurate, helpful, and safe')

# Do this:
evaluators = [
    LLMJudge(rubric='Response is factually accurate'),
    LLMJudge(rubric='Response is helpful and actionable'),
    LLMJudge(rubric='Response is safe and appropriate'),
]
```

### 3\. Combine with Deterministic Checks

Don't use LLM evaluation for checks that can be done deterministically:

```python
from pydantic_evals.evaluators import Contains, IsInstance, LLMJudge

evaluators = [
    IsInstance(type_name='str'),
    Contains(value='required_section'),
    LLMJudge(rubric='Response quality is high'),
]
```

### 4\. Use Temperature 0 for Consistency

```python
from pydantic_ai import ModelSettings
from pydantic_evals.evaluators import LLMJudge

LLMJudge(
    rubric='...',
    model_settings=ModelSettings(temperature=0.0),
)
```

## Limitations

### Non-Determinism

LLM judges are not deterministic. The same output may receive different scores across runs.

**Mitigation:**

-   Use `temperature=0.0` for more consistency
-   Run multiple evaluations and average
-   Use retry strategies for flaky evaluations

### Cost

LLM judges make API calls, which cost money and time.

**Mitigation:**

-   Use cheaper models for simple checks (`gpt-5-mini`)
-   Run deterministic checks first to fail fast
-   Cache results when possible
-   Limit evaluation to changed cases

### Model Biases

LLM judges inherit biases from their training data.

**Mitigation:**

-   Use multiple judge models and compare
-   Review evaluation reasons, not just scores
-   Validate judges against human-labeled test sets
-   Be aware of known biases (length bias, style preferences)

### Context Limits

Judges have token limits for inputs.

**Mitigation:**

-   Truncate long inputs/outputs intelligently
-   Use focused rubrics that don't require full context
-   Consider chunked evaluation for very long content

## Debugging LLM Judges

### View Reasons

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import LLMJudge


def my_task(inputs: str) -> str:
    return f'Result: {inputs}'


dataset = Dataset(
    name='debug_reasons',
    cases=[Case(inputs='test')],
    evaluators=[LLMJudge(rubric='Response is clear')],
)
report = dataset.evaluate_sync(my_task)
report.print(include_reasons=True)
"""
     Evaluation Summary: my_task
┏━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Case ID  ┃ Assertions  ┃ Duration ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ Case 1   │ LLMJudge: ✔ │     10ms │
│          │   Reason: - │          │
│          │             │          │
│          │             │          │
├──────────┼─────────────┼──────────┤
│ Averages │ 100.0% ✔    │     10ms │
└──────────┴─────────────┴──────────┘
"""
```

Output:

```
┃ Assertions              ┃
┃ accuracy: ✔            ┃
┃   Reason: The response │
┃   correctly states...  │
```

### Access Programmatically

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import LLMJudge


def my_task(inputs: str) -> str:
    return f'Result: {inputs}'


dataset = Dataset(
    name='programmatic_access',
    cases=[Case(inputs='test')],
    evaluators=[LLMJudge(rubric='Response is clear')],
)
report = dataset.evaluate_sync(my_task)
for case in report.cases:
    for name, result in case.assertions.items():
        print(f'{name}: {result.value}')
        #> LLMJudge: True
        if result.reason:
            print(f'  Reason: {result.reason}')
            #>   Reason: -
```

### Compare Judges

Test the same cases with different judge models:

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import LLMJudge


def my_task(inputs: str) -> str:
    return f'Result: {inputs}'


judges = [
    LLMJudge(rubric='Response is clear', model='openai:gpt-5.2'),
    LLMJudge(rubric='Response is clear', model='anthropic:claude-sonnet-4-6'),
    LLMJudge(rubric='Response is clear', model='openai:gpt-5-mini'),
]

for judge in judges:
    dataset = Dataset(name='judge_comparison', cases=[Case(inputs='test')], evaluators=[judge])
    report = dataset.evaluate_sync(my_task)
    # Compare results
```

## Advanced: Custom Judge Models

Set a default judge model for all `LLMJudge` evaluators:

```python
from pydantic_evals.evaluators import LLMJudge
from pydantic_evals.evaluators.llm_as_a_judge import set_default_judge_model

# Set default to Claude
set_default_judge_model('anthropic:claude-sonnet-4-6')

# Now all LLMJudge instances use Claude by default
LLMJudge(rubric='...')  # Uses Claude
```

## Next Steps

-   **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** - Write custom evaluation logic
-   **[Built-in Evaluators](/docs/ai/evals/evaluators/built-in)** - Complete evaluator reference

---

# [Overview](https://pydantic.dev/docs/ai/evals/evaluators/overview/)

# Overview

Evaluators are the core of Pydantic Evals. They analyze task outputs and provide scores, labels, or pass/fail assertions.

## When to Use Different Evaluators

### Deterministic Checks (Fast & Reliable)

Use deterministic evaluators when you can define exact rules:

Evaluator

Use Case

Example

[`EqualsExpected`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EqualsExpected)

Exact output match

Structured data, classification

[`Equals`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Equals)

Equals specific value

Checking for sentinel values

[`Contains`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Contains)

Substring/element check

Required keywords, PII detection

[`IsInstance`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.IsInstance)

Type validation

Format validation

[`MaxDuration`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.MaxDuration)

Performance threshold

SLA compliance

[`HasMatchingSpan`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.HasMatchingSpan)

Behavior verification

Tool calls, code paths

**Advantages:**

-   Fast execution (microseconds to milliseconds)
-   Deterministic results
-   No cost
-   Easy to debug

**When to use:**

-   Format validation (JSON structure, type checking)
-   Required content checks (must contain X, must not contain Y)
-   Performance requirements (latency, token counts)
-   Behavioral checks (which tools were called, which code paths executed)

### LLM-as-a-Judge (Flexible & Nuanced)

Use [`LLMJudge`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.LLMJudge) when evaluation requires understanding or judgment:

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import LLMJudge

dataset = Dataset(
    name='llm_judge_example',
    cases=[Case(inputs='What is 2+2?', expected_output='4')],
    evaluators=[
        LLMJudge(
            rubric='Response is factually accurate based on the input',
            include_input=True,
        )
    ],
)
```

**Advantages:**

-   Can evaluate subjective qualities (helpfulness, tone, creativity)
-   Understands natural language
-   Can follow complex rubrics
-   Flexible across domains

**Disadvantages:**

-   Slower (seconds per evaluation)
-   Costs money
-   Non-deterministic
-   Can have biases

**When to use:**

-   Factual accuracy
-   Relevance and helpfulness
-   Tone and style
-   Completeness
-   Following instructions
-   RAG quality (groundedness, citation accuracy)

### Custom Evaluators

Custom evaluators can be useful if you want to make use of any evaluation logic we don't provide with the framework. They are frequently useful for domain-specific logic:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class ValidSQL(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        try:
            import sqlparse
            sqlparse.parse(ctx.output)
            return True
        except Exception:
            return False
```

**When to use:**

-   Domain-specific validation (SQL syntax, regex patterns, business rules)
-   External API calls (running generated code, checking databases)
-   Complex calculations (precision/recall, BLEU scores)
-   Integration checks (does API call succeed?)

## Evaluation Types

Detailed Return Types Guide

For full detail about precisely what custom Evaluators may return, see [Custom Evaluator Return Types](/docs/ai/evals/evaluators/custom#return-types).

Evaluators essentially return three types of results:

### 1\. Assertions (bool)

Pass/fail checks that appear as ✔ or ✗ in reports:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class HasKeyword(Evaluator):
    keyword: str

    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return self.keyword in ctx.output
```

**Use for:** Binary checks, quality gates, compliance requirements

### 2\. Scores (int or float)

Numeric metrics:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class ConfidenceScore(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> float:
        # Analyze and return score
        return 0.87  # 87% confidence
```

**Use for:** Quality metrics, ranking, A/B testing, regression tracking

### 3\. Labels (str)

Categorical classifications:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class SentimentClassifier(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> str:
        if 'error' in ctx.output.lower():
            return 'error'
        elif 'success' in ctx.output.lower():
            return 'success'
        return 'neutral'
```

**Use for:** Classification, error categorization, quality buckets

### Multiple Results

You can return multiple evaluations from a single evaluator:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class ComprehensiveCheck(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> dict[str, bool | float | str]:
        return {
            'valid_format': self._check_format(ctx.output),  # bool
            'quality_score': self._score_quality(ctx.output),  # float
            'category': self._classify(ctx.output),  # str
        }

    def _check_format(self, output: str) -> bool:
        return True

    def _score_quality(self, output: str) -> float:
        return 0.85

    def _classify(self, output: str) -> str:
        return 'good'
```

## Combining Evaluators

Mix and match evaluators to create comprehensive evaluation suites:

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import (
    Contains,
    IsInstance,
    LLMJudge,
    MaxDuration,
)

dataset = Dataset(
    name='layered_evaluation',
    cases=[Case(inputs='test', expected_output='result')],
    evaluators=[
        # Fast deterministic checks first
        IsInstance(type_name='str'),
        Contains(value='required_field'),
        MaxDuration(seconds=2.0),
        # Slower LLM checks after
        LLMJudge(
            rubric='Response is accurate and helpful',
            include_input=True,
        ),
    ],
)
```

## Case-specific evaluators

Case-specific evaluators are one of the most powerful features for building comprehensive evaluation suites. You can attach evaluators to individual [`Case`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case) objects that only run for those specific cases:

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import IsInstance, LLMJudge

dataset = Dataset(
    name='case_specific_evaluators',
    cases=[
        Case(
            name='greeting_response',
            inputs='Say hello',
            evaluators=[
                # This evaluator only runs for this case
                LLMJudge(
                    rubric='Response is warm and friendly, uses casual tone',
                    include_input=True,
                ),
            ],
        ),
        Case(
            name='formal_response',
            inputs='Write a business email',
            evaluators=[
                # Different requirements for this case
                LLMJudge(
                    rubric='Response is professional and formal, uses business language',
                    include_input=True,
                ),
            ],
        ),
    ],
    evaluators=[
        # This runs for ALL cases
        IsInstance(type_name='str'),
    ],
)
```

### Why Case-Specific Evaluators Matter

Case-specific evaluators solve a fundamental problem with one-size-fits-all evaluation: **if you could write a single evaluator rubric that perfectly captured your requirements across all cases, you'd just incorporate that rubric into your agent's instructions**. (Note: this is less relevant in cases where you want to use a cheaper model in production and assess it using a more expensive model, but in many cases it makes sense to use the best model you can in production.)

The power of case-specific evaluation comes from the nuance:

-   **Different cases have different requirements**: A customer support response needs empathy; a technical API response needs precision
-   **Avoid "inmates running the asylum"**: If your LLMJudge rubric is generic enough to work everywhere, your agent should already be following it
-   **Capture nuanced golden behavior**: Each case can specify exactly what "good" looks like for that scenario

### Building Golden Datasets with Case-Specific LLMJudge

A particularly powerful pattern is using case-specific [`LLMJudge`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.LLMJudge) evaluators to quickly build comprehensive, maintainable evaluation suites. Instead of needing exact `expected_output` values, you can describe what you care about:

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import LLMJudge

dataset = Dataset(
    name='golden_dataset',
    cases=[
        Case(
            name='handle_refund_request',
            inputs={'query': 'I want my money back', 'order_id': '12345'},
            evaluators=[
                LLMJudge(
                    rubric="""
                    Response should:
                    1. Acknowledge the refund request empathetically
                    2. Ask for the reason for the refund
                    3. Mention our 30-day refund policy
                    4. NOT process the refund immediately (needs manager approval)
                    """,
                    include_input=True,
                ),
            ],
        ),
        Case(
            name='handle_shipping_question',
            inputs={'query': 'Where is my order?', 'order_id': '12345'},
            evaluators=[
                LLMJudge(
                    rubric="""
                    Response should:
                    1. Confirm the order number
                    2. Provide tracking information
                    3. Give estimated delivery date
                    4. Be brief and factual (not overly apologetic)
                    """,
                    include_input=True,
                ),
            ],
        ),
        Case(
            name='handle_angry_customer',
            inputs={'query': 'This is completely unacceptable!', 'order_id': '12345'},
            evaluators=[
                LLMJudge(
                    rubric="""
                    Response should:
                    1. Prioritize de-escalation with empathy
                    2. Avoid being defensive
                    3. Offer concrete next steps
                    4. Use phrases like "I understand" and "Let me help"
                    """,
                    include_input=True,
                ),
            ],
        ),
    ],
)
```

This approach lets you:

-   **Build comprehensive test suites quickly**: Just describe what you want per case
-   **Maintain easily**: Update rubrics as requirements change, without regenerating outputs
-   **Cover edge cases naturally**: Add new cases with specific requirements as you discover them
-   **Capture domain knowledge**: Each rubric documents what "good" means for that scenario

The LLM evaluator excels at understanding nuanced requirements and assessing compliance, making this a practical way to create thorough evaluation coverage without brittleness.

## Async vs Sync

Evaluators can be sync or async:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class SyncEvaluator(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return True


async def some_async_operation() -> bool:
    return True


@dataclass
class AsyncEvaluator(Evaluator):
    async def evaluate(self, ctx: EvaluatorContext) -> bool:
        result = await some_async_operation()
        return result
```

Pydantic Evals handles both automatically. Use async when:

-   Making API calls
-   Running database queries
-   Performing I/O operations
-   Calling LLMs (like [`LLMJudge`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.LLMJudge))

## Evaluation Context

All evaluators receive an [`EvaluatorContext`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorContext):

-   `ctx.inputs` - Task inputs
-   `ctx.output` - Task output (to evaluate)
-   `ctx.expected_output` - Expected output (if provided)
-   `ctx.metadata` - Case metadata (if provided)
-   `ctx.duration` - Task execution time (seconds)
-   `ctx.span_tree` - OpenTelemetry spans (if logfire configured)
-   `ctx.metrics` - Custom metrics dict
-   `ctx.attributes` - Custom attributes dict

This gives evaluators full context to make informed assessments.

## Error Handling

If an evaluator raises an exception, it's captured as an [`EvaluatorFailure`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorFailure):

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


def risky_operation(output: str) -> bool:
    # This might raise an exception
    if 'error' in output:
        raise ValueError('Found error in output')
    return True


@dataclass
class RiskyEvaluator(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        # If this raises an exception, it will be captured
        result = risky_operation(ctx.output)
        return result
```

Failures appear in `report.cases[i].evaluator_failures` with:

-   Evaluator name
-   Error message
-   Full stacktrace

Use retry configuration to handle transient failures (see [Retry Strategies](/docs/ai/evals/how-to/retry-strategies)).

## Report Evaluators (Experiment-Wide)

All the evaluators above run once per case. **Report evaluators** are different: they run once per experiment after all cases have been evaluated, and analyze the full set of results together.

Use report evaluators for experiment-wide statistics like:

-   **Confusion matrices** -- visualize classification accuracy across classes
-   **Precision-recall curves** -- assess ranking quality with AUC scores
-   **Scalar metrics** -- overall accuracy, F1, BLEU, or any single number
-   **Summary tables** -- per-class breakdowns, error category summaries

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import ConfusionMatrixEvaluator

dataset = Dataset(
    name='report_evaluator_example',
    cases=[
        Case(inputs='meow', expected_output='cat'),
        Case(inputs='woof', expected_output='dog'),
    ],
    report_evaluators=[
        ConfusionMatrixEvaluator(
            predicted_from='output',
            expected_from='expected_output',
        ),
    ],
)
```

**See:** [Report Evaluators](/docs/ai/evals/evaluators/report-evaluators) for the full guide, including built-in report evaluators and how to write custom ones.

## Next Steps

-   **[Built-in Evaluators](/docs/ai/evals/evaluators/built-in)** - Complete reference of all provided evaluators
-   **[LLM Judge](/docs/ai/evals/evaluators/llm-judge)** - Deep dive on LLM-as-a-Judge evaluation
-   **[Third-Party Integrations](/docs/ai/evals/evaluators/framework-integrations)** - Wrap Ragas, DeepEval, and other metrics libraries
-   **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** - Write your own evaluation logic
-   **[Report Evaluators](/docs/ai/evals/evaluators/report-evaluators)** - Experiment-wide analyses
-   **[Span-Based Evaluation](/docs/ai/evals/evaluators/span-based)** - Evaluate using OpenTelemetry spans

---

# [Report Evaluators](https://pydantic.dev/docs/ai/evals/evaluators/report-evaluators/)

# Report Evaluators

Report evaluators analyze entire experiment results rather than individual cases. Use them to compute experiment-wide statistics like confusion matrices, precision-recall curves, accuracy scores, or custom summary tables.

## How Report Evaluators Work

Regular [evaluators](/docs/ai/evals/evaluators/overview) run once per case and assess individual outputs. Report evaluators run once per experiment _after_ all cases have been evaluated, receiving the full [`EvaluationReport`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.EvaluationReport) as input.

```
Cases executed → Case evaluators run → Report evaluators run → Final report
```

Results from report evaluators are stored as **analyses** on the report and, when Logfire is configured, are attached to the experiment span as structured attributes for visualization.

## Using Report Evaluators

Pass report evaluators to `Dataset` via the `report_evaluators` parameter:

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import ConfusionMatrixEvaluator


def my_classifier(text: str) -> str:
    text = text.lower()
    if 'cat' in text or 'meow' in text:
        return 'cat'
    elif 'dog' in text or 'bark' in text:
        return 'dog'
    return 'unknown'


dataset = Dataset(
    name='animal_classifier',
    cases=[
        Case(name='cat', inputs='The cat goes meow', expected_output='cat'),
        Case(name='dog', inputs='The dog barks', expected_output='dog'),
    ],
    report_evaluators=[
        ConfusionMatrixEvaluator(
            predicted_from='output',
            expected_from='expected_output',
            title='Animal Classification',
        ),
    ],
)

report = dataset.evaluate_sync(my_classifier)
# report.analyses contains the ConfusionMatrix result
```

## Built-in Report Evaluators

### ConfusionMatrixEvaluator

Builds a confusion matrix comparing predicted vs expected labels across all cases.

```python
from pydantic_evals.evaluators import ConfusionMatrixEvaluator

ConfusionMatrixEvaluator(
    predicted_from='output',
    expected_from='expected_output',
    title='My Confusion Matrix',
)
```

**Parameters:**

Parameter

Type

Default

Description

`predicted_from`

`'expected_output' | 'output' | 'metadata' | 'labels'`

`'output'`

Source for predicted values

`predicted_key`

`str | None`

`None`

Key to extract when using `metadata` or `labels`

`expected_from`

`'expected_output' | 'output' | 'metadata' | 'labels'`

`'expected_output'`

Source for expected/true values

`expected_key`

`str | None`

`None`

Key to extract when using `metadata` or `labels`

`title`

`str`

`'Confusion Matrix'`

Title shown in reports

**Returns:** [`ConfusionMatrix`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.ConfusionMatrix)

**Data Sources:**

-   `'output'` -- the task's actual output (converted to string)
-   `'expected_output'` -- the case's expected output (converted to string)
-   `'metadata'` -- a value from the case's metadata dict (requires `key`)
-   `'labels'` -- a label result from a case-level evaluator (requires `key`)

**Example -- classification with expected outputs:**

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import ConfusionMatrixEvaluator

dataset = Dataset(
    name='animal_sounds',
    cases=[
        Case(inputs='meow', expected_output='cat'),
        Case(inputs='woof', expected_output='dog'),
        Case(inputs='chirp', expected_output='bird'),
    ],
    report_evaluators=[
        ConfusionMatrixEvaluator(
            predicted_from='output',
            expected_from='expected_output',
        ),
    ],
)
```

**Example -- using evaluator labels:**

If a case-level evaluator produces a label like `predicted_class`, you can reference it:

```python
from dataclasses import dataclass

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import (
    ConfusionMatrixEvaluator,
    Evaluator,
    EvaluatorContext,
)


@dataclass
class ClassifyOutput(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> dict[str, str]:
        # Classify the output into a category
        return {'predicted_class': categorize(ctx.output)}


def categorize(output: str) -> str:
    return 'positive' if 'good' in output.lower() else 'negative'


dataset = Dataset(
    name='labels_example',
    cases=[Case(inputs='test', expected_output='positive')],
    evaluators=[ClassifyOutput()],
    report_evaluators=[
        ConfusionMatrixEvaluator(
            predicted_from='labels',
            predicted_key='predicted_class',
            expected_from='expected_output',
        ),
    ],
)
```

* * *

### PrecisionRecallEvaluator

Computes a precision-recall curve with AUC (area under the curve) from numeric scores and binary ground-truth labels.

```python
from pydantic_evals.evaluators import PrecisionRecallEvaluator

PrecisionRecallEvaluator(
    score_from='scores',
    score_key='confidence',
    positive_from='assertions',
    positive_key='is_correct',
)
```

**Parameters:**

Parameter

Type

Default

Description

`score_key`

`str`

_(required)_

Key in scores or metrics dict

`positive_from`

`'expected_output' | 'assertions' | 'labels'`

_(required)_

Source for ground-truth binary labels

`positive_key`

`str | None`

`None`

Key in assertions or labels dict

`score_from`

`'scores' | 'metrics'`

`'scores'`

Source for numeric scores

`title`

`str`

`'Precision-Recall Curve'`

Title shown in reports

`n_thresholds`

`int`

`100`

Number of threshold points on the curve

**Returns:** [`PrecisionRecall`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.PrecisionRecall) + [`ScalarResult`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.ScalarResult) (AUC)

The AUC is computed at full resolution (using every unique score as a threshold) for accuracy, then the curve points are downsampled to `n_thresholds` for display. The AUC is returned both on the curve (for chart rendering) and as a separate `ScalarResult` for querying and sorting.

**Score Sources:**

-   `'scores'` -- a numeric score from a case-level evaluator (looked up by `score_key`)
-   `'metrics'` -- a custom metric set during task execution (looked up by `score_key`)

**Positive Sources:**

-   `'assertions'` -- a boolean assertion from a case-level evaluator (looked up by `positive_key`)
-   `'labels'` -- a label result cast to boolean (looked up by `positive_key`)
-   `'expected_output'` -- the case's expected output cast to boolean

**Example:**

```python
from dataclasses import dataclass
from typing import Any

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import (
    Evaluator,
    EvaluatorContext,
    PrecisionRecallEvaluator,
)


@dataclass
class ConfidenceEvaluator(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> dict[str, Any]:
        confidence = calculate_confidence(ctx.output)
        return {
            'confidence': confidence,      # numeric score
            'is_correct': ctx.output == ctx.expected_output,  # boolean assertion
        }


def calculate_confidence(output: str) -> float:
    return 0.85  # placeholder


dataset = Dataset(
    name='precision_recall_example',
    cases=[
        Case(inputs='test 1', expected_output='cat'),
        Case(inputs='test 2', expected_output='dog'),
    ],
    evaluators=[ConfidenceEvaluator()],
    report_evaluators=[
        PrecisionRecallEvaluator(
            score_from='scores',
            score_key='confidence',
            positive_from='assertions',
            positive_key='is_correct',
        ),
    ],
)
```

* * *

### ROCAUCEvaluator

Computes an ROC (Receiver Operating Characteristic) curve and AUC from numeric scores and binary ground-truth labels. The ROC curve plots the True Positive Rate against the False Positive Rate at various threshold values, with a dashed random-baseline diagonal for reference.

```python
from pydantic_evals.evaluators import ROCAUCEvaluator

ROCAUCEvaluator(
    score_key='confidence',
    positive_from='assertions',
    positive_key='is_correct',
)
```

**Parameters:**

Parameter

Type

Default

Description

`score_key`

`str`

_(required)_

Key in scores or metrics dict

`positive_from`

`'expected_output' | 'assertions' | 'labels'`

_(required)_

Source for ground-truth binary labels

`positive_key`

`str | None`

`None`

Key in assertions or labels dict

`score_from`

`'scores' | 'metrics'`

`'scores'`

Source for numeric scores

`title`

`str`

`'ROC Curve'`

Title shown in reports

`n_thresholds`

`int`

`100`

Number of threshold points on the curve

**Returns:** [`LinePlot`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.LinePlot) + [`ScalarResult`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.ScalarResult) (AUC)

The AUC is computed at full resolution. The chart includes a dashed "Random" baseline diagonal from (0, 0) to (1, 1) for visual comparison.

**Score and Positive Sources:** Same as [`PrecisionRecallEvaluator`](#precisionrecallevaluator).

* * *

### KolmogorovSmirnovEvaluator

Computes a Kolmogorov-Smirnov plot and KS statistic from numeric scores and binary ground-truth labels. The KS plot shows the empirical CDFs (cumulative distribution functions) of the score distribution for positive and negative cases. The KS statistic is the maximum vertical distance between the two CDFs -- higher values indicate better class separation.

```python
from pydantic_evals.evaluators import KolmogorovSmirnovEvaluator

KolmogorovSmirnovEvaluator(
    score_key='confidence',
    positive_from='assertions',
    positive_key='is_correct',
)
```

**Parameters:**

Parameter

Type

Default

Description

`score_key`

`str`

_(required)_

Key in scores or metrics dict

`positive_from`

`'expected_output' | 'assertions' | 'labels'`

_(required)_

Source for ground-truth binary labels

`positive_key`

`str | None`

`None`

Key in assertions or labels dict

`score_from`

`'scores' | 'metrics'`

`'scores'`

Source for numeric scores

`title`

`str`

`'KS Plot'`

Title shown in reports

`n_thresholds`

`int`

`100`

Number of threshold points on the curve

**Returns:** [`LinePlot`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.LinePlot) + [`ScalarResult`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.ScalarResult) (KS Statistic)

**Score and Positive Sources:** Same as [`PrecisionRecallEvaluator`](#precisionrecallevaluator).

* * *

## Custom Report Evaluators

Write custom report evaluators by inheriting from [`ReportEvaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.ReportEvaluator) and implementing the `evaluate` method:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import ReportEvaluator, ReportEvaluatorContext
from pydantic_evals.reporting.analyses import ScalarResult


@dataclass
class AccuracyEvaluator(ReportEvaluator):
    """Computes overall accuracy as a scalar metric."""

    def evaluate(self, ctx: ReportEvaluatorContext) -> ScalarResult:
        cases = ctx.report.cases
        if not cases:
            return ScalarResult(title='Accuracy', value=0.0, unit='%')

        correct = sum(
            1 for case in cases
            if case.output == case.expected_output
        )
        accuracy = correct / len(cases) * 100
        return ScalarResult(title='Accuracy', value=accuracy, unit='%')
```

### ReportEvaluatorContext

The context passed to `evaluate()` contains:

-   `ctx.name` -- the experiment name
-   `ctx.report` -- the full [`EvaluationReport`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.EvaluationReport) with all case results
-   `ctx.experiment_metadata` -- optional experiment-level metadata dict

Through `ctx.report.cases`, you can access each case's inputs, outputs, expected outputs, scores, labels, assertions, metrics, and attributes.

### Return Types

Report evaluators must return a `ReportAnalysis` or a `list[ReportAnalysis]`. The available analysis types are:

#### ScalarResult

A single numeric statistic:

```python
from pydantic_evals.reporting.analyses import ScalarResult

ScalarResult(
    title='Accuracy',
    value=93.3,
    unit='%',
    description='Percentage of correctly classified cases.',
)
```

Field

Type

Description

`title`

`str`

Display name

`value`

`float | int`

The numeric value

`unit`

`str | None`

Optional unit label (e.g., `'%'`, `'ms'`)

`description`

`str | None`

Optional longer description

* * *

#### TableResult

A generic table of data:

```python
from pydantic_evals.reporting.analyses import TableResult

TableResult(
    title='Per-Class Metrics',
    columns=['Class', 'Precision', 'Recall', 'F1'],
    rows=[
        ['cat', 0.95, 0.90, 0.924],
        ['dog', 0.88, 0.92, 0.899],
    ],
    description='Precision, recall, and F1 per class.',
)
```

Field

Type

Description

`title`

`str`

Display name

`columns`

`list[str]`

Column headers

`rows`

`list[list[str | int | float | bool | None]]`

Row data

`description`

`str | None`

Optional longer description

* * *

#### ConfusionMatrix

A confusion matrix (typically produced by `ConfusionMatrixEvaluator`, but can be constructed directly):

```python
from pydantic_evals.reporting.analyses import ConfusionMatrix

ConfusionMatrix(
    title='Sentiment',
    class_labels=['positive', 'negative', 'neutral'],
    matrix=[
        [45, 3, 2],   # expected=positive
        [5, 40, 5],   # expected=negative
        [1, 2, 47],   # expected=neutral
    ],
)
```

Field

Type

Description

`title`

`str`

Display name

`class_labels`

`list[str]`

Ordered labels for both axes

`matrix`

`list[list[int]]`

`matrixexpected` = count

`description`

`str | None`

Optional longer description

* * *

#### PrecisionRecall

Precision-recall curve data (typically produced by `PrecisionRecallEvaluator`):

Field

Type

Description

`title`

`str`

Display name

`curves`

`list[PrecisionRecallCurve]`

One or more curves

`description`

`str | None`

Optional longer description

Each `PrecisionRecallCurve` contains a `name`, a list of `PrecisionRecallPoint`s (with `threshold`, `precision`, `recall`), and an optional `auc` value.

* * *

#### LinePlot

A generic XY line chart with labeled axes, supporting multiple curves. Use this for ROC curves, KS plots, calibration curves, or any custom line chart:

```python
from pydantic_evals.reporting.analyses import LinePlot, LinePlotCurve, LinePlotPoint

LinePlot(
    title='ROC Curve',
    x_label='False Positive Rate',
    y_label='True Positive Rate',
    x_range=(0, 1),
    y_range=(0, 1),
    curves=[
        LinePlotCurve(
            name='Model (AUC: 0.95)',
            points=[LinePlotPoint(x=0.0, y=0.0), LinePlotPoint(x=0.1, y=0.8), LinePlotPoint(x=1.0, y=1.0)],
        ),
        LinePlotCurve(
            name='Random',
            points=[LinePlotPoint(x=0, y=0), LinePlotPoint(x=1, y=1)],
            style='dashed',
        ),
    ],
)
```

Field

Type

Description

`title`

`str`

Display name

`x_label`

`str`

Label for the x-axis

`y_label`

`str`

Label for the y-axis

`x_range`

`tuple[float, float] | None`

Optional fixed range for x-axis

`y_range`

`tuple[float, float] | None`

Optional fixed range for y-axis

`curves`

`list[LinePlotCurve]`

One or more curves to plot

`description`

`str | None`

Optional longer description

Each `LinePlotCurve` contains a `name`, a list of `LinePlotPoint`s (with `x`, `y`), an optional `style` (`'solid'` or `'dashed'`), and an optional `step` interpolation mode (`'start'`, `'middle'`, or `'end'`) for step functions like empirical CDFs.

`LinePlot` is the recommended return type for custom curve-based evaluators -- any evaluator that returns a `LinePlot` will be rendered as a line chart in the Logfire UI without requiring any frontend changes.

### Returning Multiple Analyses

A single report evaluator can return multiple analyses by returning a list:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import ReportEvaluator, ReportEvaluatorContext
from pydantic_evals.reporting.analyses import ReportAnalysis, ScalarResult, TableResult


@dataclass
class ClassificationSummary(ReportEvaluator):
    """Produces both a scalar accuracy and a per-class metrics table."""

    def evaluate(self, ctx: ReportEvaluatorContext) -> list[ReportAnalysis]:
        cases = ctx.report.cases
        if not cases:
            return []

        labels = sorted({str(c.expected_output) for c in cases if c.expected_output})

        # Scalar: overall accuracy
        correct = sum(1 for c in cases if c.output == c.expected_output)
        accuracy = ScalarResult(
            title='Accuracy', value=correct / len(cases) * 100, unit='%'
        )

        # Table: per-class breakdown
        rows = []
        for label in labels:
            tp = sum(1 for c in cases if str(c.output) == label and str(c.expected_output) == label)
            fp = sum(1 for c in cases if str(c.output) == label and str(c.expected_output) != label)
            fn = sum(1 for c in cases if str(c.output) != label and str(c.expected_output) == label)
            p = tp / (tp + fp) if (tp + fp) > 0 else 0.0
            r = tp / (tp + fn) if (tp + fn) > 0 else 0.0
            f1 = 2 * p * r / (p + r) if (p + r) > 0 else 0.0
            rows.append([label, round(p, 3), round(r, 3), round(f1, 3)])

        table = TableResult(
            title='Per-Class Metrics',
            columns=['Class', 'Precision', 'Recall', 'F1'],
            rows=rows,
        )

        return [accuracy, table]
```

### Async Report Evaluators

Report evaluators support async `evaluate` methods, handled automatically via `evaluate_async`:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import ReportEvaluator, ReportEvaluatorContext
from pydantic_evals.reporting.analyses import ScalarResult


@dataclass
class AsyncAccuracy(ReportEvaluator):
    async def evaluate(self, ctx: ReportEvaluatorContext) -> ScalarResult:
        # Can use async I/O here (e.g., call an external API)
        cases = ctx.report.cases
        correct = sum(1 for c in cases if c.output == c.expected_output)
        return ScalarResult(
            title='Accuracy',
            value=correct / len(cases) * 100 if cases else 0.0,
            unit='%',
        )
```

## Serialization

Report evaluators are serialized to and from YAML/JSON dataset files using the same format as case-level evaluators. This means datasets with report evaluators can be fully round-tripped through file serialization.

**Example YAML dataset with report evaluators:**

```yaml
# yaml-language-server: $schema=./test_cases_schema.json
name: classifier_eval
cases:
  - name: cat_test
    inputs: The cat meows
    expected_output: cat
  - name: dog_test
    inputs: The dog barks
    expected_output: dog
report_evaluators:
  - ConfusionMatrixEvaluator
  - PrecisionRecallEvaluator:
      score_key: confidence
      positive_from: assertions
      positive_key: is_correct
```

Built-in report evaluators (`ConfusionMatrixEvaluator`, `PrecisionRecallEvaluator`, `ROCAUCEvaluator`, `KolmogorovSmirnovEvaluator`) are recognized automatically. For custom report evaluators, pass them via `custom_report_evaluator_types`:

```python
from pydantic_evals import Dataset

dataset = Dataset[str, str, None].from_file(
    'test_cases.yaml',
    custom_report_evaluator_types=[MyCustomReportEvaluator],
)
```

Similarly, when saving a dataset with custom report evaluators, pass them to `to_file` so the JSON schema includes them:

```python
dataset.to_file(
    'test_cases.yaml',
    custom_report_evaluator_types=[MyCustomReportEvaluator],
)
```

## Viewing Analyses in Logfire

When [Logfire is configured](/docs/ai/evals/how-to/logfire-integration), analyses are automatically attached to the experiment span as the `logfire.experiment.analyses` attribute. The Logfire UI renders them as interactive visualizations:

-   **Confusion matrices** are displayed as heatmaps
-   **Precision-recall curves** are rendered as line charts with AUC in the legend
-   **Line plots** (ROC curves, KS plots, etc.) are rendered as line charts with configurable axes
-   **Scalar results** are shown as labeled values
-   **Tables** are rendered as formatted data tables

When comparing multiple experiments in the Logfire Evals view, analyses of the same type are displayed side by side for easy comparison.

## Complete Example

A full example combining case-level evaluators with report evaluators:

```python
from dataclasses import dataclass
from typing import Any

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import (
    ConfusionMatrixEvaluator,
    Evaluator,
    EvaluatorContext,
    KolmogorovSmirnovEvaluator,
    PrecisionRecallEvaluator,
    ReportEvaluator,
    ReportEvaluatorContext,
    ROCAUCEvaluator,
)
from pydantic_evals.reporting.analyses import ScalarResult


def my_classifier(text: str) -> str:
    text = text.lower()
    if 'cat' in text or 'meow' in text:
        return 'cat'
    elif 'dog' in text or 'bark' in text:
        return 'dog'
    elif 'bird' in text or 'chirp' in text:
        return 'bird'
    return 'unknown'


# Case-level evaluator: runs per case
@dataclass
class ConfidenceEvaluator(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> dict[str, Any]:
        confidence = compute_confidence(ctx.output, ctx.inputs)
        is_correct = ctx.output == ctx.expected_output
        return {
            'confidence': confidence,
            'is_correct': is_correct,
        }


def compute_confidence(output: str, inputs: str) -> float:
    return 0.85  # placeholder


# Report-level evaluator: runs once over the full report
@dataclass
class AccuracyEvaluator(ReportEvaluator):
    def evaluate(self, ctx: ReportEvaluatorContext) -> ScalarResult:
        cases = ctx.report.cases
        correct = sum(1 for c in cases if c.output == c.expected_output)
        return ScalarResult(
            title='Accuracy',
            value=correct / len(cases) * 100 if cases else 0.0,
            unit='%',
        )


dataset = Dataset(
    name='full_example',
    cases=[
        Case(inputs='The cat meows', expected_output='cat'),
        Case(inputs='The dog barks', expected_output='dog'),
        Case(inputs='A bird chirps', expected_output='bird'),
    ],
    evaluators=[ConfidenceEvaluator()],
    report_evaluators=[
        ConfusionMatrixEvaluator(
            predicted_from='output',
            expected_from='expected_output',
            title='Animal Classification',
        ),
        PrecisionRecallEvaluator(
            score_from='scores',
            score_key='confidence',
            positive_from='assertions',
            positive_key='is_correct',
        ),
        ROCAUCEvaluator(
            score_from='scores',
            score_key='confidence',
            positive_from='assertions',
            positive_key='is_correct',
        ),
        KolmogorovSmirnovEvaluator(
            score_from='scores',
            score_key='confidence',
            positive_from='assertions',
            positive_key='is_correct',
        ),
        AccuracyEvaluator(),
    ],
)

report = dataset.evaluate_sync(my_classifier)

# Access analyses programmatically
for analysis in report.analyses:
    print(f'{analysis.type}: {analysis.title}')
    #> confusion_matrix: Animal Classification
    #> precision_recall: Precision-Recall Curve
    #> scalar: Precision-Recall Curve AUC
    #> line_plot: ROC Curve
    #> scalar: ROC Curve AUC
    #> line_plot: KS Plot
    #> scalar: KS Statistic
    #> scalar: Accuracy
```

## Next Steps

-   **[Built-in Evaluators](/docs/ai/evals/evaluators/built-in)** -- Case-level evaluator reference
-   **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** -- Writing case-level evaluators
-   **[Logfire Integration](/docs/ai/evals/how-to/logfire-integration)** -- Viewing analyses in the Logfire UI

---

# [Span-Based](https://pydantic.dev/docs/ai/evals/evaluators/span-based/)

# Span-Based

Evaluate AI system behavior by analyzing OpenTelemetry spans captured during execution.

Requires Logfire

Span-based evaluation requires `logfire` to be installed and configured:

Terminal

```bash
pip install 'pydantic-evals[logfire]'
```

Span-based evaluation enables you to evaluate **how** your AI system executes, not just **what** it produces. This is essential for complex agents where ensuring the desired behavior depends on the execution path taken, not just the final output.

### Why Span-Based Evaluation?

Traditional evaluators assess task inputs and outputs. For simple tasks, this may be sufficient--if the output is correct, the task succeeded. But for complex multi-step agents, the _process_ matters as much as the result:

-   **A correct answer reached incorrectly** - An agent might produce the right output by accident (e.g., guessing, using cached data when it should have searched, calling the wrong tools but getting lucky)
-   **Verification of required behaviors** - You need to ensure specific tools were called, certain code paths executed, or particular patterns followed
-   **Performance and efficiency** - The agent should reach the answer efficiently, without unnecessary tool calls, infinite loops, or excessive retries
-   **Safety and compliance** - Critical to verify that dangerous operations weren't attempted, sensitive data wasn't accessed inappropriately, or guardrails weren't bypassed

### Real-World Scenarios

Span-based evaluation is particularly valuable for:

-   **RAG systems** - Verify documents were retrieved and reranked before generation, not just that the answer included citations
-   **Multi-agent coordination** - Ensure the orchestrator delegated to the right specialist agents in the correct order
-   **Tool-calling agents** - Confirm specific tools were used (or avoided), and in the expected sequence
-   **Debugging and regression testing** - Catch behavioral regressions where outputs remain correct but the internal logic deteriorates
-   **Production alignment** - Ensure your evaluation assertions operate on the same telemetry data captured in production, so eval insights directly translate to production monitoring

### How It Works

When you configure logfire (`logfire.configure()`), Pydantic Evals captures all OpenTelemetry spans generated during task execution. You can then write evaluators that assert conditions on:

-   **Which tools were called** - `HasMatchingSpan(query={'name_contains': 'search_tool'})`
-   **Code paths executed** - Verify specific functions ran or particular branches taken
-   **Timing characteristics** - Check that operations complete within SLA bounds
-   **Error conditions** - Detect retries, fallbacks, or specific failure modes
-   **Execution structure** - Verify parent-child relationships, delegation patterns, or execution order

This creates a fundamentally different evaluation paradigm: you're testing behavioral contracts, not just input-output relationships.

## Basic Usage

```python
import logfire

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import HasMatchingSpan

# Configure logfire to capture spans
logfire.configure(send_to_logfire='if-token-present')

dataset = Dataset(
    name='span_basic',
    cases=[Case(inputs='test')],
    evaluators=[
        # Check that database was queried
        HasMatchingSpan(
            query={'name_contains': 'database_query'},
            evaluation_name='used_database',
        ),
    ],
)
```

## HasMatchingSpan Evaluator

The [`HasMatchingSpan`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.HasMatchingSpan) evaluator checks if any span matches a query:

```python
from pydantic_evals.evaluators import HasMatchingSpan

HasMatchingSpan(
    query={'name_contains': 'test'},
    evaluation_name='span_check',
)
```

**Returns:** `bool` - `True` if any span matches the query

## SpanQuery Reference

A [`SpanQuery`](/docs/ai/api/pydantic_evals/otel/#pydantic_evals.otel.SpanQuery) is a dictionary with query conditions:

### Name Conditions

Match spans by name:

```python
# Exact name match
{'name_equals': 'search_database'}

# Contains substring
{'name_contains': 'tool_call'}

# Regex pattern
{'name_matches_regex': r'llm_call_\d+'}
```

### Attribute Conditions

Match spans with specific attributes:

```python
# Has specific attribute values
{'has_attributes': {'operation': 'search', 'status': 'success'}}

# Has attribute keys (any value)
{'has_attribute_keys': ['user_id', 'request_id']}
```

### Duration Conditions

Match based on execution time:

```python
from datetime import timedelta

# Minimum duration
{'min_duration': 1.0}  # seconds
{'min_duration': timedelta(seconds=1)}

# Maximum duration
{'max_duration': 5.0}  # seconds
{'max_duration': timedelta(seconds=5)}

# Range
{'min_duration': 0.5, 'max_duration': 2.0}
```

### Logical Operators

Combine conditions:

```python
# NOT
{'not_': {'name_contains': 'error'}}

# AND (all must match)
{'and_': [
    {'name_contains': 'tool'},
    {'max_duration': 1.0},
]}

# OR (any must match)
{'or_': [
    {'name_equals': 'search'},
    {'name_equals': 'query'},
]}
```

### Child/Descendant Conditions

Query relationships between spans:

```python
# Count direct children
{'min_child_count': 1}
{'max_child_count': 5}

# Some child matches query
{'some_child_has': {'name_contains': 'retry'}}

# All children match query
{'all_children_have': {'max_duration': 0.5}}

# No children match query
{'no_child_has': {'has_attributes': {'error': True}}}

# Descendant queries (recursive)
{'min_descendant_count': 5}
{'some_descendant_has': {'name_contains': 'api_call'}}
```

### Ancestor/Depth Conditions

Query span hierarchy:

```python
# Depth (root spans have depth 0)
{'min_depth': 1}  # Not a root span
{'max_depth': 2}  # At most 2 levels deep

# Ancestor queries
{'some_ancestor_has': {'name_equals': 'agent_run'}}
{'all_ancestors_have': {'max_duration': 10.0}}
{'no_ancestor_has': {'has_attributes': {'error': True}}}
```

### Stop Recursing

Control recursive queries:

```python
{
    'some_descendant_has': {'name_contains': 'expensive'},
    'stop_recursing_when': {'name_equals': 'boundary'},
}
# Only search descendants until hitting a span named 'boundary'
```

## Practical Examples

### Verify Tool Usage

Check that specific tools were called:

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import HasMatchingSpan

dataset = Dataset(
    name='tool_verification',
    cases=[Case(inputs='test')],
    evaluators=[
        # Must call search tool
        HasMatchingSpan(
            query={'name_contains': 'search_tool'},
            evaluation_name='used_search',
        ),

        # Must NOT call dangerous tool
        HasMatchingSpan(
            query={'not_': {'name_contains': 'delete_database'}},
            evaluation_name='safe_execution',
        ),
    ],
)
```

### Check Multiple Tools

Verify a sequence of operations:

```python
from pydantic_evals.evaluators import HasMatchingSpan

evaluators = [
    HasMatchingSpan(
        query={'name_contains': 'retrieve_context'},
        evaluation_name='retrieved_context',
    ),
    HasMatchingSpan(
        query={'name_contains': 'generate_response'},
        evaluation_name='generated_response',
    ),
    HasMatchingSpan(
        query={'and_': [
            {'name_contains': 'cite'},
            {'has_attribute_keys': ['source_id']},
        ]},
        evaluation_name='added_citations',
    ),
]
```

### Performance Assertions

Ensure operations meet latency requirements:

```python
from pydantic_evals.evaluators import HasMatchingSpan

evaluators = [
    # Database queries should be fast
    HasMatchingSpan(
        query={'and_': [
            {'name_contains': 'database'},
            {'max_duration': 0.1},  # 100ms max
        ]},
        evaluation_name='fast_db_queries',
    ),

    # Overall should complete quickly
    HasMatchingSpan(
        query={'and_': [
            {'name_equals': 'task_execution'},
            {'max_duration': 2.0},
        ]},
        evaluation_name='within_sla',
    ),
]
```

### Error Detection

Check for error conditions:

```python
from pydantic_evals.evaluators import HasMatchingSpan

evaluators = [
    # No errors occurred
    HasMatchingSpan(
        query={'not_': {'has_attributes': {'error': True}}},
        evaluation_name='no_errors',
    ),

    # Retries happened
    HasMatchingSpan(
        query={'name_contains': 'retry'},
        evaluation_name='had_retries',
    ),

    # Fallback was used
    HasMatchingSpan(
        query={'name_contains': 'fallback_model'},
        evaluation_name='used_fallback',
    ),
]
```

### Complex Behavioral Checks

Verify sophisticated behavior patterns:

```python
from pydantic_evals.evaluators import HasMatchingSpan

evaluators = [
    # Agent delegated to sub-agent
    HasMatchingSpan(
        query={'and_': [
            {'name_contains': 'agent'},
            {'some_child_has': {'name_contains': 'delegate'}},
        ]},
        evaluation_name='used_delegation',
    ),

    # Made multiple LLM calls with retries
    HasMatchingSpan(
        query={'and_': [
            {'name_contains': 'llm_call'},
            {'some_descendant_has': {'name_contains': 'retry'}},
            {'min_descendant_count': 3},
        ]},
        evaluation_name='retry_pattern',
    ),
]
```

## Custom Evaluators with SpanTree

For more complex span analysis, write custom evaluators:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class CustomSpanCheck(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> dict[str, bool | int]:
        span_tree = ctx.span_tree

        # Find specific spans
        llm_spans = span_tree.find(lambda node: 'llm' in node.name)
        tool_spans = span_tree.find(lambda node: 'tool' in node.name)

        # Calculate metrics
        total_llm_time = sum(
            span.duration.total_seconds() for span in llm_spans
        )

        return {
            'used_llm': len(llm_spans) > 0,
            'used_tools': len(tool_spans) > 0,
            'tool_count': len(tool_spans),
            'llm_fast': total_llm_time < 2.0,
        }
```

### SpanTree API

The [`SpanTree`](/docs/ai/api/pydantic_evals/otel/#pydantic_evals.otel.SpanTree) provides methods for span analysis:

```python
from pydantic_evals.otel import SpanTree


# Example API (requires span_tree from context)
def example_api(span_tree: SpanTree) -> None:
    span_tree.find(lambda n: True)  # Find all matching nodes
    span_tree.any({'name_contains': 'test'})  # Check if any span matches
    span_tree.all({'name_contains': 'test'})  # Check if all spans match
    span_tree.count({'name_contains': 'test'})  # Count matching spans

    # Iteration
    for node in span_tree:
        print(node.name, node.duration, node.attributes)
```

### SpanNode Properties

Each [`SpanNode`](/docs/ai/api/pydantic_evals/otel/#pydantic_evals.otel.SpanNode) has:

```python
from pydantic_evals.otel import SpanNode


# Example properties (requires node from context)
def example_properties(node: SpanNode) -> None:
    _ = node.name  # Span name
    _ = node.duration  # timedelta
    _ = node.attributes  # dict[str, AttributeValue]
    _ = node.start_timestamp  # datetime
    _ = node.end_timestamp  # datetime
    _ = node.children  # list[SpanNode]
    _ = node.descendants  # list[SpanNode] (recursive)
    _ = node.ancestors  # list[SpanNode]
    _ = node.parent  # SpanNode | None
```

## Debugging Span Queries

### View Spans in Logfire

If you're sending data to Logfire, you can view all spans in the web UI to understand the trace structure.

### Print Span Tree

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class DebugSpans(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        for node in ctx.span_tree:
            print(f"{'  ' * len(node.ancestors)}{node.name} ({node.duration})")
        return True
```

### Query Testing

Test queries incrementally:

```python
from pydantic_evals.evaluators import HasMatchingSpan

# Start simple
query = {'name_contains': 'tool'}

# Add conditions gradually
query = {'and_': [
    {'name_contains': 'tool'},
    {'max_duration': 1.0},
]}

# Test in evaluator
HasMatchingSpan(query=query, evaluation_name='test')
```

## Use Cases

### RAG System Verification

Verify retrieval-augmented generation workflow:

```python
from pydantic_evals.evaluators import HasMatchingSpan

evaluators = [
    # Retrieved documents
    HasMatchingSpan(
        query={'name_contains': 'vector_search'},
        evaluation_name='retrieved_docs',
    ),

    # Reranked results
    HasMatchingSpan(
        query={'name_contains': 'rerank'},
        evaluation_name='reranked_results',
    ),

    # Generated with context
    HasMatchingSpan(
        query={'and_': [
            {'name_contains': 'generate'},
            {'has_attribute_keys': ['context_ids']},
        ]},
        evaluation_name='used_context',
    ),
]
```

### Multi-Agent Systems

Verify agent coordination:

```python
from pydantic_evals.evaluators import HasMatchingSpan

evaluators = [
    # Master agent ran
    HasMatchingSpan(
        query={'name_equals': 'master_agent'},
        evaluation_name='master_ran',
    ),

    # Delegated to specialist
    HasMatchingSpan(
        query={'and_': [
            {'name_contains': 'specialist_agent'},
            {'some_ancestor_has': {'name_equals': 'master_agent'}},
        ]},
        evaluation_name='delegated_correctly',
    ),

    # No circular delegation
    HasMatchingSpan(
        query={'not_': {'and_': [
            {'name_contains': 'agent'},
            {'some_descendant_has': {'name_contains': 'agent'}},
            {'some_ancestor_has': {'name_contains': 'agent'}},
        ]}},
        evaluation_name='no_circular_delegation',
    ),
]
```

### Tool Usage Patterns

Verify intelligent tool selection:

```python
from pydantic_evals.evaluators import HasMatchingSpan

evaluators = [
    # Used search before answering
    HasMatchingSpan(
        query={'and_': [
            {'name_contains': 'search'},
            {'some_ancestor_has': {'name_contains': 'answer'}},
        ]},
        evaluation_name='searched_before_answering',
    ),

    # Limited tool calls (no loops)
    HasMatchingSpan(
        query={'and_': [
            {'name_contains': 'tool'},
            {'max_child_count': 5},
        ]},
        evaluation_name='reasonable_tool_usage',
    ),
]
```

## Best Practices

1.  **Start Simple**: Begin with basic name queries, add complexity as needed
2.  **Use Descriptive Names**: Name your spans well in your application code
3.  **Test Queries**: Verify queries work before running full evaluations
4.  **Combine with Other Evaluators**: Use span checks alongside output validation
5.  **Document Expectations**: Comment why specific spans should/shouldn't exist

## Next Steps

-   **[Logfire Integration](/docs/ai/evals/how-to/logfire-integration)** - Set up Logfire for span capture
-   **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** - Write advanced span analysis
-   **[Built-in Evaluators](/docs/ai/evals/evaluators/built-in)** - Other evaluator types

---

# [Simple Validation](https://pydantic.dev/docs/ai/evals/examples/simple-validation/)

# Simple Validation

A proof of concept example of evaluating a simple text transformation function with deterministic checks.

## Scenario

We're testing a function that converts text to title case. We want to verify:

-   Output is always a string
-   Output matches expected format
-   Function handles edge cases correctly
-   Performance meets requirements

## Complete Example

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import (
    Contains,
    EqualsExpected,
    IsInstance,
    MaxDuration,
)


# The function we're testing
def to_title_case(text: str) -> str:
    """Convert text to title case."""
    return text.title()


# Create evaluation dataset
dataset = Dataset(
    name='title_case_validation',
    cases=[
        # Basic functionality
        Case(
            name='basic_lowercase',
            inputs='hello world',
            expected_output='Hello World',
        ),
        Case(
            name='basic_uppercase',
            inputs='HELLO WORLD',
            expected_output='Hello World',
        ),
        Case(
            name='mixed_case',
            inputs='HeLLo WoRLd',
            expected_output='Hello World',
        ),

        # Edge cases
        Case(
            name='empty_string',
            inputs='',
            expected_output='',
        ),
        Case(
            name='single_word',
            inputs='hello',
            expected_output='Hello',
        ),
        Case(
            name='with_punctuation',
            inputs='hello, world!',
            expected_output='Hello, World!',
        ),
        Case(
            name='with_numbers',
            inputs='hello 123 world',
            expected_output='Hello 123 World',
        ),
        Case(
            name='apostrophes',
            inputs="don't stop believin'",
            expected_output="Don'T Stop Believin'",
        ),
    ],
    evaluators=[
        # Always returns a string
        IsInstance(type_name='str'),

        # Matches expected output
        EqualsExpected(),

        # Output should contain capital letters
        Contains(value='H', evaluation_name='has_capitals'),

        # Should be fast (under 1ms)
        MaxDuration(seconds=0.001),
    ],
)


# Run evaluation
if __name__ == '__main__':
    report = dataset.evaluate_sync(to_title_case)

    # Print results
    report.print(include_input=True, include_output=True)
"""
                            Evaluation Summary: to_title_case
┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Case ID          ┃ Inputs               ┃ Outputs              ┃ Assertions ┃ Duration ┃
┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩
│ basic_lowercase  │ hello world          │ Hello World          │ ✔✔✔✗       │     10ms │
├──────────────────┼──────────────────────┼──────────────────────┼────────────┼──────────┤
│ basic_uppercase  │ HELLO WORLD          │ Hello World          │ ✔✔✔✗       │     10ms │
├──────────────────┼──────────────────────┼──────────────────────┼────────────┼──────────┤
│ mixed_case       │ HeLLo WoRLd          │ Hello World          │ ✔✔✔✗       │     10ms │
├──────────────────┼──────────────────────┼──────────────────────┼────────────┼──────────┤
│ empty_string     │ -                    │ -                    │ ✔✔✗✗       │     10ms │
├──────────────────┼──────────────────────┼──────────────────────┼────────────┼──────────┤
│ single_word      │ hello                │ Hello                │ ✔✔✔✗       │     10ms │
├──────────────────┼──────────────────────┼──────────────────────┼────────────┼──────────┤
│ with_punctuation │ hello, world!        │ Hello, World!        │ ✔✔✔✗       │     10ms │
├──────────────────┼──────────────────────┼──────────────────────┼────────────┼──────────┤
│ with_numbers     │ hello 123 world      │ Hello 123 World      │ ✔✔✔✗       │     10ms │
├──────────────────┼──────────────────────┼──────────────────────┼────────────┼──────────┤
│ apostrophes      │ don't stop believin' │ Don'T Stop Believin' │ ✔✔✗✗       │     10ms │
├──────────────────┼──────────────────────┼──────────────────────┼────────────┼──────────┤
│ Averages         │                      │                      │ 68.8% ✔    │     10ms │
└──────────────────┴──────────────────────┴──────────────────────┴────────────┴──────────┘
"""
# Check if all passed
avg = report.averages()
if avg and avg.assertions == 1.0:
    print('\n✅ All tests passed!')
else:
    print(f'\n❌ Some tests failed (pass rate: {avg.assertions:.1%})')
    """
    ❌ Some tests failed (pass rate: 68.8%)
    """
```

## Expected Output

```
                        Evaluation Summary: to_title_case
┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Case ID           ┃ Inputs               ┃ Outputs               ┃ Assertions ┃ Duration ┃
┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩
│ basic_lowercase   │ hello world          │ Hello World           │ ✔✔✔✔       │      <1ms│
├───────────────────┼──────────────────────┼───────────────────────┼────────────┼──────────┤
│ basic_uppercase   │ HELLO WORLD          │ Hello World           │ ✔✔✔✔       │      <1ms│
├───────────────────┼──────────────────────┼───────────────────────┼────────────┼──────────┤
│ mixed_case        │ HeLLo WoRLd          │ Hello World           │ ✔✔✔✔       │      <1ms│
├───────────────────┼──────────────────────┼───────────────────────┼────────────┼──────────┤
│ empty_string      │                      │                       │ ✔✔✗✔       │      <1ms│
├───────────────────┼──────────────────────┼───────────────────────┼────────────┼──────────┤
│ single_word       │ hello                │ Hello                 │ ✔✔✔✔       │      <1ms│
├───────────────────┼──────────────────────┼───────────────────────┼────────────┼──────────┤
│ with_punctuation  │ hello, world!        │ Hello, World!         │ ✔✔✔✔       │      <1ms│
├───────────────────┼──────────────────────┼───────────────────────┼────────────┼──────────┤
│ with_numbers      │ hello 123 world      │ Hello 123 World       │ ✔✔✔✔       │      <1ms│
├───────────────────┼──────────────────────┼───────────────────────┼────────────┼──────────┤
│ apostrophes       │ don't stop believin' │ Don'T Stop Believin'  │ ✔✔✔✔       │      <1ms│
├───────────────────┼──────────────────────┼───────────────────────┼────────────┼──────────┤
│ Averages          │                      │                       │ 96.9% ✔    │      <1ms│
└───────────────────┴──────────────────────┴───────────────────────┴────────────┴──────────┘

✅ All tests passed!
```

Note: The `empty_string` case has one failed assertion (`has_capitals`) because an empty string contains no capital letters.

## Saving and Loading

Save the dataset for future use:

```python
from typing import Any

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import EqualsExpected


# The function we're testing
def to_title_case(text: str) -> str:
    """Convert text to title case."""
    return text.title()


# Create dataset
dataset: Dataset[str, str, Any] = Dataset(
    name='title_case_tests',
    cases=[Case(inputs='test', expected_output='Test')],
    evaluators=[EqualsExpected()],
)

# Save to YAML
dataset.to_file('title_case_tests.yaml')

# Load later
dataset = Dataset.from_file('title_case_tests.yaml')
report = dataset.evaluate_sync(to_title_case)
```

## Adding More Cases

As you find bugs or edge cases, add them to the dataset:

```python
from pydantic_evals import Dataset

# Load existing dataset
dataset = Dataset.from_file('title_case_tests.yaml')

# Found a bug with unicode
dataset.add_case(
    name='unicode_chars',
    inputs='café résumé',
    expected_output='Café Résumé',
)

# Found a bug with all caps words
dataset.add_case(
    name='acronyms',
    inputs='the USA and FBI',
    expected_output='The Usa And Fbi',  # Python's title() behavior
)

# Test with very long input
dataset.add_case(
    name='long_input',
    inputs=' '.join(['word'] * 1000),
    expected_output=' '.join(['Word'] * 1000),
)

# Save updated dataset
dataset.to_file('title_case_tests.yaml')
```

## Using with pytest

Integrate with pytest for CI/CD:

```python
import pytest

from pydantic_evals import Dataset


# The function we're testing
def to_title_case(text: str) -> str:
    """Convert text to title case."""
    return text.title()


@pytest.fixture
def title_case_dataset():
    return Dataset.from_file('title_case_tests.yaml')


def test_title_case_evaluation(title_case_dataset):
    """Run evaluation tests."""
    report = title_case_dataset.evaluate_sync(to_title_case)

    # All cases should pass
    avg = report.averages()
    assert avg is not None
    assert avg.assertions == 1.0, f'Some tests failed (pass rate: {avg.assertions:.1%})'


def test_title_case_performance(title_case_dataset):
    """Verify performance."""
    report = title_case_dataset.evaluate_sync(to_title_case)

    # All cases should complete quickly
    for case in report.cases:
        assert case.task_duration < 0.001, f'{case.name} took {case.task_duration}s'
```

## Next Steps

-   **[Built-in Evaluators](/docs/ai/evals/evaluators/built-in)** - Explore all available evaluators
-   **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** - Write your own evaluation logic
-   **[Dataset Management](/docs/ai/evals/how-to/dataset-management)** - Save, load, and manage datasets
-   **[Concurrency & Performance](/docs/ai/evals/how-to/concurrency)** - Optimize evaluation performance

---

# [Core Concepts](https://pydantic.dev/docs/ai/evals/getting-started/core-concepts/)

# Core Concepts

This page explains the key concepts in Pydantic Evals and how they work together.

Pydantic Evals is built around these core concepts:

-   **[`Dataset`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset)** - A static definition containing test cases and evaluators
-   **[`Case`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case)** - A single test scenario with inputs and optional expected outputs
-   **[`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator)** - Logic for scoring or validating individual outputs
-   **[`ReportEvaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.ReportEvaluator)** - Logic for analyzing full experiment results (e.g., confusion matrices, accuracy)
-   **Experiment** - The act of running a task function against all cases in a dataset. (This corresponds to a call to `Dataset.evaluate`.)
-   **[`EvaluationReport`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.EvaluationReport)** - The results from running an experiment

The key distinction is between:

-   **Definition** (`Dataset` with `Case`s, `Evaluator`s, and `ReportEvaluator`s) - what you want to test
-   **Execution** (Experiment) - running your task against those tests
-   **Results** (`EvaluationReport` with case results and experiment-wide analyses) - what happened during the experiment

## Unit Testing Analogy

A helpful way to think about Pydantic Evals:

Unit Testing

Pydantic Evals

Test function

[`Case`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case) + [`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator)

Test suite

[`Dataset`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset)

Running tests (`pytest`)

**Experiment** (`dataset.evaluate(task)`)

Test report

[`EvaluationReport`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.EvaluationReport)

`assert`

Evaluator returning `bool`

**Key Difference**: AI systems are probabilistic, so instead of simple pass/fail, evaluations can have:

-   Quantitative scores (0.0 to 1.0)
-   Qualitative labels ("good", "acceptable", "poor")
-   Pass/fail assertions with explanatory reasons

Just like you can run `pytest` multiple times on the same test suite, you can run multiple experiments on the same dataset to compare different implementations or track changes over time.

## Dataset

A [`Dataset`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset) is a collection of test cases and evaluators that define an evaluation suite.

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import IsInstance

dataset = Dataset(
    name='my_eval_suite',
    cases=[
        Case(inputs='test input', expected_output='test output'),
    ],
    evaluators=[
        IsInstance(type_name='str'),
    ],
)
```

### Key Features

-   **Type-safe**: Generic over `InputsT`, `OutputT`, and `MetadataT` types
-   **Serializable**: Can be saved to/loaded from YAML or JSON files
-   **Evaluable**: Run against any function with matching input/output types

### `Dataset`\-Level vs `Case`\-Level Evaluators

Evaluators can be defined at two levels:

-   **`Dataset`\-level**: Apply to all cases in the dataset
-   **`Case`\-level**: Apply only to specific cases

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import EqualsExpected, IsInstance

dataset = Dataset(
    name='case_level_evaluators',
    cases=[
        Case(
            name='special_case',
            inputs='test',
            expected_output='TEST',
            evaluators=[
                # This evaluator only runs for this case
                EqualsExpected(),
            ],
        ),
    ],
    evaluators=[
        # This evaluator runs for ALL cases
        IsInstance(type_name='str'),
    ],
)
```

## Experiments

An **Experiment** is what happens when you execute a task function against all cases in a dataset. This is the bridge between your static test definition (the Dataset) and your results (the EvaluationReport).

### Running an Experiment

You run an experiment by calling [`evaluate()`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluate) or [`evaluate_sync()`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluate_sync) on a dataset:

```python
from pydantic_evals import Case, Dataset

# Define your dataset (static definition)
dataset = Dataset(
    name='uppercase_experiment',
    cases=[
        Case(inputs='hello', expected_output='HELLO'),
        Case(inputs='world', expected_output='WORLD'),
    ],
)

# Define your task
def uppercase_task(text: str) -> str:
    return text.upper()

# Run the experiment (execution)
report = dataset.evaluate_sync(uppercase_task)
```

### What Happens During an Experiment

When you run an experiment:

1.  **Setup**: The dataset loads all cases, evaluators, and report evaluators
2.  **Execution**: For each case:
    1.  The task function is called with `case.inputs`
    2.  Execution time is measured and OpenTelemetry spans are captured (if `logfire` is configured)
    3.  The outputs of the task function for each case are recorded
3.  **Case Evaluation**: For each case output:
    1.  All dataset-level evaluators are run
    2.  Case-specific evaluators are run (if any)
    3.  Results are collected (scores, assertions, labels)
4.  **Report Evaluation**: If report evaluators are configured, they run over the full set of results to produce experiment-wide analyses (confusion matrices, precision-recall curves, scalar metrics, tables, etc.)
5.  **Reporting**: All results are aggregated into an [`EvaluationReport`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.EvaluationReport), including both per-case results and experiment-wide analyses

### Multiple Experiments from One Dataset

A key feature of Pydantic Evals is that you can run the same dataset against different task implementations:

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import EqualsExpected

dataset = Dataset(
    name='comparison_test',
    cases=[
        Case(inputs='hello', expected_output='HELLO'),
    ],
    evaluators=[EqualsExpected()],
)


# Original implementation
def task_v1(text: str) -> str:
    return text.upper()


# Improved implementation (with exclamation)
def task_v2(text: str) -> str:
    return text.upper() + '!'


# Compare results
report_v1 = dataset.evaluate_sync(task_v1)
report_v2 = dataset.evaluate_sync(task_v2)

avg_v1 = report_v1.averages()
avg_v2 = report_v2.averages()
print(f'V1 pass rate: {avg_v1.assertions if avg_v1 and avg_v1.assertions else 0}')
#> V1 pass rate: 1.0
print(f'V2 pass rate: {avg_v2.assertions if avg_v2 and avg_v2.assertions else 0}')
#> V2 pass rate: 0
```

This allows you to:

-   **Compare implementations** across versions
-   **Track performance** over time
-   **A/B test** different approaches
-   **Validate changes** before deployment

## Case

A [`Case`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case) represents a single test scenario with specific inputs and optional expected outputs.

```python
from pydantic_evals import Case
from pydantic_evals.evaluators import EqualsExpected

case = Case(
    name='test_uppercase',  # Optional, but recommended for reporting
    inputs='hello world',  # Required: inputs to your task
    expected_output='HELLO WORLD',  # Optional: expected output
    metadata={'category': 'basic'},  # Optional: arbitrary metadata
    evaluators=[EqualsExpected()],  # Optional: case-specific evaluators
)
```

### Case Components

#### Inputs

The inputs to pass to the task being evaluated. Can be any type:

```python
from pydantic import BaseModel

from pydantic_evals import Case


class MyInputModel(BaseModel):
    field1: str


# Simple types
Case(inputs='hello')
Case(inputs=42)

# Complex types
Case(inputs={'query': 'What is AI?', 'max_tokens': 100})
Case(inputs=MyInputModel(field1='value'))
```

#### Expected Output

The expected result, used by evaluators like [`EqualsExpected`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EqualsExpected):

```python
from pydantic_evals import Case

Case(
    inputs='2 + 2',
    expected_output='4',
)
```

If no `expected_output` is provided, evaluators that require it (like `EqualsExpected`) will skip that case.

#### Metadata

Arbitrary data that evaluators can access via [`EvaluatorContext`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorContext):

```python
from pydantic_evals import Case

Case(
    inputs='question',
    metadata={
        'difficulty': 'hard',
        'category': 'math',
        'source': 'exam_2024',
    },
)
```

Metadata is useful for:

-   Filtering cases during analysis
-   Providing context to evaluators
-   Organizing test suites

#### Evaluators

Cases can have their own evaluators that only run for that specific case. This is particularly powerful for building comprehensive evaluation suites where different cases have different requirements - if you could write one evaluator rubric that worked perfectly for all cases, you'd just incorporate it into your agent instructions. Case-specific [`LLMJudge`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.LLMJudge) evaluators are especially useful for quickly building maintainable golden datasets by describing what "good" looks like for each scenario. See [Case-specific evaluators](/docs/ai/evals/evaluators/overview#case-specific-evaluators) for a more detailed explanation and examples.

## Evaluator

An [`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator) assesses the output of your task and returns one or more scores, labels, or assertions. Each score, label or assertion can also have an optional string-value reason associated.

### Evaluator Types

Evaluators return different types of results:

Return Type

Purpose

Example

`bool`

**Assertion** - Pass/fail check

`True` → ✔, `False` → ✗

`int` or `float`

**Score** - Numeric quality metric

`0.95`, `87`

`str`

**Label** - Categorical result

`"correct"`, `"hallucination"`

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class ExactMatch(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return ctx.output == ctx.expected_output  # Assertion


@dataclass
class Confidence(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> float:
        # Analyze output and return confidence score
        return 0.95  # Score


@dataclass
class Classifier(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> str:
        if 'error' in ctx.output.lower():
            return 'error'  # Label
        return 'success'
```

Evaluators can also return instances of [`EvaluationReason`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluationReason), and dictionaries mapping labels to output values. See the [custom evaluator return types](/docs/ai/evals/evaluators/custom#return-types) docs for more detail.

### EvaluatorContext

All evaluators receive an [`EvaluatorContext`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorContext) containing:

-   `name`: Case name (optional)
-   `inputs`: Task inputs
-   `metadata`: Case metadata (optional)
-   `expected_output`: Expected output (optional)
-   `output`: Actual output from task
-   `duration`: Task execution time in seconds
-   `span_tree`: OpenTelemetry spans (if `logfire` is configured)
-   `attributes`: Custom attributes dict
-   `metrics`: Custom metrics dict

### Multiple Evaluations

Evaluators can return multiple results by returning a dictionary:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class MultiCheck(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> dict[str, bool | float | str]:
        return {
            'is_valid': isinstance(ctx.output, str),  # Assertion
            'length': len(ctx.output),  # Metric
            'category': 'long' if len(ctx.output) > 100 else 'short',  # Label
        }
```

### Evaluation Reasons

Add explanations to your evaluations using [`EvaluationReason`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluationReason):

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import EvaluationReason, Evaluator, EvaluatorContext


@dataclass
class SmartCheck(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> EvaluationReason:
        if ctx.output == ctx.expected_output:
            return EvaluationReason(
                value=True,
                reason='Exact match with expected output',
            )
        return EvaluationReason(
            value=False,
            reason=f'Expected {ctx.expected_output!r}, got {ctx.output!r}',
        )
```

Reasons appear in reports when using `include_reasons=True`.

## Evaluation Report

An [`EvaluationReport`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.EvaluationReport) is the result of running an experiment. It contains all the data from executing your task against the dataset's cases and running all evaluators.

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import EqualsExpected

dataset = Dataset(
    name='report_example',
    cases=[Case(inputs='hello', expected_output='HELLO')],
    evaluators=[EqualsExpected()],
)


def my_task(text: str) -> str:
    return text.upper()


# Run an experiment
report = dataset.evaluate_sync(my_task)

# Print to console
report.print()
"""
    Evaluation Summary: my_task
┏━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Case ID  ┃ Assertions ┃ Duration ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩
│ Case 1   │ ✔          │     10ms │
├──────────┼────────────┼──────────┤
│ Averages │ 100.0% ✔   │     10ms │
└──────────┴────────────┴──────────┘
"""

# Access data programmatically
for case in report.cases:
    print(f'{case.name}: {case.scores}')
    #> Case 1: {}
```

### Report Structure

The [`EvaluationReport`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.EvaluationReport) contains:

-   `name`: Experiment name
-   `cases`: List of successful case evaluations
-   `failures`: List of failed executions
-   `analyses`: List of experiment-wide analyses from [report evaluators](/docs/ai/evals/evaluators/report-evaluators) (confusion matrices, PR curves, scalars, tables)
-   `trace_id`: OpenTelemetry trace ID (optional)
-   `span_id`: OpenTelemetry span ID (optional)

### ReportCase

Each successfulcase result contains:

**Case data:**

-   `name`: Case name
-   `inputs`: Task inputs
-   `metadata`: Case metadata (optional)
-   `expected_output`: Expected output (optional)
-   `output`: Actual output from task

**Evaluation results:**

-   `scores`: Dictionary of numeric scores from evaluators
-   `labels`: Dictionary of categorical labels from evaluators
-   `assertions`: Dictionary of pass/fail assertions from evaluators

**Performance data:**

-   `task_duration`: Task execution time
-   `total_duration`: Total time including evaluators

**Additional data:**

-   `metrics`: Custom metrics dict
-   `attributes`: Custom attributes dict

**Tracing:**

-   `trace_id`: OpenTelemetry trace ID (optional)
-   `span_id`: OpenTelemetry span ID (optional)

**Errors:**

-   `evaluator_failures`: List of evaluator errors

## Data Model Relationships

Here's how the core concepts relate to each other:

### Static Definition

-   A **Dataset** contains:
    -   Many **Cases** (test scenarios with inputs and expected outputs)
    -   Many **Evaluators** (logic for scoring individual outputs)
    -   Many **Report Evaluators** (logic for analyzing full experiment results)

### Execution (Experiment)

When you call `dataset.evaluate(task)`, an **Experiment** runs:

-   The **Task** function is executed against all **Cases** in the **Dataset**
-   All **Evaluators** are run (both dataset-level and case-specific) against each output as appropriate
-   One **EvaluationReport** is produced as the final output

### Results

-   An **EvaluationReport** contains:
    -   Results for each **Case** (inputs, outputs, scores, assertions, labels)
    -   Experiment-wide **Analyses** from report evaluators (confusion matrices, PR curves, scalars, tables)
    -   Summary statistics (averages, pass rates)
    -   Performance data (durations)
    -   Tracing information (OpenTelemetry spans)

### Key Relationships

-   **One Dataset → Many Experiments**: You can run the same dataset against different task implementations or multiple times to track changes
-   **One Experiment → One Report**: Each time you call `dataset.evaluate(...)`, you get one report
-   **One Experiment → Many Case Results**: The report contains results for every case in the dataset

## Next Steps

-   **[Evaluators Overview](/docs/ai/evals/evaluators/overview)** - When to use different evaluator types
-   **[Built-in Evaluators](/docs/ai/evals/evaluators/built-in)** - Complete reference of provided evaluators
-   **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** - Write your own evaluation logic
-   **[Report Evaluators](/docs/ai/evals/evaluators/report-evaluators)** - Experiment-wide analyses (confusion matrices, PR curves, etc.)
-   **[Dataset Management](/docs/ai/evals/how-to/dataset-management)** - Save, load, and generate datasets

---

# [Quick Start](https://pydantic.dev/docs/ai/evals/getting-started/quick-start/)

# Quick Start

**Pydantic Evals** is a powerful evaluation framework for systematically testing and evaluating AI systems, from simple LLM calls to complex multi-agent applications.

## What is Pydantic Evals?

Pydantic Evals helps you:

-   **Create test datasets** with type-safe structured inputs and expected outputs
-   **Run evaluations** against your AI systems with automatic concurrency
-   **Score results** using deterministic checks, LLM judges, or custom evaluators
-   **Generate reports** with detailed metrics, assertions, and performance data
-   **Track changes** by comparing evaluation runs over time
-   **Integrate with Logfire** for visualization and collaborative analysis

## Installation

Terminal

```bash
pip install pydantic-evals
```

For OpenTelemetry tracing and Logfire integration:

Terminal

```bash
pip install 'pydantic-evals[logfire]'
```

## Quick Start

While evaluations are typically used to test AI systems, the Pydantic Evals framework works with any function call. To demonstrate the core functionality, we'll start with a simple, deterministic example.

Here's a complete example of evaluating a simple text transformation function:

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import Contains, EqualsExpected

# Create a dataset with test cases
dataset = Dataset(
    name='uppercase_tests',
    cases=[
        Case(
            name='uppercase_basic',
            inputs='hello world',
            expected_output='HELLO WORLD',
        ),
        Case(
            name='uppercase_with_numbers',
            inputs='hello 123',
            expected_output='HELLO 123',
        ),
    ],
    evaluators=[
        EqualsExpected(),  # Check exact match with expected_output
        Contains(value='HELLO', case_sensitive=True),  # Check contains "HELLO"
    ],
)


# Define the function to evaluate
def uppercase_text(text: str) -> str:
    return text.upper()


# Run the evaluation
report = dataset.evaluate_sync(uppercase_text)

# Print the results
report.print()
"""
        Evaluation Summary: uppercase_text
┏━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Case ID                ┃ Assertions ┃ Duration ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩
│ uppercase_basic        │ ✔✔         │     10ms │
├────────────────────────┼────────────┼──────────┤
│ uppercase_with_numbers │ ✔✔         │     10ms │
├────────────────────────┼────────────┼──────────┤
│ Averages               │ 100.0% ✔   │     10ms │
└────────────────────────┴────────────┴──────────┘
"""
```

Output:

```
                  Evaluation Summary: uppercase_text
┏━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Case ID                 ┃ Assertions ┃ Duration ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩
│ uppercase_basic         │ ✔✔         │     10ms │
├─────────────────────────┼────────────┼──────────┤
│ uppercase_with_numbers  │ ✔✔         │     10ms │
├─────────────────────────┼────────────┼──────────┤
│ Averages                │ 100.0% ✔   │     10ms │
└─────────────────────────┴────────────┴──────────┘
```

## Key Concepts

Understanding a few core concepts will help you get the most out of Pydantic Evals:

-   **[`Dataset`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset)** - A collection of test cases and (optional) evaluators
-   **[`Case`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case)** - A single test scenario with inputs and optional expected outputs and case-specific evaluators
-   **[`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator)** - A function that scores or validates task outputs
-   **[`EvaluationReport`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.EvaluationReport)** - Results from running an evaluation

For a deeper dive, see [Core Concepts](/docs/ai/evals/core-concepts).

## Common Use Cases

### Deterministic Validation

Test that your AI system produces correctly-structured outputs:

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import Contains, IsInstance

dataset = Dataset(
    name='dict_validation',
    cases=[
        Case(inputs={'data': 'required_key present'}, expected_output={'result': 'success'}),
    ],
    evaluators=[
        IsInstance(type_name='dict'),
        Contains(value='required_key'),
    ],
)
```

### LLM-as-a-Judge Evaluation

Use an LLM to evaluate subjective qualities like accuracy or helpfulness:

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import LLMJudge

dataset = Dataset(
    name='llm_judge_test',
    cases=[
        Case(inputs='What is the capital of France?', expected_output='Paris'),
    ],
    evaluators=[
        LLMJudge(
            rubric='Response is accurate and helpful',
            include_input=True,
            model='anthropic:claude-sonnet-4-6',
        )
    ],
)
```

### Performance Testing

Ensure your system meets performance requirements:

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import MaxDuration

dataset = Dataset(
    name='performance_test',
    cases=[
        Case(inputs='test input', expected_output='test output'),
    ],
    evaluators=[
        MaxDuration(seconds=2.0),
    ],
)
```

## Next Steps

Explore the documentation to learn more:

-   **[Core Concepts](/docs/ai/evals/core-concepts)** - Understand the data model and evaluation flow
-   **[Built-in Evaluators](/docs/ai/evals/evaluators/built-in)** - Learn about all available evaluators
-   **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** - Write your own evaluation logic
-   **[Dataset Management](/docs/ai/evals/how-to/dataset-management)** - Save, load, and generate datasets
-   **[Examples](/docs/ai/evals/examples/simple-validation)** - Practical examples for common scenarios

---

# [Concurrency & Performance](https://pydantic.dev/docs/ai/evals/how-to/concurrency/)

# Concurrency & Performance

Control how evaluation cases are executed in parallel.

By default, Pydantic Evals runs all cases concurrently to maximize throughput. You can control this behavior using the `max_concurrency` parameter.

## Basic Usage

```python
from pydantic_evals import Case, Dataset


def my_task(inputs: str) -> str:
    return f'Result: {inputs}'


dataset = Dataset(name='concurrency_demo', cases=[Case(inputs='test1'), Case(inputs='test2')])

# Run all cases concurrently (default)
report = dataset.evaluate_sync(my_task)

# Limit to 5 concurrent cases
report = dataset.evaluate_sync(my_task, max_concurrency=5)

# Run sequentially (one at a time)
report = dataset.evaluate_sync(my_task, max_concurrency=1)
```

## When to Limit Concurrency

### Rate Limiting

Many APIs have rate limits that restrict concurrent requests:

```python
from pydantic_evals import Case, Dataset


async def my_llm_task(inputs: str) -> str:
    return f'LLM Result: {inputs}'


dataset = Dataset(name='rate_limit_demo', cases=[Case(inputs='test1')])

# If your API allows 10 requests/second
report = dataset.evaluate_sync(
    my_llm_task,
    max_concurrency=10,
)
```

### Resource Constraints

Limit concurrency to avoid overwhelming system resources:

```python
from pydantic_evals import Case, Dataset


def heavy_computation(inputs: str) -> str:
    return f'Heavy: {inputs}'


def db_query_task(inputs: str) -> str:
    return f'DB: {inputs}'


dataset = Dataset(name='resource_constraints', cases=[Case(inputs='test1')])

# Memory-intensive operations
report = dataset.evaluate_sync(
    heavy_computation,
    max_concurrency=2,  # Only 2 at a time
)

# Database connection pool limits
report = dataset.evaluate_sync(
    db_query_task,
    max_concurrency=5,  # Match connection pool size
)
```

### Debugging

Run sequentially to see clear error traces:

```python
from pydantic_evals import Case, Dataset


def my_task(inputs: str) -> str:
    return f'Result: {inputs}'


dataset = Dataset(name='debug_demo', cases=[Case(inputs='test1')])

# Easier to debug
report = dataset.evaluate_sync(
    my_task,
    max_concurrency=1,
)
```

## Performance Comparison

Here's an example showing the performance difference:

concurrency\_example.py

```python
import asyncio

from pydantic_evals import Case, Dataset

# Create a dataset with multiple test cases
dataset = Dataset(
    name='performance_comparison',
    cases=[
        Case(
            name=f'case_{i}',
            inputs=i,
            expected_output=i * 2,
        )
        for i in range(10)
    ]
)


async def slow_task(input_value: int) -> int:
    """Simulates a slow operation (e.g., API call)."""
    await asyncio.sleep(0.1)  # 100ms per case
    return input_value * 2


# Unlimited concurrency: ~0.1s total (all cases run in parallel)
report = dataset.evaluate_sync(slow_task)

# Limited concurrency: ~0.5s total (2 at a time, 5 batches)
report = dataset.evaluate_sync(slow_task, max_concurrency=2)

# Sequential: ~1.0s total (one at a time, 10 cases)
report = dataset.evaluate_sync(slow_task, max_concurrency=1)
```

## Concurrency with Evaluators

Both task execution and evaluator execution happen concurrently by default:

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import LLMJudge


def my_task(inputs: str) -> str:
    return f'Result: {inputs}'


dataset = Dataset(
    name='evaluator_concurrency',
    cases=[Case(inputs=f'test{i}') for i in range(100)],  # 100 cases
    evaluators=[
        LLMJudge(rubric='Quality check'),  # Makes API calls
    ],
)

# Both task and evaluator run with controlled concurrency
report = dataset.evaluate_sync(
    my_task,
    max_concurrency=10,
)
```

If your evaluators are expensive (e.g., [`LLMJudge`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.LLMJudge)), limiting concurrency helps manage:

-   API rate limits
-   Cost (fewer concurrent API calls)
-   Memory usage

## Async vs Sync

Both sync and async evaluation support concurrency control:

### Sync API

```python
from pydantic_evals import Case, Dataset


def my_task(inputs: str) -> str:
    return f'Result: {inputs}'


dataset = Dataset(name='sync_demo', cases=[Case(inputs='test1')])

# Runs async operations internally with controlled concurrency
report = dataset.evaluate_sync(my_task, max_concurrency=10)
```

### Async API

```python
from pydantic_evals import Case, Dataset


async def my_task(inputs: str) -> str:
    return f'Result: {inputs}'


async def run_evaluation():
    dataset = Dataset(name='async_demo', cases=[Case(inputs='test1')])
    # Same behavior, but in async context
    report = await dataset.evaluate(my_task, max_concurrency=10)
    return report
```

## Monitoring Concurrency

Track execution to optimize settings:

```python
import time

from pydantic_evals import Case, Dataset


def task(inputs: str) -> str:
    return f'Result: {inputs}'


dataset = Dataset(name='monitoring', cases=[Case(inputs=f'test{i}') for i in range(10)])

t0 = time.time()
report = dataset.evaluate_sync(task, max_concurrency=10)
duration = time.time() - t0

num_cases = len(report.cases) + len(report.failures)
avg_duration = duration / num_cases

print(f'Total: {duration:.2f}s')
#> Total: 0.01s
print(f'Cases: {num_cases}')
#> Cases: 10
print(f'Avg per case: {avg_duration:.2f}s')
#> Avg per case: 0.00s
print(f'Effective concurrency: ~{num_cases * avg_duration / duration:.1f}')
#> Effective concurrency: ~1.0
```

## Handling Rate Limits

If you hit rate limits, the evaluation will fail. Use retry strategies:

```python
from pydantic_evals import Case, Dataset


def task(inputs: str) -> str:
    return f'Result: {inputs}'


dataset = Dataset(name='rate_limit_handling', cases=[Case(inputs='test1')])

# Reduce concurrency to avoid rate limits
report = dataset.evaluate_sync(
    task,
    max_concurrency=5,  # Stay under rate limit
)
```

See [Retry Strategies](/docs/ai/evals/how-to/retry-strategies) for handling transient failures.

## Next Steps

-   **[Retry Strategies](/docs/ai/evals/how-to/retry-strategies)** - Handle transient failures
-   **[Dataset Management](/docs/ai/evals/how-to/dataset-management)** - Work with large datasets
-   **[Logfire Integration](/docs/ai/evals/how-to/logfire-integration)** - Monitor performance

---

# [Dataset Management](https://pydantic.dev/docs/ai/evals/how-to/dataset-management/)

# Dataset Management

Create, save, load, and generate evaluation datasets.

## Creating Datasets

### From Code

Define datasets directly in Python:

```python
from typing import Any

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import EqualsExpected, IsInstance

dataset = Dataset[str, str, Any](
    name='my_eval_suite',
    cases=[
        Case(
            name='test_1',
            inputs='input 1',
            expected_output='output 1',
        ),
        Case(
            name='test_2',
            inputs='input 2',
            expected_output='output 2',
        ),
    ],
    evaluators=[
        IsInstance(type_name='str'),
        EqualsExpected(),
    ],
)
```

### Adding Cases Dynamically

```python
from typing import Any

from pydantic_evals import Dataset
from pydantic_evals.evaluators import IsInstance

dataset = Dataset[str, str, Any](name='dynamic_dataset', cases=[], evaluators=[])

# Add cases one at a time
dataset.add_case(
    name='dynamic_case',
    inputs='test input',
    expected_output='test output',
)

# Add evaluators
dataset.add_evaluator(IsInstance(type_name='str'))
```

## Saving Datasets

Detailed Serialization Guide

For complete details on serialization formats, JSON schema generation, and custom evaluators, see [Dataset Serialization](/docs/ai/evals/how-to/dataset-serialization).

### Save to YAML

```python
from typing import Any

from pydantic_evals import Case, Dataset

dataset = Dataset[str, str, Any](name='my_eval_suite', cases=[Case(name='test', inputs='example')])
dataset.to_file('my_dataset.yaml')

# Also saves schema file: my_dataset_schema.json
```

Output (`my_dataset.yaml`):

```yaml
# yaml-language-server: $schema=my_dataset_schema.json
name: my_eval_suite
cases:
- name: test_1
  inputs: input 1
  expected_output: output 1
  evaluators:
  - EqualsExpected
- name: test_2
  inputs: input 2
  expected_output: output 2
  evaluators:
  - EqualsExpected
evaluators:
- IsInstance: str
```

### Save to JSON

```python
from typing import Any

from pydantic_evals import Case, Dataset

dataset = Dataset[str, str, Any](name='my_eval_suite', cases=[Case(name='test', inputs='example')])
dataset.to_file('my_dataset.json')

# Also saves schema file: my_dataset_schema.json
```

### Custom Schema Path

```python
from pathlib import Path
from typing import Any

from pydantic_evals import Case, Dataset

dataset = Dataset[str, str, Any](name='my_eval_suite', cases=[Case(name='test', inputs='example')])

# Custom schema location
Path('data').mkdir(exist_ok=True)
Path('data/schemas').mkdir(parents=True, exist_ok=True)
dataset.to_file(
    'data/my_dataset.yaml',
    schema_path='schemas/my_schema.json',
)

# No schema file
dataset.to_file('my_dataset.yaml', schema_path=None)
```

## Loading Datasets

### From YAML/JSON

```python
from typing import Any

from pydantic_evals import Dataset

# Infers format from extension
dataset = Dataset[str, str, Any].from_file('my_dataset.yaml')
dataset = Dataset[str, str, Any].from_file('my_dataset.json')

# Explicit format for non-standard extensions
dataset = Dataset[str, str, Any].from_file('data.txt', fmt='yaml')
```

### From String

```python
from typing import Any

from pydantic_evals import Dataset

yaml_content = """
name: my_tests
cases:
- name: test
  inputs: hello
  expected_output: HELLO
evaluators:
- EqualsExpected
"""

dataset = Dataset[str, str, Any].from_text(yaml_content, fmt='yaml')
```

### From Dict

```python
from typing import Any

from pydantic_evals import Dataset

data = {
    'name': 'my_tests',
    'cases': [
        {
            'name': 'test',
            'inputs': 'hello',
            'expected_output': 'HELLO',
        },
    ],
    'evaluators': [{'EqualsExpected': {}}],
}

dataset = Dataset[str, str, Any].from_dict(data)
```

### With Custom Evaluators

When loading datasets that use custom evaluators, you must pass them to `from_file()`:

```python
from dataclasses import dataclass
from typing import Any

from pydantic_evals import Dataset
from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class MyCustomEvaluator(Evaluator):
    threshold: float = 0.5

    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return True


# Load with custom evaluator registry
dataset = Dataset[str, str, Any].from_file(
    'my_dataset.yaml',
    custom_evaluator_types=[MyCustomEvaluator],
)
```

For complete details on serialization with custom evaluators, see [Dataset Serialization](/docs/ai/evals/how-to/dataset-serialization).

## Generating Datasets

Pydantic Evals allows you to generate test datasets using LLMs with [`generate_dataset`](/docs/ai/api/pydantic_evals/generation/#pydantic_evals.generation.generate_dataset).

Datasets can be generated in either JSON or YAML format, in both cases a JSON schema file is generated alongside the dataset and referenced in the dataset, so you should get type checking and auto-completion in your editor.

generate\_dataset\_example.py

```python
from __future__ import annotations

from pathlib import Path

from pydantic import BaseModel, Field

from pydantic_evals import Dataset
from pydantic_evals.generation import generate_dataset


class QuestionInputs(BaseModel, use_attribute_docstrings=True):  # (1)
  """Model for question inputs."""

  question: str
  """A question to answer"""
  context: str | None = None
  """Optional context for the question"""


class AnswerOutput(BaseModel, use_attribute_docstrings=True):  # (2)
  """Model for expected answer outputs."""

  answer: str
  """The answer to the question"""
  confidence: float = Field(ge=0, le=1)
  """Confidence level (0-1)"""


class MetadataType(BaseModel, use_attribute_docstrings=True):  # (3)
  """Metadata model for test cases."""

  difficulty: str
  """Difficulty level (easy, medium, hard)"""
  category: str
  """Question category"""


async def main():
  dataset = await generate_dataset(  # (4)
      dataset_type=Dataset[QuestionInputs, AnswerOutput, MetadataType],
      n_examples=2,
      extra_instructions="""
      Generate question-answer pairs about world capitals and landmarks.
      Make sure to include both easy and challenging questions.
      """,
  )
  output_file = Path('questions_cases.yaml')
  dataset.to_file(output_file)  # (5)
  print(output_file.read_text(encoding='utf-8'))
  """
  # yaml-language-server: $schema=questions_cases_schema.json
  name: generated
  cases:
  - name: Easy Capital Question
    inputs:
      question: What is the capital of France?
      context: null
    metadata:
      difficulty: easy
      category: Geography
    expected_output:
      answer: Paris
      confidence: 0.95
    evaluators:
    - EqualsExpected
  - name: Challenging Landmark Question
    inputs:
      question: Which world-famous landmark is located on the banks of the Seine River?
      context: null
    metadata:
      difficulty: hard
      category: Landmarks
    expected_output:
      answer: Eiffel Tower
      confidence: 0.9
    evaluators:
    - EqualsExpected
  evaluators: []
  report_evaluators: []
  """
```

Define the schema for the inputs to the task.

Define the schema for the expected outputs of the task.

Define the schema for the metadata of the test cases.

Call [`generate_dataset`](/docs/ai/api/pydantic_evals/generation/#pydantic_evals.generation.generate_dataset) to create a [`Dataset`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset) with 2 cases confirming to the schema.

Save the dataset to a YAML file, this will also write `questions_cases_schema.json` with the schema JSON schema for `questions_cases.yaml` to make editing easier. The magic `yaml-language-server` comment is supported by at least vscode, jetbrains/pycharm (more details [here](https://github.com/redhat-developer/yaml-language-server#using-inlined-schema)).

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main(answer))` to run `main`)_

You can also write datasets as JSON files:

generate\_dataset\_example\_json.py

```python
from pathlib import Path

from pydantic_evals import Dataset
from pydantic_evals.generation import generate_dataset

from generate_dataset_example import AnswerOutput, MetadataType, QuestionInputs


async def main():
  dataset = await generate_dataset(  # (1)
      dataset_type=Dataset[QuestionInputs, AnswerOutput, MetadataType],
      n_examples=2,
      extra_instructions="""
      Generate question-answer pairs about world capitals and landmarks.
      Make sure to include both easy and challenging questions.
      """,
  )
  output_file = Path('questions_cases.json')
  dataset.to_file(output_file)  # (2)
  print(output_file.read_text(encoding='utf-8'))
  """
  {
    "$schema": "questions_cases_schema.json",
    "name": "generated",
    "cases": [
      {
        "name": "Easy Capital Question",
        "inputs": {
          "question": "What is the capital of France?",
          "context": null
        },
        "metadata": {
          "difficulty": "easy",
          "category": "Geography"
        },
        "expected_output": {
          "answer": "Paris",
          "confidence": 0.95
        },
        "evaluators": [
          "EqualsExpected"
        ]
      },
      {
        "name": "Challenging Landmark Question",
        "inputs": {
          "question": "Which world-famous landmark is located on the banks of the Seine River?",
          "context": null
        },
        "metadata": {
          "difficulty": "hard",
          "category": "Landmarks"
        },
        "expected_output": {
          "answer": "Eiffel Tower",
          "confidence": 0.9
        },
        "evaluators": [
          "EqualsExpected"
        ]
      }
    ],
    "evaluators": [],
    "report_evaluators": []
  }
  """
```

Generate the [`Dataset`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset) exactly as above.

Save the dataset to a JSON file, this will also write `questions_cases_schema.json` with th JSON schema for `questions_cases.json`. This time the `$schema` key is included in the JSON file to define the schema for IDEs to use while you edit the file, there's no formal spec for this, but it works in vscode and pycharm and is discussed at length in [json-schema-org/json-schema-spec#828](https://github.com/json-schema-org/json-schema-spec/issues/828).

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main(answer))` to run `main`)_

## Type-Safe Datasets

Use generic type parameters for type safety:

```python
from typing_extensions import TypedDict

from pydantic_evals import Case, Dataset


class MyInput(TypedDict):
    query: str
    max_results: int


class MyOutput(TypedDict):
    results: list[str]


class MyMetadata(TypedDict):
    category: str


# Type-safe dataset
dataset: Dataset[MyInput, MyOutput, MyMetadata] = Dataset(
    name='typed_dataset',
    cases=[
        Case(
            name='test',
            inputs={'query': 'test', 'max_results': 10},
            expected_output={'results': ['a', 'b']},
            metadata={'category': 'search'},
        ),
    ],
)
```

## Schema Generation

Generate JSON Schema for IDE support:

```python
from typing import Any

from pydantic_evals import Case, Dataset

dataset = Dataset[str, str, Any](name='my_eval_suite', cases=[Case(name='test', inputs='example')])

# Save with schema
dataset.to_file('my_dataset.yaml')  # Creates my_dataset_schema.json

# Schema enables:
# - Autocomplete in VS Code/PyCharm
# - Validation while editing
# - Inline documentation
```

Manual schema generation:

```python
import json
from dataclasses import dataclass
from typing import Any

from pydantic_evals import Dataset
from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class MyCustomEvaluator(Evaluator):
    threshold: float = 0.5

    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return True


schema = Dataset[str, str, Any].model_json_schema_with_evaluators(
    custom_evaluator_types=[MyCustomEvaluator],
)
print(json.dumps(schema, indent=2)[:66] + '...')
"""
{
  "$defs": {
    "Case": {
      "additionalProperties": false,
...
"""
```

## Best Practices

### 1\. Use Clear Names

```python
from pydantic_evals import Case

# Good
Case(name='uppercase_basic_ascii', inputs='hello')
Case(name='uppercase_unicode_emoji', inputs='hello 😀')
Case(name='uppercase_empty_string', inputs='')

# Bad
Case(name='test1', inputs='hello')
Case(name='test2', inputs='world')
Case(name='test3', inputs='foo')
```

### 2\. Organize by Difficulty

```python
from pydantic_evals import Case, Dataset

dataset = Dataset(
    name='organized_by_difficulty',
    cases=[
        Case(name='easy_1', inputs='test', metadata={'difficulty': 'easy'}),
        Case(name='easy_2', inputs='test2', metadata={'difficulty': 'easy'}),
        Case(name='medium_1', inputs='test3', metadata={'difficulty': 'medium'}),
        Case(name='hard_1', inputs='test4', metadata={'difficulty': 'hard'}),
    ],
)
```

### 3\. Start Small, Grow Gradually

```python
from pydantic_evals import Case, Dataset

# Start with representative cases
dataset = Dataset(
    name='starting_small',
    cases=[
        Case(name='happy_path', inputs='test'),
        Case(name='edge_case', inputs=''),
        Case(name='error_case', inputs='invalid'),
    ],
)

# Add more as you find issues
dataset.add_case(name='newly_discovered_edge_case', inputs='edge')
```

### 4\. Use Case-specific Evaluators Where Appropriate

Case-specific evaluators let different cases have different evaluation criteria, which is essential for comprehensive "test coverage". Rather than trying to write one-size-fits-all evaluators, you can specify exactly what "good" looks like for each scenario. This is particularly powerful with [`LLMJudge`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.LLMJudge) evaluators where you can describe nuanced requirements per case, making it easy to build and maintain golden datasets. See [Case-specific evaluators](/docs/ai/evals/evaluators/overview#case-specific-evaluators) for detailed guidance.

### 5\. Separate Datasets by Purpose

```python
from typing import Any

from pydantic_evals import Case, Dataset

# First create some test datasets
for name in ['smoke_tests', 'comprehensive_tests', 'regression_tests']:
    test_dataset = Dataset[str, Any, Any](name=name, cases=[Case(name='test', inputs='example')])
    test_dataset.to_file(f'{name}.yaml')

# Smoke tests (fast, critical paths)
smoke_tests = Dataset[str, Any, Any].from_file('smoke_tests.yaml')

# Comprehensive tests (slow, thorough)
comprehensive = Dataset[str, Any, Any].from_file('comprehensive_tests.yaml')

# Regression tests (specific bugs)
regression = Dataset[str, Any, Any].from_file('regression_tests.yaml')
```

## Next Steps

-   **[Dataset Serialization](/docs/ai/evals/how-to/dataset-serialization)** - In-depth guide to saving and loading datasets
-   **[Generating Datasets](#generating-datasets)** - Use LLMs to generate test cases
-   **[Examples: Simple Validation](/docs/ai/evals/examples/simple-validation)** - Practical examples

---

# [Dataset Serialization](https://pydantic.dev/docs/ai/evals/how-to/dataset-serialization/)

# Dataset Serialization

Learn how to save and load datasets in different formats, with support for custom evaluators and IDE integration.

Pydantic Evals supports serializing datasets to files in two formats:

-   **YAML** (`.yaml`, `.yml`) - Human-readable, great for version control
-   **JSON** (`.json`) - Structured, machine-readable

Both formats support:

-   Automatic JSON schema generation for IDE autocomplete and validation
-   Custom evaluator serialization/deserialization
-   Type-safe loading with generic parameters

## YAML Format

YAML is the recommended format for most use cases due to its readability and compact syntax.

### Basic Example

```python
from typing import Any

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import EqualsExpected, IsInstance

# Create a dataset with typed parameters
dataset = Dataset[str, str, Any](
    name='my_tests',
    cases=[
        Case(
            name='test_1',
            inputs='hello',
            expected_output='HELLO',
        ),
    ],
    evaluators=[
        IsInstance(type_name='str'),
        EqualsExpected(),
    ],
)

# Save to YAML
dataset.to_file('my_tests.yaml')
```

This creates two files:

1.  **`my_tests.yaml`** - The dataset
2.  **`my_tests_schema.json`** - JSON schema for IDE support

### YAML Output

```yaml
# yaml-language-server: $schema=my_tests_schema.json
name: my_tests
cases:
- name: test_1
  inputs: hello
  expected_output: HELLO
evaluators:
- IsInstance: str
- EqualsExpected
```

### JSON Schema for IDEs

The first line references the schema file:

```yaml
# yaml-language-server: $schema=my_tests_schema.json
```

This enables:

-   ✅ **Autocomplete** in VS Code, PyCharm, and other editors
-   ✅ **Inline validation** while editing
-   ✅ **Documentation tooltips** for fields
-   ✅ **Error highlighting** for invalid data

Editor Support

The `yaml-language-server` comment is supported by:

-   VS Code (with YAML extension)
-   JetBrains IDEs (PyCharm, IntelliJ, etc.)
-   Most editors with YAML language server support

See the [YAML Language Server docs](https://github.com/redhat-developer/yaml-language-server#using-inlined-schema) for more details.

### Loading from YAML

```python
from pathlib import Path
from typing import Any

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import EqualsExpected, IsInstance

# First create and save the dataset
Path('my_tests.yaml').parent.mkdir(exist_ok=True)
dataset = Dataset[str, str, Any](
    name='my_tests',
    cases=[Case(name='test_1', inputs='hello', expected_output='HELLO')],
    evaluators=[IsInstance(type_name='str'), EqualsExpected()],
)
dataset.to_file('my_tests.yaml')

# Load the dataset with type parameters
dataset = Dataset[str, str, Any].from_file('my_tests.yaml')


def my_task(text: str) -> str:
    return text.upper()


# Run evaluation
report = dataset.evaluate_sync(my_task)
```

## JSON Format

JSON format is useful for programmatic generation or when strict structure is required.

### Basic Example

```python
from typing import Any

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import EqualsExpected

dataset = Dataset[str, str, Any](
    name='my_tests',
    cases=[
        Case(name='test_1', inputs='hello', expected_output='HELLO'),
    ],
    evaluators=[EqualsExpected()],
)

# Save to JSON
dataset.to_file('my_tests.json')
```

### JSON Output

```json
{
  "$schema": "my_tests_schema.json",
  "name": "my_tests",
  "cases": [
    {
      "name": "test_1",
      "inputs": "hello",
      "expected_output": "HELLO"
    }
  ],
  "evaluators": [
    "EqualsExpected"
  ]
}
```

The `$schema` key at the top enables IDE support similar to YAML.

### Loading from JSON

```python
from typing import Any

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import EqualsExpected

# First create and save the dataset
dataset = Dataset[str, str, Any](
    name='my_tests',
    cases=[Case(name='test_1', inputs='hello', expected_output='HELLO')],
    evaluators=[EqualsExpected()],
)
dataset.to_file('my_tests.json')

# Load from JSON
dataset = Dataset[str, str, Any].from_file('my_tests.json')
```

## Schema Generation

### Automatic Schema Creation

By default, `to_file()` creates a JSON schema file alongside your dataset:

```python
from typing import Any

from pydantic_evals import Case, Dataset

dataset = Dataset[str, str, Any](name='my_tests', cases=[Case(inputs='test')])

# Creates both my_tests.yaml AND my_tests_schema.json
dataset.to_file('my_tests.yaml')
```

### Custom Schema Location

```python
from pathlib import Path
from typing import Any

from pydantic_evals import Case, Dataset

dataset = Dataset[str, str, Any](name='my_tests', cases=[Case(inputs='test')])

# Create directories
Path('data').mkdir(exist_ok=True)

# Custom schema filename (relative to dataset file location)
dataset.to_file(
    'data/my_tests.yaml',
    schema_path='my_schema.json',
)

# No schema file
dataset.to_file('my_tests.yaml', schema_path=None)
```

### Schema Path Templates

Use `{stem}` to reference the dataset filename:

```python
from typing import Any

from pydantic_evals import Case, Dataset

dataset = Dataset[str, str, Any](name='my_tests', cases=[Case(inputs='test')])

# Creates: my_tests.yaml and my_tests.schema.json
dataset.to_file(
    'my_tests.yaml',
    schema_path='{stem}.schema.json',
)
```

### Manual Schema Generation

Generate a schema without saving the dataset:

```python
import json
from typing import Any

from pydantic_evals import Dataset

# Get schema as dictionary for a specific dataset type
schema = Dataset[str, str, Any].model_json_schema_with_evaluators()

# Save manually
with open('custom_schema.json', 'w', encoding='utf-8') as f:
    json.dump(schema, f, indent=2)
```

## Custom Evaluators

Custom evaluators require special handling during serialization and deserialization.

### Requirements

Custom evaluators must:

1.  Be decorated with `@dataclass`
2.  Inherit from `Evaluator`
3.  Be passed to both `to_file()` and `from_file()`

### Complete Example

```python
from dataclasses import dataclass
from typing import Any

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class CustomThreshold(Evaluator):
    """Check if output length exceeds a threshold."""

    min_length: int
    max_length: int = 100

    def evaluate(self, ctx: EvaluatorContext) -> bool:
        length = len(str(ctx.output))
        return self.min_length <= length <= self.max_length


# Create dataset with custom evaluator
dataset = Dataset[str, str, Any](
    name='custom_threshold_tests',
    cases=[
        Case(
            name='test_length',
            inputs='example',
            expected_output='long result',
            evaluators=[
                CustomThreshold(min_length=5, max_length=20),
            ],
        ),
    ],
)

# Save with custom evaluator types
dataset.to_file(
    'dataset.yaml',
    custom_evaluator_types=[CustomThreshold],
)
```

### Saved YAML

```yaml
# yaml-language-server: $schema=dataset_schema.json
cases:
- name: test_length
  inputs: example
  expected_output: long result
  evaluators:
  - CustomThreshold:
      min_length: 5
      max_length: 20
```

### Loading with Custom Evaluators

```python
from dataclasses import dataclass
from typing import Any

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class CustomThreshold(Evaluator):
    """Check if output length exceeds a threshold."""

    min_length: int
    max_length: int = 100

    def evaluate(self, ctx: EvaluatorContext) -> bool:
        length = len(str(ctx.output))
        return self.min_length <= length <= self.max_length


# First create and save the dataset
dataset = Dataset[str, str, Any](
    name='custom_threshold_tests',
    cases=[
        Case(
            name='test_length',
            inputs='example',
            expected_output='long result',
            evaluators=[CustomThreshold(min_length=5, max_length=20)],
        ),
    ],
)
dataset.to_file('dataset.yaml', custom_evaluator_types=[CustomThreshold])

# Load with custom evaluator registry
dataset = Dataset[str, str, Any].from_file(
    'dataset.yaml',
    custom_evaluator_types=[CustomThreshold],
)
```

Important

You must pass `custom_evaluator_types` to **both** `to_file()` and `from_file()`.

-   `to_file()`: Includes the evaluator in the JSON schema
-   `from_file()`: Registers the evaluator for deserialization

## Evaluator Serialization Formats

Evaluators can be serialized in three forms:

### 1\. Name Only (No Parameters)

```yaml
evaluators:
- EqualsExpected
- IsInstance: str  # Using default parameter
```

### 2\. Single Parameter (Short Form)

```yaml
evaluators:
- IsInstance: str
- Contains: "required text"
- MaxDuration: 2.0
```

### 3\. Multiple Parameters (Dict Form)

```yaml
evaluators:
- CustomThreshold:
    min_length: 5
    max_length: 20
- LLMJudge:
    rubric: "Response is accurate"
    model: "openai:gpt-5"
    include_input: true
```

## Format Comparison

Feature

YAML

JSON

Human readable

✅ Excellent

⚠️ Good

Comments

✅ Yes

❌ No

Compact

✅ Yes

⚠️ Verbose

Machine parsing

✅ Good

✅ Excellent

IDE support

✅ Yes

✅ Yes

Version control

✅ Clean diffs

⚠️ Noisy diffs

**Recommendation**: Use YAML for most cases, JSON for programmatic generation.

## Advanced: Evaluator Serialization Name

Customize how your evaluator appears in serialized files:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class VeryLongDescriptiveEvaluatorName(Evaluator):
    @classmethod
    def get_serialization_name(cls) -> str:
        return 'ShortName'

    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return True
```

In YAML:

```yaml
evaluators:
- ShortName  # Instead of VeryLongDescriptiveEvaluatorName
```

## Troubleshooting

### Schema Not Found in IDE

**Problem**: YAML file doesn't show autocomplete

**Solutions**:

1.  **Check the schema path** in the first line of YAML:
    
    ```yaml
    # yaml-language-server: $schema=correct_schema_name.json
    ```
    
2.  **Verify schema file exists** in the same directory
    
3.  **Restart the language server** in your IDE
    
4.  **Install YAML extension** (VS Code: "YAML" by Red Hat)
    

### Custom Evaluator Not Found

**Problem**: `ValueError: Unknown evaluator name: 'CustomEvaluator'`

**Solution**: Pass `custom_evaluator_types` when loading:

```python
from dataclasses import dataclass
from typing import Any

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class CustomEvaluator(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return True


# First create and save with custom evaluator
dataset = Dataset[str, str, Any](
    name='custom_eval_tests',
    cases=[Case(inputs='test', evaluators=[CustomEvaluator()])],
)
dataset.to_file('tests.yaml', custom_evaluator_types=[CustomEvaluator])

# Load with custom evaluator types
dataset = Dataset[str, str, Any].from_file(
    'tests.yaml',
    custom_evaluator_types=[CustomEvaluator],  # Required!
)
```

### Format Inference Failed

**Problem**: `ValueError: Cannot infer format from extension`

**Solution**: Specify format explicitly:

```python
from typing import Any

from pydantic_evals import Case, Dataset

dataset = Dataset[str, str, Any](name='my_tests', cases=[Case(inputs='test')])

# Explicit format for unusual extensions
dataset.to_file('data.txt', fmt='yaml')
dataset_loaded = Dataset[str, str, Any].from_file('data.txt', fmt='yaml')
```

### Schema Generation Error

**Problem**: Custom evaluator causes schema generation to fail

**Solution**: Ensure evaluator is a proper dataclass:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


# ✅ Correct
@dataclass
class MyEvaluator(Evaluator):
    value: int

    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return True


# ❌ Wrong: Missing @dataclass
class BadEvaluator(Evaluator):
    def __init__(self, value: int):
        self.value = value

    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return True
```

## Next Steps

-   **[Dataset Management](/docs/ai/evals/how-to/dataset-management)** - Creating and organizing datasets
-   **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** - Write custom evaluation logic
-   **[Core Concepts](/docs/ai/evals/core-concepts)** - Understand the data model

---

# [Case Lifecycle Hooks](https://pydantic.dev/docs/ai/evals/how-to/lifecycle/)

# Case Lifecycle Hooks

Control per-case setup, context preparation, and teardown during evaluation using [`CaseLifecycle`](/docs/ai/api/pydantic_evals/lifecycle/#pydantic_evals.lifecycle.CaseLifecycle).

[`CaseLifecycle`](/docs/ai/api/pydantic_evals/lifecycle/#pydantic_evals.lifecycle.CaseLifecycle) provides hooks at each stage of case evaluation. You pass a lifecycle **class** (not an instance) to [`Dataset.evaluate`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluate), and a new instance is created for each case, so instance attributes naturally hold case-specific state.

## Evaluation Flow

Each case follows this flow:

1.  **`setup()`** -- called before task execution
2.  **Task runs**
3.  **`prepare_context()`** -- called after task, before evaluators
4.  **Evaluators run**
5.  **`teardown()`** -- called after evaluators complete, or during cleanup if the case is interrupted

## Per-Case Setup and Teardown

Use `setup()` and `teardown()` when each case needs its own environment -- for example, creating a database, starting a service, or preparing fixtures driven by case metadata. Since a new lifecycle instance is created for each case, instance attributes are naturally case-scoped:

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators.context import EvaluatorContext
from pydantic_evals.lifecycle import CaseLifecycle
from pydantic_evals.reporting import ReportCase, ReportCaseFailure


class SetupFromMetadata(CaseLifecycle[str, str, dict]):
    async def setup(self) -> None:
        prefix = (self.case.metadata or {}).get('prefix', '')
        self.prefix = prefix

    async def prepare_context(
        self, ctx: EvaluatorContext[str, str, dict]
    ) -> EvaluatorContext[str, str, dict]:
        ctx.metrics['prefix_length'] = len(self.prefix)
        return ctx

    async def teardown(
        self,
        result: ReportCase[str, str, dict] | ReportCaseFailure[str, str, dict] | None,
    ) -> None:
        pass  # Clean up resources here


dataset = Dataset(
    name='setup_teardown',
    cases=[
        Case(name='no_prefix', inputs='hello', metadata={'prefix': ''}),
        Case(name='with_prefix', inputs='hello', metadata={'prefix': 'PREFIX:'}),
    ]
)

report = dataset.evaluate_sync(lambda inputs: inputs.upper(), lifecycle=SetupFromMetadata)

metrics = {c.name: c.metrics for c in report.cases}
print(metrics['no_prefix']['prefix_length'])
#> 0
print(metrics['with_prefix']['prefix_length'])
#> 7
```

The case metadata drives per-case behavior without needing custom [`Case`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case) subclasses or serialization.

### Conditional Teardown

The `teardown()` hook receives the full result, so you can vary cleanup logic based on success or failure -- for example, keeping test environments up for manual inspection when a case fails. The `result` can be `None` if evaluation is interrupted before the case produces a report result, so handle that branch when your cleanup depends on the case outcome:

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.lifecycle import CaseLifecycle
from pydantic_evals.reporting import ReportCase, ReportCaseFailure

cleaned_up: list[str] = []


class ConditionalCleanup(CaseLifecycle[str, str, dict]):
    async def setup(self) -> None:
        self.resource_id = self.case.name

    async def teardown(
        self,
        result: ReportCase[str, str, dict] | ReportCaseFailure[str, str, dict] | None,
    ) -> None:
        keep_on_failure = (self.case.metadata or {}).get('keep_on_failure', False)
        if result is None:
            # abnormal exit
            cleaned_up.append(self.resource_id)
        elif isinstance(result, ReportCaseFailure) and keep_on_failure:
            # case failed
            pass  # Keep resource for inspection
        else:
            # case succeeded
            cleaned_up.append(self.resource_id)


dataset = Dataset(
    name='conditional_cleanup',
    cases=[
        Case(name='success_case', inputs='hello', metadata={'keep_on_failure': True}),
        Case(name='failure_case', inputs='fail', metadata={'keep_on_failure': True}),
    ]
)


def task(inputs: str) -> str:
    if inputs == 'fail':
        raise ValueError('intentional failure')
    return inputs.upper()


report = dataset.evaluate_sync(task, max_concurrency=1, lifecycle=ConditionalCleanup)

print(cleaned_up)
#> ['success_case']
```

## Preparing Evaluator Context

The `prepare_context()` hook runs after the task completes but before evaluators see the context. This can be used to add metrics or attributes based on the task output, span tree, or any other state -- for example, deriving metrics from instrumented spans (like tool call counts or API latency), or computing values from external resources set up during `setup()`:

```python
from dataclasses import dataclass

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import Evaluator, EvaluatorContext
from pydantic_evals.lifecycle import CaseLifecycle


class EnrichMetrics(CaseLifecycle):
    async def prepare_context(self, ctx: EvaluatorContext) -> EvaluatorContext:
        ctx.metrics['output_length'] = len(str(ctx.output))
        return ctx


@dataclass
class CheckLength(Evaluator):
    max_length: int = 50

    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return ctx.metrics.get('output_length', 0) <= self.max_length


dataset = Dataset(
    name='context_enrichment',
    cases=[Case(name='short', inputs='hi'), Case(name='long', inputs='hello world')],
    evaluators=[CheckLength()],
)

report = dataset.evaluate_sync(lambda inputs: inputs.upper(), lifecycle=EnrichMetrics)

for case in report.cases:
    print(f'{case.name}: output_length={case.metrics["output_length"]}')
    #> short: output_length=2
    #> long: output_length=11
```

## Type Parameters

[`CaseLifecycle`](/docs/ai/api/pydantic_evals/lifecycle/#pydantic_evals.lifecycle.CaseLifecycle) is generic over the same three type parameters as [`Case`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Case): `InputsT`, `OutputT`, and `MetadataT`. All three default to `Any`, so you can omit them when your hooks don't need type-specific access:

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators.context import EvaluatorContext
from pydantic_evals.lifecycle import CaseLifecycle


# Works with any dataset -- no type parameters needed
class GenericMetricEnricher(CaseLifecycle):
    async def prepare_context(self, ctx: EvaluatorContext) -> EvaluatorContext:
        ctx.metrics['custom'] = 42
        return ctx


dataset = Dataset(name='generic_lifecycle', cases=[Case(inputs='test')])
report = dataset.evaluate_sync(lambda inputs: inputs, lifecycle=GenericMetricEnricher)

print(report.cases[0].metrics['custom'])
#> 42
```

## Next Steps

-   **[Metrics & Attributes](/docs/ai/evals/how-to/metrics-attributes)** -- Recording metrics inside tasks
-   **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** -- Using enriched metrics in evaluators
-   **[Span-Based Evaluation](/docs/ai/evals/evaluators/span-based)** -- Analyzing execution traces

---

# [Logfire Integration](https://pydantic.dev/docs/ai/evals/how-to/logfire-integration/)

# Logfire Integration

Visualize and analyze evaluation results using Pydantic Logfire.

Pydantic Evals uses OpenTelemetry to record traces of the evaluation process. These traces contain all the information from your evaluation reports, plus full tracing from the execution of your task function.

You can send these traces to any OpenTelemetry-compatible backend, including [Pydantic Logfire](https://logfire.pydantic.dev/docs/guides/web-ui/evals/).

## Installation

Install the optional logfire dependency:

Terminal

```bash
pip install 'pydantic-evals[logfire]'
```

## Basic Setup

Configure Logfire before running evaluations:

basic\_logfire\_setup.py

```python
import logfire

from pydantic_evals import Case, Dataset

# Configure Logfire
logfire.configure(
  send_to_logfire='if-token-present',  # (1)
)


# Your evaluation code
def my_task(inputs: str) -> str:
  return f'result for {inputs}'


dataset = Dataset(name='logfire_demo', cases=[Case(name='test', inputs='example')])
report = dataset.evaluate_sync(my_task)
```

Sends data to Logfire only if the `LOGFIRE_TOKEN` environment variable is set

That's it! Your evaluation traces will now appear in the Logfire web UI as long as you have the `LOGFIRE_TOKEN` environment variable set.

## What Gets Sent to Logfire

When you run an evaluation, Logfire receives:

1.  **Evaluation metadata**
    1.  Dataset name
    2.  Number of cases
    3.  Evaluator names
2.  **Per-case data**
    1.  Inputs and outputs
    2.  Expected outputs
    3.  Metadata
    4.  Execution duration
3.  **Evaluation results**
    1.  Scores, assertions, and labels
    2.  Reasons (if included)
    3.  Evaluator failures
4.  **Task execution traces**
    1.  All OpenTelemetry spans from your task function
    2.  Tool calls (for Pydantic AI agents)
    3.  API calls, database queries, etc.

## Viewing Results in Logfire

### Evaluation Overview

Logfire provides a special table view for evaluation results on the root evaluation span:

![Logfire Evals Overview](/docs/ai/img/logfire-evals-overview.png)

This view shows:

-   Case names
-   Pass/fail status
-   Scores and assertions
-   Execution duration
-   Quick filtering and sorting

### Individual Case Details

Click any case to see detailed inputs and outputs:

![Logfire Evals Case](/docs/ai/img/logfire-evals-case.png)

### Full Trace View

View the complete execution trace including all spans generated during evaluation:

![Logfire Evals Case Trace](/docs/ai/img/logfire-evals-case-trace.png)

This is especially useful for:

-   Debugging failed cases
-   Understanding performance bottlenecks
-   Analyzing tool usage patterns
-   Writing span-based evaluators

## Analyzing Traces

### Comparing Runs

Run the same evaluation multiple times and compare in Logfire:

```python
from pydantic_evals import Case, Dataset


def original_task(inputs: str) -> str:
    return f'original result for {inputs}'


def improved_task(inputs: str) -> str:
    return f'improved result for {inputs}'


dataset = Dataset(name='comparison', cases=[Case(name='test', inputs='example')])

# Run 1: Original implementation
report1 = dataset.evaluate_sync(original_task)

# Run 2: Improved implementation
report2 = dataset.evaluate_sync(improved_task)

# Compare in Logfire by filtering by timestamp or attributes
```

### Debugging Failed Cases

Find failed cases quickly:

1.  Search for `service_name = 'my_service_evals' AND is_exception` (replace with the actual service name you are using)
2.  View the full span tree to see where the failure occurred
3.  Inspect attributes and logs for error messages

## Span-Based Evaluation

Logfire integration enables powerful span-based evaluators. See [Span-Based Evaluation](/docs/ai/evals/evaluators/span-based) for details.

Example: Verify specific tools were called:

```python
import logfire

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import HasMatchingSpan

logfire.configure(send_to_logfire='if-token-present')


def my_agent(inputs: str) -> str:
    return f'result for {inputs}'


dataset = Dataset(
    name='logfire_demo',
    cases=[Case(name='test', inputs='example')],
    evaluators=[
        HasMatchingSpan(
            query={'name_contains': 'search_tool'},
            evaluation_name='used_search',
        ),
    ],
)

report = dataset.evaluate_sync(my_agent)
```

The span tree is available in both:

-   Your evaluator code (via `ctx.span_tree`)
-   Logfire UI (visual trace view)

## Troubleshooting

### No Data Appearing in Logfire

Check:

1.  **Token is set**: `echo $LOGFIRE_TOKEN`
2.  **Configuration is correct**:
    
    ```python
    import logfire
    
    logfire.configure(send_to_logfire='always')  # Force sending
    ```
    
3.  **Network connectivity**: Check firewall settings
4.  **Project exists**: Verify project name in Logfire UI

### Traces Missing Spans

If some spans are missing:

1.  **Ensure logfire is configured before imports**:
    
    ```python
    import logfire
    
    logfire.configure()  # Must be first
    ```
    
2.  **Check instrumentation**: Ensure your code has enabled all instrumentations you want:
    
    ```python
    import logfire
    
    logfire.instrument_pydantic_ai()
    logfire.instrument_httpx(capture_all=True)
    ```
    

## Best Practices

### 1\. Configure Early

Always configure Logfire before running evaluations:

```python
import logfire

from pydantic_evals import Case, Dataset

logfire.configure(send_to_logfire='if-token-present')


# Now import and run evaluations
def task(inputs: str) -> str:
    return f'result for {inputs}'


dataset = Dataset(name='logfire_demo', cases=[Case(name='test', inputs='example')])
dataset.evaluate_sync(task)
```

### 2\. Use Descriptive Service Names And Environments

```python
import logfire

logfire.configure(
    service_name='rag-pipeline-evals',
    environment='development',
)
```

### 3\. Review Periodically

-   Check Logfire regularly to identify patterns
-   Look for consistently failing cases
-   Analyze performance trends
-   Adjust evaluators based on insights

## Next Steps

-   **[Span-Based Evaluation](/docs/ai/evals/evaluators/span-based)** - Use OpenTelemetry spans in evaluators
-   **[Logfire Documentation](https://logfire.pydantic.dev/docs/guides/web-ui/evals/)** - Complete Logfire guide
-   **[Metrics & Attributes](/docs/ai/evals/how-to/metrics-attributes)** - Add custom data to traces

---

# [Metrics & Attributes](https://pydantic.dev/docs/ai/evals/how-to/metrics-attributes/)

# Metrics & Attributes

Track custom metrics and attributes during task execution for richer evaluation insights.

While executing evaluation tasks, you can record:

-   **Metrics** - Numeric values (int/float) for quantitative measurements
-   **Attributes** - Any data for qualitative information

These appear in evaluation reports and can be used by evaluators for assessment.

## Recording Metrics

Use [`increment_eval_metric`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.increment_eval_metric) to track numeric values:

```python
from dataclasses import dataclass

from pydantic_evals.dataset import increment_eval_metric


@dataclass
class APIResult:
    output: str
    usage: 'Usage'


@dataclass
class Usage:
    total_tokens: int


def call_api(inputs: str) -> APIResult:
    return APIResult(output=f'Result: {inputs}', usage=Usage(total_tokens=100))


def my_task(inputs: str) -> str:
    # Track API calls
    increment_eval_metric('api_calls', 1)

    result = call_api(inputs)

    # Track tokens used
    increment_eval_metric('tokens_used', result.usage.total_tokens)

    return result.output
```

## Recording Attributes

Use [`set_eval_attribute`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.set_eval_attribute) to store any data:

```python
from pydantic_evals import set_eval_attribute


def process(inputs: str) -> str:
    return f'Processed: {inputs}'


def my_task(inputs: str) -> str:
    # Record which model was used
    set_eval_attribute('model', 'gpt-5.2')

    # Record feature flags
    set_eval_attribute('used_cache', True)
    set_eval_attribute('retry_count', 2)

    # Record structured data
    set_eval_attribute('config', {
        'temperature': 0.7,
        'max_tokens': 100,
    })

    return process(inputs)
```

## Accessing in Evaluators

Metrics and attributes are available in the [`EvaluatorContext`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorContext):

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class EfficiencyChecker(Evaluator):
    max_api_calls: int = 5

    def evaluate(self, ctx: EvaluatorContext) -> dict[str, bool]:
        # Access metrics
        api_calls = ctx.metrics.get('api_calls', 0)
        tokens_used = ctx.metrics.get('tokens_used', 0)

        # Access attributes
        used_cache = ctx.attributes.get('used_cache', False)

        return {
            'efficient_api_usage': api_calls <= self.max_api_calls,
            'used_caching': used_cache,
            'token_efficient': tokens_used < 1000,
        }
```

## Viewing in Reports

Metrics and attributes appear in report data:

```python
from pydantic_evals import Case, Dataset


def task(inputs: str) -> str:
    return f'Result: {inputs}'


dataset = Dataset(name='report_viewing', cases=[Case(inputs='test')], evaluators=[])
report = dataset.evaluate_sync(task)

for case in report.cases:
    print(f'{case.name}:')
    #> Case 1:
    print(f'  Metrics: {case.metrics}')
    #>   Metrics: {}
    print(f'  Attributes: {case.attributes}')
    #>   Attributes: {}
```

You can also display them in printed reports:

```python
from pydantic_evals import Case, Dataset


def task(inputs: str) -> str:
    return f'Result: {inputs}'


dataset = Dataset(name='report_printing', cases=[Case(inputs='test')], evaluators=[])
report = dataset.evaluate_sync(task)

# Metrics and attributes are available but not shown by default
# Access them programmatically or via Logfire

for case in report.cases:
    print(f'\nCase: {case.name}')
    """
    Case: Case 1
    """
    print(f'Metrics: {case.metrics}')
    #> Metrics: {}
    print(f'Attributes: {case.attributes}')
    #> Attributes: {}
```

## Automatic Metrics

When using Pydantic AI and Logfire, some metrics are automatically tracked:

```python
import logfire

from pydantic_ai import Agent

logfire.configure(send_to_logfire='if-token-present')

agent = Agent('openai:gpt-5.2')


async def ai_task(inputs: str) -> str:
    result = await agent.run(inputs)
    return result.output


# Automatically tracked metrics:
# - requests: Number of LLM calls
# - input_tokens: Total input tokens
# - output_tokens: Total output tokens
# - prompt_tokens: Prompt tokens (if available)
# - completion_tokens: Completion tokens (if available)
# - cost: Estimated cost (if using genai-prices)
```

Access these in evaluators:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class CostChecker(Evaluator):
    max_cost: float = 0.01  # $0.01

    def evaluate(self, ctx: EvaluatorContext) -> bool:
        cost = ctx.metrics.get('cost', 0.0)
        return cost <= self.max_cost
```

## Practical Examples

### API Usage Tracking

```python
from dataclasses import dataclass

from pydantic_evals import increment_eval_metric, set_eval_attribute
from pydantic_evals.evaluators import Evaluator, EvaluatorContext


def check_cache(inputs: str) -> str | None:
    return None  # No cache hit for demo


@dataclass
class APIResult:
    text: str
    usage: 'Usage'


@dataclass
class Usage:
    total_tokens: int


async def call_api(inputs: str) -> APIResult:
    return APIResult(text=f'Result: {inputs}', usage=Usage(total_tokens=100))


def save_to_cache(inputs: str, result: str) -> None:
    pass  # Save to cache


async def smart_task(inputs: str) -> str:
    # Try cache first
    if cached := check_cache(inputs):
        set_eval_attribute('cache_hit', True)
        return cached

    set_eval_attribute('cache_hit', False)

    # Call API
    increment_eval_metric('api_calls', 1)
    result = await call_api(inputs)

    increment_eval_metric('tokens', result.usage.total_tokens)

    # Cache result
    save_to_cache(inputs, result.text)

    return result.text


# Evaluate efficiency
@dataclass
class EfficiencyEvaluator(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> dict[str, bool | float]:
        api_calls = ctx.metrics.get('api_calls', 0)
        cache_hit = ctx.attributes.get('cache_hit', False)

        return {
            'used_cache': cache_hit,
            'made_api_call': api_calls > 0,
            'efficiency_score': 1.0 if cache_hit else 0.5,
        }
```

### Tool Usage Tracking

```python
from dataclasses import dataclass

from pydantic_ai import Agent, RunContext
from pydantic_evals import increment_eval_metric, set_eval_attribute
from pydantic_evals.evaluators import Evaluator, EvaluatorContext

agent = Agent('openai:gpt-5.2')


def search(query: str) -> str:
    return f'Search results for: {query}'


def call(endpoint: str) -> str:
    return f'API response from: {endpoint}'


@agent.tool
def search_database(ctx: RunContext, query: str) -> str:
    increment_eval_metric('db_searches', 1)
    set_eval_attribute('last_query', query)
    return search(query)


@agent.tool
def call_api(ctx: RunContext, endpoint: str) -> str:
    increment_eval_metric('api_calls', 1)
    set_eval_attribute('last_endpoint', endpoint)
    return call(endpoint)


# Evaluate tool usage
@dataclass
class ToolUsageEvaluator(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> dict[str, bool | int]:
        db_searches = ctx.metrics.get('db_searches', 0)
        api_calls = ctx.metrics.get('api_calls', 0)

        return {
            'used_database': db_searches > 0,
            'used_api': api_calls > 0,
            'tool_call_count': db_searches + api_calls,
            'reasonable_tool_usage': (db_searches + api_calls) <= 5,
        }
```

### Performance Tracking

```python
import time
from dataclasses import dataclass

from pydantic_evals import increment_eval_metric, set_eval_attribute
from pydantic_evals.evaluators import Evaluator, EvaluatorContext


async def retrieve_context(inputs: str) -> list[str]:
    return ['context1', 'context2']


async def generate_response(context: list[str], inputs: str) -> str:
    return f'Generated response for {inputs}'


async def monitored_task(inputs: str) -> str:
    # Track sub-operation timing
    t0 = time.perf_counter()
    context = await retrieve_context(inputs)
    retrieve_time = time.perf_counter() - t0

    increment_eval_metric('retrieve_time', retrieve_time)

    t0 = time.perf_counter()
    result = await generate_response(context, inputs)
    generate_time = time.perf_counter() - t0

    increment_eval_metric('generate_time', generate_time)

    # Record which operations were needed
    set_eval_attribute('needed_retrieval', len(context) > 0)
    set_eval_attribute('context_chunks', len(context))

    return result


# Evaluate performance
@dataclass
class PerformanceEvaluator(Evaluator):
    max_retrieve_time: float = 0.5
    max_generate_time: float = 2.0

    def evaluate(self, ctx: EvaluatorContext) -> dict[str, bool]:
        retrieve_time = ctx.metrics.get('retrieve_time', 0.0)
        generate_time = ctx.metrics.get('generate_time', 0.0)

        return {
            'fast_retrieval': retrieve_time <= self.max_retrieve_time,
            'fast_generation': generate_time <= self.max_generate_time,
        }
```

### Quality Tracking

```python
from dataclasses import dataclass

from pydantic_evals import set_eval_attribute
from pydantic_evals.evaluators import Evaluator, EvaluatorContext


async def llm_call(inputs: str) -> dict:
    return {'text': f'Response: {inputs}', 'confidence': 0.85, 'sources': ['doc1', 'doc2']}


async def quality_task(inputs: str) -> str:
    result = await llm_call(inputs)

    # Extract quality indicators
    confidence = result.get('confidence', 0.0)
    sources_used = result.get('sources', [])

    set_eval_attribute('confidence', confidence)
    set_eval_attribute('source_count', len(sources_used))
    set_eval_attribute('sources', sources_used)

    return result['text']


# Evaluate based on quality signals
@dataclass
class QualityEvaluator(Evaluator):
    min_confidence: float = 0.7

    def evaluate(self, ctx: EvaluatorContext) -> dict[str, bool | float]:
        confidence = ctx.attributes.get('confidence', 0.0)
        source_count = ctx.attributes.get('source_count', 0)

        return {
            'high_confidence': confidence >= self.min_confidence,
            'used_sources': source_count > 0,
            'quality_score': confidence * (1.0 + 0.1 * source_count),
        }
```

## Experiment-Level Metadata

In addition to case-level metadata, you can also pass experiment-level metadata when calling [`evaluate()`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluate):

```python
from pydantic_evals import Case, Dataset

dataset = Dataset(
    name='experiment_metadata',
    cases=[
        Case(
            inputs='test',
            metadata={'difficulty': 'easy'},  # Case-level metadata
        )
    ]
)


async def task(inputs: str) -> str:
    return f'Result: {inputs}'


# Pass experiment-level metadata
async def main():
    report = await dataset.evaluate(
        task,
        metadata={
            'model': 'gpt-5.2',
            'prompt_version': 'v2.1',
            'temperature': 0.7,
        },
    )

    # Access experiment metadata in the report
    print(report.experiment_metadata)
    #> {'model': 'gpt-5.2', 'prompt_version': 'v2.1', 'temperature': 0.7}
```

### When to Use Experiment Metadata

Experiment metadata is useful for tracking configuration that applies to the entire evaluation run:

-   **Model configuration**: Model name, version, parameters
-   **Prompt versioning**: Which prompt template was used
-   **Infrastructure**: Deployment environment, region
-   **Experiment context**: Developer name, feature branch, commit hash

This metadata is especially valuable when:

-   Comparing multiple evaluation runs over time
-   Tracking which configuration produced which results
-   Reproducing evaluation results from historical data

### Viewing in Reports

Experiment metadata appears at the top of printed reports:

```python
from pydantic_evals import Case, Dataset

dataset = Dataset(name='metadata_report', cases=[Case(inputs='hello', expected_output='HELLO')])


async def task(text: str) -> str:
    return text.upper()

async def main():
    report = await dataset.evaluate(
        task,
        metadata={'model': 'gpt-5.2', 'version': 'v1.0'},
    )

    print(report.render())
    """
    ╭─ Evaluation Summary: task ─╮
    │ model: gpt-5.2             │
    │ version: v1.0              │
    ╰────────────────────────────╯
    ┏━━━━━━━━━━┳━━━━━━━━━━┓
    ┃ Case ID  ┃ Duration ┃
    ┡━━━━━━━━━━╇━━━━━━━━━━┩
    │ Case 1   │     10ms │
    ├──────────┼──────────┤
    │ Averages │     10ms │
    └──────────┴──────────┘
    """
```

## Synchronization between Tasks and Experiment Metadata

Experiment metadata is for _recording_ configuration, not _configuring_ the task. The metadata dict doesn't automatically configure your task's behavior; you must ensure the values in the metadata dict match what your task actually uses. For example, it's easy to accidentally have metadata claim `temperature: 0.7` while your task actually uses `temperature: 1.0`, leading to incorrect experiment tracking and unreproducible results.

To avoid this problem, we recommend establishing a single source of truth for configuration that both your task and metadata reference. Below are a few suggested patterns for achieving this synchronization.

### Pattern 1: Shared Module Constants

For simpler cases, use module-level constants:

```python
from pydantic_ai import Agent
from pydantic_evals import Case, Dataset

# Module constants as single source of truth
MODEL_NAME = 'openai:gpt-5-mini'
TEMPERATURE = 0.7
INSTRUCTIONS = 'You are a helpful assistant.'

agent = Agent(MODEL_NAME, model_settings={'temperature': TEMPERATURE}, instructions=INSTRUCTIONS)


async def task(inputs: str) -> str:
    result = await agent.run(inputs)
    return result.output


async def main():
    dataset = Dataset(name='shared_constants', cases=[Case(inputs='What is the capital of France?')])

    # Metadata references same constants
    await dataset.evaluate(
        task,
        metadata={
            'model': MODEL_NAME,
            'temperature': TEMPERATURE,
            'instructions': INSTRUCTIONS,
        },
    )
```

### Pattern 2: Configuration Object (Recommended)

Define configuration once and use it everywhere:

```python
from dataclasses import asdict, dataclass

from pydantic_ai import Agent
from pydantic_evals import Case, Dataset


@dataclass
class TaskConfig:
    """Single source of truth for task configuration.

    Includes all variables you'd like to see in experiment metadata.
    """

    model: str
    temperature: float
    max_tokens: int
    prompt_version: str


# Define configuration once
config = TaskConfig(
    model='openai:gpt-5-mini',
    temperature=0.7,
    max_tokens=500,
    prompt_version='v2.1',
)

# Use config in task
agent = Agent(
    config.model,
    model_settings={'temperature': config.temperature, 'max_tokens': config.max_tokens},
)


async def task(inputs: str) -> str:
    """Task uses the same config that's recorded in metadata."""
    result = await agent.run(inputs)
    return result.output


# Evaluate with metadata derived from the same config
async def main():
    dataset = Dataset(name='config_evaluation', cases=[Case(inputs='What is the capital of France?')])

    report = await dataset.evaluate(
        task,
        metadata=asdict(config),  # Guaranteed to match task behavior
    )

    print(report.experiment_metadata)
    """
    {
        'model': 'openai:gpt-5-mini',
        'temperature': 0.7,
        'max_tokens': 500,
        'prompt_version': 'v2.1',
    }
    """
```

If it's problematic to have a global task configuration, you can also create your `TaskConfig` object at the task call-site and pass it to the agent via `deps` or similar, but in this case you would still need to guarantee that the value is always the same as the value passed to `metadata` in the call to `Dataset.evaluate`.

### Anti-Pattern: Duplicate Configuration

**Avoid this common mistake**:

```python
from pydantic_ai import Agent
from pydantic_evals import Case, Dataset

# ❌ BAD: Configuration defined in multiple places
agent = Agent('openai:gpt-5-mini', model_settings={'temperature': 0.7})


async def task(inputs: str) -> str:
    result = await agent.run(inputs)
    return result.output


async def main():
    dataset = Dataset(name='anti_pattern', cases=[Case(inputs='test')])

    # ❌ BAD: Metadata manually typed - easy to get out of sync
    await dataset.evaluate(
        task,
        metadata={
            'model': 'openai:gpt-5-mini',  # Duplicated! Could diverge from agent definition
            'temperature': 0.8,  # ⚠️ WRONG! Task actually uses 0.7
        },
    )
```

In this anti-pattern, the metadata claims `temperature: 0.8` but the task uses `0.7`. This leads to:

-   Incorrect experiment tracking
-   Inability to reproduce results
-   Confusion when comparing runs
-   Wasted time debugging "why results differ"

## Metrics vs Attributes vs Metadata

Understanding the differences:

Feature

Metrics

Attributes

Case Metadata

Experiment Metadata

**Set in**

Task execution

Task execution

Case definition

`evaluate()` call

**Type**

int, float

Any

Any

Any

**Purpose**

Quantitative

Qualitative

Test data

Experiment config

**Used for**

Aggregation

Context

Input to task

Tracking runs

**Available to**

Evaluators

Evaluators

Task & Evaluators

Report only

**Scope**

Per case

Per case

Per case

Per experiment

```python
from pydantic_evals import Case, Dataset, increment_eval_metric, set_eval_attribute

# Case Metadata: Defined in case (before execution)
case = Case(
    inputs='question',
    metadata={'difficulty': 'hard', 'category': 'math'},  # Per-case metadata
)

dataset = Dataset(name='metrics_demo', cases=[case])


# Metrics & Attributes: Recorded during execution
async def task(inputs):
    # These are recorded during execution for each case
    increment_eval_metric('tokens', 100)
    set_eval_attribute('model', 'gpt-5.2')
    return f'Result: {inputs}'


async def main():
    # Experiment Metadata: Defined at evaluation time
    await dataset.evaluate(
        task,
        metadata={  # Experiment-level metadata
            'prompt_version': 'v2.1',
            'temperature': 0.7,
        },
    )
```

## Troubleshooting

### "Metrics/attributes not appearing"

Ensure you're calling the functions inside the task:

```python
from pydantic_evals import increment_eval_metric


def process(inputs: str) -> str:
    return f'Processed: {inputs}'


# Bad: Called outside task
increment_eval_metric('count', 1)


def bad_task(inputs):
    return process(inputs)


# Good: Called inside task
def good_task(inputs):
    increment_eval_metric('count', 1)
    return process(inputs)
```

### "Metrics not incrementing"

Check you're using `increment_eval_metric`, not `set_eval_attribute`:

```python
from pydantic_evals import increment_eval_metric, set_eval_attribute

# Bad: This will overwrite, not increment
set_eval_attribute('count', 1)
set_eval_attribute('count', 1)  # Still 1

# Good: This increments
increment_eval_metric('count', 1)
increment_eval_metric('count', 1)  # Now 2
```

### "Too much data in attributes"

Store summaries, not raw data:

```python
from pydantic_evals import set_eval_attribute

giant_response_object = {'key' + str(i): 'value' * 100 for i in range(1000)}

# Bad: Huge object
set_eval_attribute('full_response', giant_response_object)

# Good: Summary
set_eval_attribute('response_size_kb', len(str(giant_response_object)) / 1024)
set_eval_attribute('response_keys', list(giant_response_object.keys())[:10])  # First 10 keys
```

## Next Steps

-   **[Case Lifecycle Hooks](/docs/ai/evals/how-to/lifecycle)** - Per-case setup, teardown, and context preparation
-   **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** - Use metrics/attributes in evaluators
-   **[Logfire Integration](/docs/ai/evals/how-to/logfire-integration)** - View metrics in Logfire
-   **[Concurrency & Performance](/docs/ai/evals/how-to/concurrency)** - Optimize evaluation performance

---

# [Multi-Run Evaluation](https://pydantic.dev/docs/ai/evals/how-to/multi-run/)

# Multi-Run Evaluation

Run each case multiple times to measure variability and get more reliable aggregate results.

AI systems are inherently stochastic -- the same input can produce different outputs across runs. The `repeat` parameter lets you run each case multiple times and automatically aggregates the results, giving you a clearer picture of your system's typical behavior.

## Basic Usage

Pass `repeat` to [`evaluate()`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluate) or [`evaluate_sync()`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluate_sync):

```python
from pydantic_evals import Case, Dataset

dataset = Dataset(
    name='multi_run_basic',
    cases=[
        Case(name='greeting', inputs='Say hello'),
        Case(name='farewell', inputs='Say goodbye'),
    ]
)


def task(inputs: str) -> str:
    return inputs.upper()


# Run each case 5 times
report = dataset.evaluate_sync(task, repeat=5)

# 2 cases × 5 repeats = 10 total runs
print(len(report.cases))
#> 10
```

When `repeat > 1`, each run gets an indexed name like `greeting [1/5]`, `greeting [2/5]`, etc., while the original case name is preserved in [`source_case_name`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.ReportCase.source_case_name) for grouping.

## Accessing Grouped Results

Use [`case_groups()`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.EvaluationReport.case_groups) to access runs organized by original case, with per-group aggregated statistics:

```python
from pydantic_evals import Case, Dataset

dataset = Dataset(
    name='grouped_results',
    cases=[
        Case(name='greeting', inputs='Say hello'),
        Case(name='farewell', inputs='Say goodbye'),
    ]
)


def task(inputs: str) -> str:
    return inputs.upper()


report = dataset.evaluate_sync(task, repeat=3)

groups = report.case_groups()
assert groups is not None  # None for single-run (repeat=1)

print(len(groups))
#> 2

group_names = [g.name for g in groups]
print(group_names)
#> ['greeting', 'farewell']

# Each group has 3 runs and aggregated statistics
for group in groups:
    assert len(group.runs) == 3
    assert len(group.failures) == 0
    assert group.summary.task_duration > 0
```

Each [`ReportCaseGroup`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.ReportCaseGroup) contains:

-   `name` -- the original case name
-   `runs` -- the individual [`ReportCase`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.ReportCase) results
-   `failures` -- any runs that raised exceptions
-   `summary` -- a [`ReportCaseAggregate`](/docs/ai/api/pydantic_evals/reporting/#pydantic_evals.reporting.ReportCaseAggregate) with averaged scores, metrics, labels, assertions, and durations

## Aggregation

With `repeat > 1`, the report's `averages()` uses a two-level aggregation strategy:

1.  **Per-group averages**: Each case's runs are averaged into a group summary
2.  **Cross-group averages**: The group summaries are averaged to produce the final result

This ensures each original case contributes equally to the overall averages, regardless of how many runs succeeded or failed.

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import EqualsExpected

dataset = Dataset(
    name='aggregation',
    cases=[
        Case(name='easy', inputs='hello', expected_output='HELLO'),
        Case(name='hard', inputs='world', expected_output='WORLD'),
    ],
    evaluators=[EqualsExpected()],
)


def task(inputs: str) -> str:
    return inputs.upper()


report = dataset.evaluate_sync(task, repeat=3)

averages = report.averages()
assert averages is not None
print(f'Overall assertion rate: {averages.assertions}')
#> Overall assertion rate: 1.0
```

## Default Behavior

When `repeat=1` (the default), behavior is identical to a standard evaluation -- no run indexing, no `source_case_name`, and `case_groups()` returns `None`:

```python
from pydantic_evals import Case, Dataset

dataset = Dataset(name='default_behavior', cases=[Case(name='test', inputs='hello')])


def task(inputs: str) -> str:
    return inputs.upper()


report = dataset.evaluate_sync(task)  # repeat=1 by default

assert report.case_groups() is None
assert all(c.source_case_name is None for c in report.cases)
```

## Next Steps

-   **[Concurrency & Performance](/docs/ai/evals/how-to/concurrency)** -- Control parallel execution with `max_concurrency`
-   **[Metrics & Attributes](/docs/ai/evals/how-to/metrics-attributes)** -- Track custom metrics across runs
-   **[Logfire Integration](/docs/ai/evals/how-to/logfire-integration)** -- Visualize multi-run results

---

# [Retry Strategies](https://pydantic.dev/docs/ai/evals/how-to/retry-strategies/)

# Retry Strategies

Handle transient failures in tasks and evaluators with automatic retry logic.

LLM-based systems can experience transient failures:

-   Rate limits
-   Network timeouts
-   Temporary API outages
-   Context length errors

Pydantic Evals supports retry configuration for both:

-   **Task execution** - The function being evaluated
-   **Evaluator execution** - The evaluators themselves

## Basic Retry Configuration

Pass a retry configuration to `evaluate()` or `evaluate_sync()` using [Tenacity](https://tenacity.readthedocs.io/) parameters:

```python
from tenacity import stop_after_attempt

from pydantic_evals import Case, Dataset


def my_function(inputs: str) -> str:
    return f'Result: {inputs}'


dataset = Dataset(name='basic_retry', cases=[Case(inputs='test')], evaluators=[])

report = dataset.evaluate_sync(
    task=my_function,
    retry_task={'stop': stop_after_attempt(3)},
    retry_evaluators={'stop': stop_after_attempt(2)},
)
```

## Retry Configuration Options

Retry configurations use [Tenacity](https://tenacity.readthedocs.io/) and support the same options as Pydantic AI's [`RetryConfig`](/docs/ai/api/pydantic-ai/retries/#pydantic_ai.retries.RetryConfig):

```python
from tenacity import stop_after_attempt, wait_exponential

from pydantic_evals import Case, Dataset


def my_function(inputs: str) -> str:
    return f'Result: {inputs}'


dataset = Dataset(name='retry_config', cases=[Case(inputs='test')], evaluators=[])

retry_config = {
    'stop': stop_after_attempt(3),  # Stop after 3 attempts
    'wait': wait_exponential(multiplier=1, min=1, max=10),  # Exponential backoff: 1s, 2s, 4s, 8s (capped at 10s)
    'reraise': True,  # Re-raise the original exception after exhausting retries
}

dataset.evaluate_sync(
    task=my_function,
    retry_task=retry_config,
)
```

### Common Parameters

The retry configuration accepts any parameters from the tenacity `retry` decorator. Common ones include:

Parameter

Type

Description

`stop`

`StopBaseT`

Stop strategy (e.g., `stop_after_attempt(3)`, `stop_after_delay(60)`)

`wait`

`WaitBaseT`

Wait strategy (e.g., `wait_exponential()`, `wait_fixed(2)`)

`retry`

`RetryBaseT`

Retry condition (e.g., `retry_if_exception_type(TimeoutError)`)

`reraise`

`bool`

Whether to reraise the original exception (default: `False`)

`before_sleep`

`Callable`

Callback before sleeping between retries

See the [Tenacity documentation](https://tenacity.readthedocs.io/) for all available options.

## Task Retries

Retry the task function when it fails:

```python
from tenacity import stop_after_attempt, wait_exponential

from pydantic_evals import Case, Dataset


async def call_llm(inputs: str) -> str:
    return f'LLM response to: {inputs}'


async def flaky_llm_task(inputs: str) -> str:
    """This might hit rate limits or timeout."""
    response = await call_llm(inputs)
    return response


dataset = Dataset(name='task_retry', cases=[Case(inputs='test')])

report = dataset.evaluate_sync(
    task=flaky_llm_task,
    retry_task={
        'stop': stop_after_attempt(5),  # Try up to 5 times
        'wait': wait_exponential(multiplier=1, min=1, max=30),  # Exponential backoff, capped at 30s
        'reraise': True,
    },
)
```

### When Task Retries Trigger

Retries trigger when the task raises an exception:

```python
class RateLimitError(Exception):
    pass


class ValidationError(Exception):
    pass


async def call_api(inputs: str) -> str:
    return f'API response: {inputs}'


async def my_task(inputs: str) -> str:
    try:
        return await call_api(inputs)
    except RateLimitError:
        # Will trigger retry
        raise
    except ValidationError:
        # Will also trigger retry
        raise
```

### Exponential Backoff

When using `wait_exponential()`, delays increase exponentially:

```
Attempt 1: immediate
Attempt 2: ~1s delay (multiplier * 2^0)
Attempt 3: ~2s delay (multiplier * 2^1)
Attempt 4: ~4s delay (multiplier * 2^2)
Attempt 5: ~8s delay (multiplier * 2^3, capped at max)
```

The actual delay depends on the `multiplier`, `min`, and `max` parameters passed to `wait_exponential()`.

## Evaluator Retries

Retry evaluators when they fail:

```python
from tenacity import stop_after_attempt, wait_exponential

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import LLMJudge


def my_task(inputs: str) -> str:
    return f'Result: {inputs}'


dataset = Dataset(
    name='evaluator_retry',
    cases=[Case(inputs='test')],
    evaluators=[
        # LLMJudge might hit rate limits
        LLMJudge(rubric='Response is accurate'),
    ],
)

report = dataset.evaluate_sync(
    task=my_task,
    retry_evaluators={
        'stop': stop_after_attempt(3),
        'wait': wait_exponential(multiplier=1, min=0.5, max=10),
        'reraise': True,
    },
)
```

### When Evaluator Retries Trigger

Retries trigger when an evaluator raises an exception:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


async def external_api_call(output: str) -> bool:
    return len(output) > 0


@dataclass
class APIEvaluator(Evaluator):
    async def evaluate(self, ctx: EvaluatorContext) -> bool:
        # If this raises an exception, retry logic will trigger
        result = await external_api_call(ctx.output)
        return result
```

### Evaluator Failures

If an evaluator fails after all retries, it's recorded as an [`EvaluatorFailure`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorFailure):

```python
from tenacity import stop_after_attempt

from pydantic_evals import Case, Dataset


def task(inputs: str) -> str:
    return f'Result: {inputs}'


dataset = Dataset(name='evaluator_failures', cases=[Case(inputs='test')], evaluators=[])

report = dataset.evaluate_sync(task, retry_evaluators={'stop': stop_after_attempt(3)})

# Check for evaluator failures
for case in report.cases:
    if case.evaluator_failures:
        for failure in case.evaluator_failures:
            print(f'Evaluator {failure.name} failed: {failure.error_message}')
    #> (No output - no evaluator failures in this case)
```

View evaluator failures in reports:

```python
from pydantic_evals import Case, Dataset


def task(inputs: str) -> str:
    return f'Result: {inputs}'


dataset = Dataset(name='failure_report', cases=[Case(inputs='test')], evaluators=[])
report = dataset.evaluate_sync(task)

report.print(include_evaluator_failures=True)
"""
  Evaluation Summary:
         task
┏━━━━━━━━━━┳━━━━━━━━━━┓
┃ Case ID  ┃ Duration ┃
┡━━━━━━━━━━╇━━━━━━━━━━┩
│ Case 1   │     10ms │
├──────────┼──────────┤
│ Averages │     10ms │
└──────────┴──────────┘
"""
#>
#> ✅ case_0                       ━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% (0/0)
```

## Combining Task and Evaluator Retries

You can configure both independently:

```python
from tenacity import stop_after_attempt, wait_exponential

from pydantic_evals import Case, Dataset


def flaky_task(inputs: str) -> str:
    return f'Result: {inputs}'


dataset = Dataset(name='combined_retry', cases=[Case(inputs='test')], evaluators=[])

report = dataset.evaluate_sync(
    task=flaky_task,
    retry_task={
        'stop': stop_after_attempt(5),  # Retry task up to 5 times
        'wait': wait_exponential(multiplier=1, min=1, max=30),
        'reraise': True,
    },
    retry_evaluators={
        'stop': stop_after_attempt(3),  # Retry evaluators up to 3 times
        'wait': wait_exponential(multiplier=1, min=0.5, max=10),
        'reraise': True,
    },
)
```

## Practical Examples

### Rate Limit Handling

```python
from tenacity import stop_after_attempt, wait_exponential

from pydantic_evals import Case, Dataset
from pydantic_evals.evaluators import LLMJudge


async def expensive_llm_call(inputs: str) -> str:
    return f'LLM response: {inputs}'


async def llm_task(inputs: str) -> str:
    """Task that might hit rate limits."""
    return await expensive_llm_call(inputs)


dataset = Dataset(
    name='rate_limit_retry',
    cases=[Case(inputs='test')],
    evaluators=[
        LLMJudge(rubric='Quality check'),  # Also might hit rate limits
    ],
)

# Generous retries for rate limits
report = dataset.evaluate_sync(
    task=llm_task,
    retry_task={
        'stop': stop_after_attempt(10),  # Rate limits can take multiple retries
        'wait': wait_exponential(multiplier=2, min=2, max=60),  # Start at 2s, exponential up to 60s
        'reraise': True,
    },
    retry_evaluators={
        'stop': stop_after_attempt(5),
        'wait': wait_exponential(multiplier=2, min=2, max=30),
        'reraise': True,
    },
)
```

### Network Timeout Handling

```python
import httpx
from tenacity import stop_after_attempt, wait_exponential

from pydantic_evals import Case, Dataset


async def api_task(inputs: str) -> str:
    """Task that calls external API which might timeout."""
    async with httpx.AsyncClient(timeout=10.0) as client:
        response = await client.post('https://api.example.com', json={'input': inputs})
        return response.text


dataset = Dataset(name='timeout_retry', cases=[Case(inputs='test')], evaluators=[])

# Quick retries for network issues
report = dataset.evaluate_sync(
    task=api_task,
    retry_task={
        'stop': stop_after_attempt(4),  # A few quick retries
        'wait': wait_exponential(multiplier=0.5, min=0.5, max=5),  # Fast retry, capped at 5s
        'reraise': True,
    },
)
```

### Context Length Handling

```python
from tenacity import stop_after_attempt

from pydantic_evals import Case, Dataset


class ContextLengthError(Exception):
    pass


async def llm_call(inputs: str, max_tokens: int = 8000) -> str:
    return f'LLM response: {inputs[:100]}'


async def smart_llm_task(inputs: str) -> str:
    """Task that might exceed context length."""
    try:
        return await llm_call(inputs, max_tokens=8000)
    except ContextLengthError:
        # Retry with shorter context
        truncated_inputs = inputs[:4000]
        return await llm_call(truncated_inputs, max_tokens=4000)


dataset = Dataset(name='context_length', cases=[Case(inputs='test')], evaluators=[])

# Don't retry context length errors (handle in task)
report = dataset.evaluate_sync(
    task=smart_llm_task,
    retry_task={'stop': stop_after_attempt(1)},  # No retries, we handle it
)
```

## Retry vs Error Handling

**Use retries for:**

-   Transient failures (rate limits, timeouts)
-   Network issues
-   Temporary service outages
-   Recoverable errors

**Use error handling for:**

-   Validation errors
-   Logic errors
-   Permanent failures
-   Expected error conditions

```python
class RateLimitError(Exception):
    pass


async def llm_call(inputs: str) -> str:
    return f'LLM response: {inputs}'


def is_valid(result: str) -> bool:
    return len(result) > 0


async def smart_task(inputs: str) -> str:
    """Handle expected errors, let retries handle transient failures."""
    try:
        result = await llm_call(inputs)

        # Validate output (don't retry validation errors)
        if not is_valid(result):
            return 'ERROR: Invalid output format'

        return result

    except RateLimitError:
        # Let retry logic handle this
        raise

    except ValueError as e:
        # Don't retry - this is a permanent error
        return f'ERROR: {e}'
```

## Troubleshooting

### "Still failing after retries"

Increase retry attempts or check if error is retriable:

```python
import logging

from tenacity import stop_after_attempt

from pydantic_evals import Case, Dataset


def task(inputs: str) -> str:
    return f'Result: {inputs}'


# Add logging to see what's failing
logging.basicConfig(level=logging.DEBUG)

dataset = Dataset(name='troubleshooting', cases=[Case(inputs='test')], evaluators=[])

# Tenacity logs retry attempts
report = dataset.evaluate_sync(task, retry_task={'stop': stop_after_attempt(5)})
```

### "Evaluations taking too long"

Reduce retry attempts or wait times:

```python
from tenacity import stop_after_attempt, wait_exponential

# Faster retries
retry_config = {
    'stop': stop_after_attempt(3),  # Fewer attempts
    'wait': wait_exponential(multiplier=0.1, min=0.1, max=2),  # Quick retries, capped at 2s
    'reraise': True,
}
```

### "Hitting rate limits despite retries"

Increase delays or use `max_concurrency`:

```python
from tenacity import stop_after_attempt, wait_exponential

from pydantic_evals import Case, Dataset


def task(inputs: str) -> str:
    return f'Result: {inputs}'


dataset = Dataset(name='rate_limit_config', cases=[Case(inputs='test')], evaluators=[])

# Longer delays
retry_config = {
    'stop': stop_after_attempt(5),
    'wait': wait_exponential(multiplier=5, min=5, max=60),  # Start at 5s, exponential up to 60s
    'reraise': True,
}

# Also reduce concurrency
report = dataset.evaluate_sync(
    task=task,
    retry_task=retry_config,
    max_concurrency=2,  # Only 2 concurrent tasks
)
```

## Next Steps

-   **[Concurrency & Performance](/docs/ai/evals/how-to/concurrency)** - Optimize evaluation performance
-   **[Logfire Integration](/docs/ai/evals/how-to/logfire-integration)** - View retries in Logfire

---

# [Online Evaluation](https://pydantic.dev/docs/ai/evals/online-evaluation/)

# Online Evaluation

Online evaluation lets you attach evaluators to production (or staging) functions so that every call (or a sampled subset) is automatically evaluated in the background. The same [`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator) classes used with [`Dataset.evaluate()`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluate) work here; the difference is just in how they're wired up.

## When to Use Online Evaluation

Online evaluation is useful when you want to:

-   **Monitor production quality:** continuously score LLM outputs against rubrics
-   **Catch regressions:** detect degradation in agent behavior across deploys
-   **Collect evaluation data:** build datasets from real traffic for offline analysis
-   **Control costs:** sample expensive LLM judges on a fraction of traffic while running cheap checks on everything

For testing against curated datasets before deployment, use [offline evaluation](/docs/ai/evals/quick-start) with [`Dataset.evaluate()`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluate) instead.

## Quick Start

The [`evaluate()`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.evaluate) decorator attaches evaluators to any function. Evaluators run in the background without blocking the caller, and results are emitted as [OpenTelemetry events](#default-otel-event-emission):

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext
from pydantic_evals.online import evaluate


@dataclass
class OutputNotEmpty(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return bool(ctx.output)


@evaluate(OutputNotEmpty())
async def summarize(text: str) -> str:
    return f'Summary of: {text}'
```

Wire up OTel export (e.g. [`logfire.configure()`](/docs/ai/integrations/logfire#using-logfire)) elsewhere in your application startup so that the emitted `gen_ai.evaluation.result` events reach your backend.

Each decorated call emits one `gen_ai.evaluation.result` OTel event per evaluator result, following the [OTel GenAI evaluation semconv](https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-events/#event-gen_aievaluationresult). This mirrors how offline evaluation emits OTel spans via `logfire.span`: if any OTel SDK is configured in the process (via [`logfire.configure()`](/docs/ai/integrations/logfire#using-logfire), the OTel SDK directly, or a vendor instrumentation), events flow to your backend; if not, emission is a cheap no-op.

To additionally handle results in Python code -- for alerting, bespoke aggregation, in-memory test capture, or non-OTel destinations -- register a [sink](#sinks). Sinks run _in addition to_ OTel event emission.

The module-level [`configure()`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.configure) and [`evaluate()`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.evaluate) functions delegate to a global [`OnlineEvalConfig`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvalConfig). For multiple configurations or isolated setups, create your own config instances (see [OnlineEvalConfig](#onlineevalconfig) below).

## Target

Each decorated function (or agent) emits results tagged with a **target** -- a name that groups results in downstream sinks and dashboards. By default the target is the decorated function's `__name__`, but you can override it with `target=...`:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext
from pydantic_evals.online import evaluate


@dataclass
class OutputNotEmpty(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return bool(ctx.output)


# Default: target='summarize' (function name)
@evaluate(OutputNotEmpty())
async def summarize(text: str) -> str: ...


# Override: use a friendly name
@evaluate(OutputNotEmpty(), target='customer_support')
async def run_agent(prompt: str) -> str: ...
```

The target name is supplied to sinks on every `submit()` call as a plain `str` -- a single sink instance handles any number of decorated functions or agents.

For agent capabilities, the target name is taken from the agent's own `name` attribute (see [Agent Integration](#agent-integration)); to categorize or route on agent-ness, add metadata on the config (e.g., `metadata={'kind': 'agent'}`).

## Core Concepts

### OnlineEvaluator

Different evaluators need different settings. A cheap heuristic could run on 100% of traffic; an expensive LLM judge might run on 1%. [`OnlineEvaluator`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvaluator) wraps an [`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator) with per-evaluator configuration:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext, LLMJudge
from pydantic_evals.online import OnlineEvaluator


@dataclass
class IsHelpful(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return len(str(ctx.output)) > 10


# Cheap evaluator: run on every request
always_check = OnlineEvaluator(evaluator=IsHelpful(), sample_rate=1.0)

# Expensive evaluator: run on 1% of requests, limit concurrency
rare_check = OnlineEvaluator(
    evaluator=LLMJudge(rubric='Is the response helpful?'),
    sample_rate=0.01,
    max_concurrency=5,
)
```

When you pass a bare [`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator) to the [`evaluate()`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.evaluate) decorator, it's automatically wrapped in an [`OnlineEvaluator`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvaluator) with the config's default sample rate.

### OnlineEvalConfig

[`OnlineEvalConfig`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvalConfig) holds cross-evaluator defaults (sample rate, metadata, optional additional sinks, OTel-emission toggle). There's a global default instance, plus you can create custom instances for different configurations:

```python
import asyncio
from collections.abc import Sequence
from dataclasses import dataclass

from pydantic_evals.evaluators import (
    EvaluationResult,
    Evaluator,
    EvaluatorContext,
    EvaluatorFailure,
)
from pydantic_evals.online import OnlineEvalConfig, wait_for_evaluations


@dataclass
class IsNonEmpty(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return bool(ctx.output)


results_log: list[str] = []


async def log_sink(
    results: Sequence[EvaluationResult],
    failures: Sequence[EvaluatorFailure],
    context: EvaluatorContext,
) -> None:
    for r in results:
        results_log.append(f'{r.name}={r.value}')


my_eval = OnlineEvalConfig(
    default_sink=log_sink,
    default_sample_rate=1.0,
    metadata={'service': 'my-app'},
)


@my_eval.evaluate(IsNonEmpty())
async def my_function(query: str) -> str:
    return f'Answer to: {query}'


async def main():
    result = await my_function('What is 2+2?')
    print(result)
    #> Answer to: What is 2+2?
    await wait_for_evaluations()
    print(results_log)
    #> ['IsNonEmpty=True']


asyncio.run(main())
```

### Sinks

OTel event emission is the default observability surface for online evaluation (see [Default OTel event emission](#default-otel-event-emission)). Sinks are for _additional_ handling in Python code -- in-memory test capture, alerting, fan-out to non-OTel destinations, or bespoke aggregation. [`EvaluationSink`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.EvaluationSink) is the protocol; multiple sinks can be registered on a single config.

The built-in [`CallbackSink`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.CallbackSink) wraps any callable (sync or async) that accepts results, failures, and context. You can also pass a bare callable wherever a sink is expected -- it's auto-wrapped in a [`CallbackSink`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.CallbackSink).

For custom sinks, implement the [`EvaluationSink`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.EvaluationSink) protocol. Each `submit()` call receives a [`SinkPayload`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.SinkPayload) bundling the results, failures, context, span reference, and target from one or more evaluators that ran for a given function call:

```python
from pydantic_evals.online import SinkPayload


class PrintSink:
    """Prints evaluation results to stdout."""

    async def submit(self, payload: SinkPayload) -> None:
        for r in payload.results:
            version = f' ({r.evaluator_version})' if r.evaluator_version else ''
            print(f'  [{payload.target}] {r.name}{version}: {r.value}')
        for f in payload.failures:
            version = f' ({f.evaluator_version})' if f.evaluator_version else ''
            print(f'  [{payload.target}] FAILED {f.name}{version}: {f.error_message}')
```

`payload.results` and `payload.failures` may cover one or more evaluators from a single function call -- when multiple evaluators share a sink, their results are batched into a single `submit()` call. Each result carries its own attribution (name, `evaluator_version` on [`EvaluationResult`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluationResult) and [`EvaluatorFailure`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorFailure), and source spec), so sinks can separate them downstream; see [Evaluator Versioning](#evaluator-versioning). The `payload.target` identifies the function or agent being evaluated (see [Target](#target)).

### Default OTel event emission

Every dispatched evaluator emits one `gen_ai.evaluation.result` OTel log event per [`EvaluationResult`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluationResult) or [`EvaluatorFailure`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorFailure), unconditionally -- no sink registration required. Events are parented to the span that produced them, so they appear nested under the original function call in the trace. If no OTel SDK is configured in the process, emission is a cheap no-op.

Each event has `event.name = 'gen_ai.evaluation.result'` and a short human-readable body (e.g. `evaluation: accuracy=0.87`, or `evaluation: accuracy failed: <error>`). Emission follows the [OpenTelemetry GenAI evaluation semconv](https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-events/#event-gen_aievaluationresult), with these attributes:

-   `gen_ai.evaluation.name` -- the evaluator class name when `evaluate()` returns a scalar, or the mapping key when it returns `{'accuracy': ..., 'score': ...}`. Source: [`EvaluationResult.name`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluationResult) / [`EvaluatorFailure.name`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorFailure).
-   `gen_ai.evaluation.score.value` -- populated for `bool` (`True`→`1.0`, `False`→`0.0`) and numeric returns. Omitted for `str` returns.
-   `gen_ai.evaluation.score.label` -- populated for `bool` (`True`→`'pass'`, `False`→`'fail'`) and `str` returns (used directly as the label). Omitted for numeric returns.
-   `gen_ai.evaluation.explanation` -- `EvaluationResult.reason` on success or `EvaluatorFailure.error_message` on failure. Omitted when absent. Set via `reason=...` when constructing an `EvaluationResult` inside a custom evaluator.
-   `error.type` (failure events only) -- the exception class name (e.g. `'ValueError'`) when the failure was built from a caught exception; falls back to `'pydantic_evals.EvaluatorFailure'` for `EvaluatorFailure` instances constructed without it. Absent on successful evaluations. Source: `EvaluatorFailure.error_type`.
-   `gen_ai.evaluation.target` -- `@evaluate(target=...)` or agent `name`. See [Target](#target).
-   `gen_ai.evaluation.evaluator.version` -- `Evaluator.evaluator_version` class attribute; omitted when the class doesn't set it. See [Evaluator Versioning](#evaluator-versioning).
-   `gen_ai.evaluation.evaluator.source` -- JSON-serialized [`EvaluatorSpec`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorSpec) identifying the evaluator class and its constructor arguments, so downstream queries can group by evaluator identity without relying on `name` alone (two different `LLMJudge(rubric=...)` instances share a name but have different sources).

[OTel baggage](https://pydantic.dev/docs/logfire/reference/baggage/) entries (if any) are also attached to each event as attributes -- configurable via `include_baggage` on the config. The `gen_ai.*` and `error.type` attributes above always win on conflict with baggage.

For example, the `OutputNotEmpty` evaluator above, decorated as `@evaluate(OutputNotEmpty(), target='customer_support')` and returning `True` for a given call, emits one event with:

-   `gen_ai.evaluation.name = 'OutputNotEmpty'`
-   `gen_ai.evaluation.score.value = 1.0`
-   `gen_ai.evaluation.score.label = 'pass'`
-   `gen_ai.evaluation.target = 'customer_support'`
-   `gen_ai.evaluation.evaluator.source = '{"name":"OutputNotEmpty","arguments":null}'`

An evaluator with constructor arguments gets those rendered into `source` -- e.g. [`LLMJudge(rubric='Is the response helpful?')`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.LLMJudge) emits `gen_ai.evaluation.evaluator.source = '{"name":"LLMJudge","arguments":["Is the response helpful?"]}'`, so two `LLMJudge` instances with different rubrics remain distinguishable downstream.

Attributes under `gen_ai.evaluation.evaluator.*` are pydantic-evals extensions -- they aren't in the current OTel GenAI semconv, and their names may change to align with future semconv additions.

To disable the default emission (e.g. in a test harness that only wants to assert on a custom sink), set `emit_otel_events=False` on the config:

```python
from pydantic_evals.online import OnlineEvalConfig

config = OnlineEvalConfig(emit_otel_events=False)
```

#### Evaluator Versioning

Set `evaluator_version` as a class attribute on an [`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator) subclass to stamp every result it emits with a version string -- surfaced as `gen_ai.evaluation.evaluator.version` on emitted events and as `evaluator_version` on each [`EvaluationResult`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluationResult) and [`EvaluatorFailure`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorFailure). This lets trend lines and dashboards filter out results produced by retired evaluator versions without deleting historical rows -- useful when you change an LLM judge's prompt or rework a heuristic in a way that invalidates prior scores:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class ToneCheck(Evaluator):
    evaluator_version = 'v2'  # bumped after prompt rewrite

    def evaluate(self, ctx: EvaluatorContext) -> str:
        return 'neutral'
```

The version applies to all results the evaluator produces (so one evaluator class maps to one version, even when the evaluator returns a mapping of named results).

## Sampling

Control evaluation frequency with per-evaluator sample rates to balance quality monitoring against cost.

Note

Sampling is decided **before** the decorated function runs. When no evaluators are sampled for a given call, the function executes without any additional instrumentation overhead (no logfire span or span tree capture).

### Static Sample Rates

A `sample_rate` between 0.0 and 1.0 sets the probability of evaluating each call:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext
from pydantic_evals.online import OnlineEvaluator


@dataclass
class QuickCheck(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return bool(ctx.output)


# Run on every request
always = OnlineEvaluator(evaluator=QuickCheck(), sample_rate=1.0)

# Run on 10% of requests
sometimes = OnlineEvaluator(evaluator=QuickCheck(), sample_rate=0.1)

# Never run (effectively disabled)
never = OnlineEvaluator(evaluator=QuickCheck(), sample_rate=0.0)
```

### Dynamic Sample Rates

Pass a callable to enable runtime-configurable or input-dependent sampling. The callable receives a [`SamplingContext`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.SamplingContext) with the evaluator instance, function inputs, config metadata, and a per-call random seed, and returns a `float` (probability) or `bool` (always/never):

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext
from pydantic_evals.online import OnlineEvaluator, SamplingContext


def get_current_rate(ctx: SamplingContext) -> float:
    return 0.5


@dataclass
class QuickCheck(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return bool(ctx.output)


dynamic = OnlineEvaluator(evaluator=QuickCheck(), sample_rate=get_current_rate)
```

This enables integration with feature flags, managed variables, or configuration systems -- for example, you could replace `get_current_rate` with a function that reads from a remote config service (such as [Logfire managed variables](https://logfire.pydantic.dev/docs/reference/advanced/managed-variables/)) at runtime, allowing you to change the probability without redeploying the application.

You can also use the [`SamplingContext`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.SamplingContext) to make sampling decisions based on the function inputs:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext
from pydantic_evals.online import OnlineEvaluator, SamplingContext


def sample_long_inputs(ctx: SamplingContext) -> bool:
    """Only evaluate calls with long input text."""
    return len(str(ctx.inputs.get('text', ''))) > 100


@dataclass
class QualityCheck(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return len(str(ctx.output)) > 10


expensive = OnlineEvaluator(evaluator=QualityCheck(), sample_rate=sample_long_inputs)
```

### Correlated Sampling

By default, each evaluator samples independently. With three evaluators each at 10%, roughly 27% of calls incur evaluation overhead (`1 − 0.9³`). If you'd prefer that the _same_ 10% of calls run _all_ evaluators, set `sampling_mode='correlated'`:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext
from pydantic_evals.online import OnlineEvalConfig, OnlineEvaluator


@dataclass
class CheckA(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return True


@dataclass
class CheckB(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return True


config = OnlineEvalConfig(
    default_sink=lambda results, failures, ctx: None,
    sampling_mode='correlated',
)

# Both run on the same ~10% of calls
check_a = OnlineEvaluator(evaluator=CheckA(), sample_rate=0.1)
check_b = OnlineEvaluator(evaluator=CheckB(), sample_rate=0.1)
```

In correlated mode, a single random `call_seed` (uniformly distributed between 0.0 and 1.0) is generated per function call and shared across all evaluators. An evaluator runs when `call_seed < sample_rate`, so lower-rate evaluators' calls are always a subset of higher-rate ones, and the total overhead probability equals the maximum rate rather than accumulating.

The `call_seed` is also available on [`SamplingContext`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.SamplingContext) for custom `sample_rate` callables that want to implement their own correlated logic regardless of mode.

### Disabling Evaluation

Use [`disable_evaluation()`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.disable_evaluation) to suppress all online evaluation in a scope. This may be useful in tests:

```python
import asyncio
from collections.abc import Sequence
from dataclasses import dataclass

from pydantic_evals.evaluators import (
    EvaluationResult,
    Evaluator,
    EvaluatorContext,
    EvaluatorFailure,
)
from pydantic_evals.online import (
    OnlineEvalConfig,
    disable_evaluation,
    wait_for_evaluations,
)


@dataclass
class OutputCheck(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return bool(ctx.output)


results_log: list[str] = []


async def log_sink(
    results: Sequence[EvaluationResult],
    failures: Sequence[EvaluatorFailure],
    context: EvaluatorContext,
) -> None:
    for r in results:
        results_log.append(f'{r.name}={r.value}')


config = OnlineEvalConfig(default_sink=log_sink)


@config.evaluate(OutputCheck())
async def my_function(x: int) -> int:
    return x * 2


async def main():
    # Evaluators suppressed inside this block
    with disable_evaluation():
        result = await my_function(21)
        print(result)
        #> 42

    await wait_for_evaluations()
    print(f'evaluations run: {len(results_log)}')
    #> evaluations run: 0

    # Evaluators resume outside the block
    await my_function(21)
    await wait_for_evaluations()
    print(f'evaluations run: {len(results_log)}')
    #> evaluations run: 1


asyncio.run(main())
```

## Conditional Evaluation

For cost control, you can run expensive evaluation logic conditionally within a single custom evaluator. Return a mapping where you only include keys for checks that have run -- checks you don't want to perform can simply be omitted from the results:

```python
import asyncio
from collections.abc import Sequence
from dataclasses import dataclass

from pydantic_evals.evaluators import (
    EvaluationResult,
    Evaluator,
    EvaluatorContext,
    EvaluatorFailure,
)
from pydantic_evals.online import (
    OnlineEvalConfig,
    wait_for_evaluations,
)

results_log: list[str] = []


async def log_sink(
    results: Sequence[EvaluationResult],
    failures: Sequence[EvaluatorFailure],
    context: EvaluatorContext,
) -> None:
    for r in results:
        results_log.append(f'{r.name}={r.value}')


@dataclass
class ConditionalAnalysis(Evaluator):
    """Runs a cheap check on every call, and an expensive check only on long outputs."""

    def evaluate(self, ctx: EvaluatorContext) -> dict[str, float | bool]:
        output = str(ctx.output)
        results: dict[str, float | bool] = {
            'has_content': len(output) > 0,
        }
        # Only run the expensive analysis on long outputs
        if len(output) > 20:
            # pretend the following line is expensive..
            results['detail_score'] = len(output) / 100.0
        return results


config = OnlineEvalConfig(default_sink=log_sink)


@config.evaluate(ConditionalAnalysis())
async def generate(prompt: str) -> str:
    return f'Response to: {prompt}'


async def main():
    await generate('hi')  # short output -- only cheap check runs
    await wait_for_evaluations()
    print(results_log)
    #> ['has_content=True']

    results_log.clear()
    await generate('tell me a long story about dragons')  # long output -- both checks run
    await wait_for_evaluations()
    print(sorted(results_log))
    #> ['detail_score=0.47', 'has_content=True']


asyncio.run(main())
```

This pattern lets you combine cheap and expensive checks in one evaluator, avoiding unnecessary work when conditions aren't met.

## Sync Function Support

The [`evaluate()`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.evaluate) decorator works with both async and sync functions:

```python
import asyncio
from collections.abc import Sequence
from dataclasses import dataclass

from pydantic_evals.evaluators import (
    EvaluationResult,
    Evaluator,
    EvaluatorContext,
    EvaluatorFailure,
)
from pydantic_evals.online import OnlineEvalConfig, wait_for_evaluations

results_log: list[str] = []


async def log_sink(
    results: Sequence[EvaluationResult],
    failures: Sequence[EvaluatorFailure],
    context: EvaluatorContext,
) -> None:
    for r in results:
        results_log.append(f'{r.name}={r.value}')


@dataclass
class OutputCheck(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return bool(ctx.output)


config = OnlineEvalConfig(default_sink=log_sink)


@config.evaluate(OutputCheck())
def process(text: str) -> str:
    return text.upper()


async def main():
    # Sync decorated functions work from async contexts too
    result = process('hello')
    print(result)
    #> HELLO

    await wait_for_evaluations()
    print(results_log)
    #> ['OutputCheck=True']


asyncio.run(main())
```

Sync decorated functions work from both sync and async contexts. When a running event loop is available, evaluators are dispatched as background tasks on that loop. Otherwise, a background thread with its own event loop is spawned.

## Per-Evaluator Sink Overrides

Individual evaluators can override the config's default sink. This is useful if different evaluators need to send results to different destinations:

```python
import asyncio
from collections.abc import Sequence
from dataclasses import dataclass

from pydantic_evals.evaluators import (
    EvaluationResult,
    Evaluator,
    EvaluatorContext,
    EvaluatorFailure,
)
from pydantic_evals.online import (
    OnlineEvalConfig,
    OnlineEvaluator,
    wait_for_evaluations,
)

default_log: list[str] = []
special_log: list[str] = []


async def default_sink(
    results: Sequence[EvaluationResult],
    failures: Sequence[EvaluatorFailure],
    context: EvaluatorContext,
) -> None:
    for r in results:
        default_log.append(r.name)


async def special_sink(
    results: Sequence[EvaluationResult],
    failures: Sequence[EvaluatorFailure],
    context: EvaluatorContext,
) -> None:
    for r in results:
        special_log.append(r.name)


@dataclass
class FastCheck(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return True


@dataclass
class ImportantCheck(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return True


config = OnlineEvalConfig(default_sink=default_sink)


@config.evaluate(
    FastCheck(),  # uses default sink
    OnlineEvaluator(evaluator=ImportantCheck(), sink=special_sink),  # uses special sink
)
async def my_function(x: int) -> int:
    return x


async def main():
    await my_function(42)
    await wait_for_evaluations()

    print(f'default: {default_log}')
    #> default: ['FastCheck']
    print(f'special: {special_log}')
    #> special: ['ImportantCheck']


asyncio.run(main())
```

## Re-running Evaluators from Stored Data

A key capability of online evaluation is re-running evaluators without re-executing the original function. This is useful when you want to evaluate historical data with updated rubrics, or run additional evaluators on existing traces.

### run\_evaluators

[`run_evaluators()`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.run_evaluators) runs a list of evaluators against an [`EvaluatorContext`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorContext) and returns the results:

```python
import asyncio
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext
from pydantic_evals.online import run_evaluators
from pydantic_evals.otel.span_tree import SpanTree


@dataclass
class LengthCheck(Evaluator):
    min_length: int = 10

    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return len(str(ctx.output)) >= self.min_length


@dataclass
class HasKeyword(Evaluator):
    keyword: str = 'hello'

    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return self.keyword in str(ctx.output).lower()


async def main():
    # Build a context manually (in practice, you'd get this from stored data)
    # Normally EvaluatorContext would not be manually constructed --
    # it is built automatically by the @evaluate decorator or OnlineEvaluation capability,
    # or from an EvaluatorContextSource (see below).
    ctx = EvaluatorContext(
        name='example',
        inputs={'query': 'greet the user'},
        output='Hello! How can I help you today?',
        expected_output=None,
        metadata=None,
        duration=0.5,
        _span_tree=SpanTree(),
        attributes={},
        metrics={},
    )

    results, failures = await run_evaluators(
        [LengthCheck(min_length=10), HasKeyword(keyword='hello')],
        ctx,
    )

    for r in results:
        print(f'{r.name}: {r.value}')
        #> LengthCheck: True
        #> HasKeyword: True
    print(f'failures: {len(failures)}')
    #> failures: 0


asyncio.run(main())
```

### EvaluatorContextSource Protocol

For fetching context data from external storage (like Pydantic Logfire), implement the [`EvaluatorContextSource`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.EvaluatorContextSource) protocol. It defines `fetch()` and `fetch_many()` methods that return [`EvaluatorContext`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorContext) objects from stored data:

```python
import asyncio
from collections.abc import Sequence

from pydantic_evals.evaluators import EvaluatorContext
from pydantic_evals.online import SpanReference
from pydantic_evals.otel.span_tree import SpanTree


class MyContextSource:
    """Example source that fetches context from a hypothetical store."""

    def __init__(self, store: dict[str, EvaluatorContext]) -> None:
        self._store = store

    async def fetch(self, span: SpanReference) -> EvaluatorContext:
        return self._store[span.span_id]

    async def fetch_many(self, spans: Sequence[SpanReference]) -> list[EvaluatorContext]:
        return [self._store[s.span_id] for s in spans]


def _make_context(
    *,
    inputs: object = None,
    output: object = None,
    metadata: object = None,
    duration: float = 0.0,
) -> EvaluatorContext:
    # Normally EvaluatorContext would not be manually constructed --
    # it is built automatically by the @evaluate decorator or OnlineEvaluation capability.
    return EvaluatorContext(
        name=None,
        inputs=inputs,
        output=output,
        expected_output=None,
        metadata=metadata,
        duration=duration,
        _span_tree=SpanTree(),
        attributes={},
        metrics={},
    )


async def main():
    source = MyContextSource({
        'span_abc': _make_context(
            inputs={'query': 'What is AI?'},
            output='AI is artificial intelligence.',
            metadata={'model': 'gpt-4o'},
            duration=1.2,
        ),
        'span_def': _make_context(
            inputs={'query': 'What is ML?'},
            output='ML is machine learning.',
            metadata={'model': 'gpt-4o'},
            duration=0.8,
        ),
    })

    # Fetch a single context
    ctx = await source.fetch(SpanReference(trace_id='t1', span_id='span_abc'))
    print(f'inputs: {ctx.inputs}')
    #> inputs: {'query': 'What is AI?'}
    print(f'output: {ctx.output}')
    #> output: AI is artificial intelligence.

    # Fetch multiple contexts in a batch
    spans = [
        SpanReference(trace_id='t1', span_id='span_abc'),
        SpanReference(trace_id='t1', span_id='span_def'),
    ]
    contexts = await source.fetch_many(spans)
    print(f'batch size: {len(contexts)}')
    #> batch size: 2


asyncio.run(main())
```

## Concurrency Control

Each [`OnlineEvaluator`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvaluator) has a `max_concurrency` limit (default: 10). When the limit is reached, new evaluation requests for that evaluator are **dropped** (not queued). This prevents expensive evaluators from consuming unbounded resources:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext
from pydantic_evals.online import OnlineEvaluator


@dataclass
class ExpensiveCheck(Evaluator):
    async def evaluate(self, ctx: EvaluatorContext) -> bool:
        # Imagine this makes a slow call to an LLM
        return True


# Allow at most 3 concurrent evaluations
limited = OnlineEvaluator(
    evaluator=ExpensiveCheck(),
    sample_rate=0.1,
    max_concurrency=3,
)
```

To react to dropped evaluations, set `on_max_concurrency` on the [`OnlineEvaluator`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvaluator) or as a default on [`OnlineEvalConfig`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvalConfig). The callback receives the [`EvaluatorContext`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorContext) that would have been evaluated, and can be sync or async:

```python
import warnings
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext
from pydantic_evals.online import OnlineEvalConfig, OnlineEvaluator


@dataclass
class ExpensiveCheck(Evaluator):
    async def evaluate(self, ctx: EvaluatorContext) -> bool:
        return True


def warn_on_drop(ctx: EvaluatorContext) -> None:
    warnings.warn('Evaluation dropped due to max concurrency', stacklevel=1)


# Per-evaluator handler
limited = OnlineEvaluator(
    evaluator=ExpensiveCheck(),
    max_concurrency=3,
    on_max_concurrency=warn_on_drop,
)

# Or set a global default for all evaluators in a config
config = OnlineEvalConfig(on_max_concurrency=warn_on_drop)
```

Note

If neither the per-evaluator nor the config-level `on_max_concurrency` is set, dropped evaluations are silently ignored.

## Error Handling

There are two types of error handling:

-   **`on_sampling_error`**: Called synchronously when a `sample_rate` callable raises. Receives the exception and the [`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator). Must be sync (not async). If set, the evaluator is skipped. If not set, the exception **propagates to the caller**.
-   **`on_error`**: Called when an exception occurs in a `sink` or `on_max_concurrency` callback. Receives the exception, [`EvaluatorContext`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorContext), [`Evaluator`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.Evaluator), and a [`OnErrorLocation`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnErrorLocation) string. Can be sync or async. If not set, exceptions are **silently suppressed**. The `'sink'` location is broad -- it covers both custom sink failures and the rarer default OTel event emission failures, so handlers that branch on location should treat `'sink'` as "result delivery went wrong".

Set these on [`OnlineEvalConfig`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvalConfig) for global defaults, or on [`OnlineEvaluator`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvaluator) to override per-evaluator:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext
from pydantic_evals.online import OnErrorLocation, OnlineEvalConfig, OnlineEvaluator


def log_errors(
    exc: Exception,
    ctx: EvaluatorContext,
    evaluator: Evaluator,
    location: OnErrorLocation,
) -> None:
    print(f'[{location}] {type(exc).__name__}: {exc}')


@dataclass
class MyCheck(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return True


# Global default -- applies to all evaluators in this config
config = OnlineEvalConfig(
    default_sink=lambda results, failures, context: None,
    on_error=log_errors,
)

# Per-evaluator override
custom = OnlineEvaluator(evaluator=MyCheck(), on_error=log_errors)
```

Key behaviors:

-   **Evaluator exceptions** are handled by converting them to [`EvaluatorFailure`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorFailure) objects passed to sinks -- they do not go through `on_error`.
-   **One evaluator's error doesn't affect siblings** -- each evaluator runs in its own task with isolated error handling.
-   **One sink's error doesn't affect other sinks** -- each sink submission is wrapped individually.
-   **If `on_error` itself raises**, the exception is silently suppressed to protect sibling evaluators.
-   **If no `on_error` is set**, exceptions are silently suppressed -- this is the safe default.

## Agent Integration

The [`OnlineEvaluation`](/docs/ai/api/pydantic_evals/online_capability/#pydantic_evals.online_capability.OnlineEvaluation) capability brings online evaluation to Pydantic AI agents. Instead of decorating a function, you add the capability to your agent. As with the `@evaluate` decorator, evaluators dispatch in the background and results are emitted as OTel events by default -- no sink registration required:

```python
from dataclasses import dataclass

from pydantic_ai import Agent
from pydantic_evals.evaluators import Evaluator, EvaluatorContext
from pydantic_evals.online_capability import OnlineEvaluation


@dataclass
class OutputNotEmpty(Evaluator):
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return bool(ctx.output)


agent = Agent(
    'openai:gpt-5.2',
    name='assistant',
    capabilities=[OnlineEvaluation(evaluators=[OutputNotEmpty()])],
)
```

The target name written to each emitted event is the agent's own `name` attribute, so events from `agent = Agent(..., name='assistant')` land under `gen_ai.evaluation.target = 'assistant'`. If the agent has no name, the target falls back to the literal string `'agent'`.

After each completed agent run, the capability:

1.  Samples evaluators based on their `sample_rate` configuration
2.  Builds an [`EvaluatorContext`](/docs/ai/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EvaluatorContext) from the run result (output, prompt, token usage, duration, span tree) -- `context.name` is populated with the agent run's `run_id`
3.  Dispatches evaluators asynchronously in the background
4.  Returns control to the caller without waiting for evaluators to finish

To attach additional sinks or override sampling defaults, pass an [`OnlineEvalConfig`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.OnlineEvalConfig) -- same as with the `@evaluate` decorator: `OnlineEvaluation(evaluators=[...], config=OnlineEvalConfig(default_sample_rate=0.1))`.

The capability supports all the same features as the [`@evaluate()`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.evaluate) decorator: sampling, per-evaluator sinks, concurrency control, and error handling. The `config` parameter is optional and defaults to the global [`DEFAULT_CONFIG`](/docs/ai/api/pydantic_evals/online/#pydantic_evals.online.DEFAULT_CONFIG).

Note

[`OnlineEvaluation`](/docs/ai/api/pydantic_evals/online_capability/#pydantic_evals.online_capability.OnlineEvaluation) wraps `agent.run()`, `agent.run_stream()`, and [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter) when the run reaches a final result. For streaming runs, evaluators are dispatched only after the final result is available and the surrounding context manager exits. The same delayed-dispatch behavior applies when driving an [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter) run to completion, which is generally the preferred streaming API.

## API Reference

The complete API for the `pydantic_evals.online` module is documented in the [API reference](/docs/ai/api/pydantic-evals/online).

## Next Steps

-   **[Custom Evaluators](/docs/ai/evals/evaluators/custom)** -- Write evaluators for your domain
-   **[Built-in Evaluators](/docs/ai/evals/evaluators/built-in)** -- Use ready-made evaluators
-   **[Logfire Integration](/docs/ai/evals/how-to/logfire-integration)** -- Visualize evaluation results in Logfire
-   **[Quick Start](/docs/ai/evals/quick-start)** -- Offline evaluation with [`Dataset.evaluate()`](/docs/ai/api/pydantic_evals/dataset/#pydantic_evals.dataset.Dataset.evaluate)

---

# [Agent User Interaction (AG-UI)](https://pydantic.dev/docs/ai/examples/ag-ui/)

# Agent User Interaction (AG-UI)

Example of using Pydantic AI agents with the [AG-UI Dojo](https://github.com/ag-ui-protocol/ag-ui/tree/main/apps/dojo) example app.

See the [AG-UI docs](/docs/ai/integrations/ui/ag-ui) for more information about the AG-UI integration.

Demonstrates:

-   [AG-UI](/docs/ai/integrations/ui/ag-ui)
-   [Tools](/docs/ai/tools-toolsets/tools)

## Prerequisites

-   An [OpenAI API key](https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key)

## Running the Example

With [dependencies installed and environment variables set](/docs/ai/examples/setup#usage) you will need two command line windows.

### Pydantic AI AG-UI backend

Setup your OpenAI API Key

Terminal

```bash
export OPENAI_API_KEY=<your api key>
```

Start the Pydantic AI AG-UI example backend.

-   [pip](#tab-panel-20)
-   [uv](#tab-panel-21)

Terminal

```bash
python -m pydantic_ai_examples.ag_ui
```

Terminal

```bash
uv run -m pydantic_ai_examples.ag_ui
```

### AG-UI Dojo example frontend

Next run the AG-UI Dojo example frontend.

1.  Clone the [AG-UI repository](https://github.com/ag-ui-protocol/ag-ui)
    
    Terminal
    
    ```shell
    git clone https://github.com/ag-ui-protocol/ag-ui.git
    ```
    
2.  Change into to the `ag-ui/typescript-sdk` directory
    
    Terminal
    
    ```shell
    cd ag-ui/sdks/typescript
    ```
    
3.  Run the Dojo app following the [official instructions](https://github.com/ag-ui-protocol/ag-ui/tree/main/apps/dojo#development-setup)
    
4.  Visit [http://localhost:3000/pydantic-ai](http://localhost:3000/pydantic-ai)
    
5.  Select View `Pydantic AI` from the sidebar
    

## Feature Examples

### Agentic Chat

This demonstrates a basic agent interaction including Pydantic AI server side tools and AG-UI client side tools.

If you've [run the example](#running-the-example), you can view it at [http://localhost:3000/pydantic-ai/feature/agentic\_chat](http://localhost:3000/pydantic-ai/feature/agentic_chat).

#### Agent Tools

-   `time` - Pydantic AI tool to check the current time for a time zone
-   `background` - AG-UI tool to set the background color of the client window

#### Agent Prompts

```text
What is the time in New York?
```

```text
Change the background to blue
```

A complex example which mixes both AG-UI and Pydantic AI tools:

```text
Perform the following steps, waiting for the response of each step before continuing:
1. Get the time
2. Set the background to red
3. Get the time
4. Report how long the background set took by diffing the two times
```

#### Agentic Chat - Code

agentic\_chat.py

```python
from __future__ import annotations

from datetime import datetime
from zoneinfo import ZoneInfo

from pydantic_ai import Agent
from pydantic_ai.ui.ag_ui.app import AGUIApp

agent = Agent('openai:gpt-5-mini')


@agent.tool_plain
async def current_time(timezone: str = 'UTC') -> str:
    """Get the current time in ISO format.

    Args:
        timezone: The timezone to use.

    Returns:
        The current time in ISO format string.
    """
    tz: ZoneInfo = ZoneInfo(timezone)
    return datetime.now(tz=tz).isoformat()


app = AGUIApp(agent)
```

### Agentic Generative UI

Demonstrates a long running task where the agent sends updates to the frontend to let the user know what's happening.

If you've [run the example](#running-the-example), you can view it at [http://localhost:3000/pydantic-ai/feature/agentic\_generative\_ui](http://localhost:3000/pydantic-ai/feature/agentic_generative_ui).

#### Plan Prompts

```text
Create a plan for breakfast and execute it
```

#### Agentic Generative UI - Code

agentic\_generative\_ui.py

```python
from __future__ import annotations

from textwrap import dedent
from typing import Any, Literal

from pydantic import BaseModel, Field

from ag_ui.core import EventType, StateDeltaEvent, StateSnapshotEvent
from pydantic_ai import Agent
from pydantic_ai.ui.ag_ui.app import AGUIApp

StepStatus = Literal['pending', 'completed']


class Step(BaseModel):
    """Represents a step in a plan."""

    description: str = Field(description='The description of the step')
    status: StepStatus = Field(
        default='pending',
        description='The status of the step (e.g., pending, completed)',
    )


class Plan(BaseModel):
    """Represents a plan with multiple steps."""

    steps: list[Step] = Field(
        default_factory=list[Step], description='The steps in the plan'
    )


class JSONPatchOp(BaseModel):
    """A class representing a JSON Patch operation (RFC 6902)."""

    op: Literal['add', 'remove', 'replace', 'move', 'copy', 'test'] = Field(
        description='The operation to perform: add, remove, replace, move, copy, or test',
    )
    path: str = Field(description='JSON Pointer (RFC 6901) to the target location')
    value: Any = Field(
        default=None,
        description='The value to apply (for add, replace operations)',
    )
    from_: str | None = Field(
        default=None,
        alias='from',
        description='Source path (for move, copy operations)',
    )


agent = Agent(
    'openai:gpt-5-mini',
    instructions=dedent(
        """
        When planning use tools only, without any other messages.
        IMPORTANT:
        - Use the `create_plan` tool to set the initial state of the steps
        - Use the `update_plan_step` tool to update the status of each step
        - Do NOT repeat the plan or summarise it in a message
        - Do NOT confirm the creation or updates in a message
        - Do NOT ask the user for additional information or next steps

        Only one plan can be active at a time, so do not call the `create_plan` tool
        again until all the steps in current plan are completed.
        """
    ),
)


@agent.tool_plain
async def create_plan(steps: list[str]) -> StateSnapshotEvent:
    """Create a plan with multiple steps.

    Args:
        steps: List of step descriptions to create the plan.

    Returns:
        StateSnapshotEvent containing the initial state of the steps.
    """
    plan: Plan = Plan(
        steps=[Step(description=step) for step in steps],
    )
    return StateSnapshotEvent(
        type=EventType.STATE_SNAPSHOT,
        snapshot=plan.model_dump(),
    )


@agent.tool_plain
async def update_plan_step(
    index: int, description: str | None = None, status: StepStatus | None = None
) -> StateDeltaEvent:
    """Update the plan with new steps or changes.

    Args:
        index: The index of the step to update.
        description: The new description for the step.
        status: The new status for the step.

    Returns:
        StateDeltaEvent containing the changes made to the plan.
    """
    changes: list[JSONPatchOp] = []
    if description is not None:
        changes.append(
            JSONPatchOp(
                op='replace', path=f'/steps/{index}/description', value=description
            )
        )
    if status is not None:
        changes.append(
            JSONPatchOp(op='replace', path=f'/steps/{index}/status', value=status)
        )
    return StateDeltaEvent(
        type=EventType.STATE_DELTA,
        delta=changes,
    )


app = AGUIApp(agent)
```

### Human in the Loop

Demonstrates simple human in the loop workflow where the agent comes up with a plan and the user can approve it using checkboxes.

#### Task Planning Tools

-   `generate_task_steps` - AG-UI tool to generate and confirm steps

#### Task Planning Prompt

```text
Generate a list of steps for cleaning a car for me to review
```

#### Human in the Loop - Code

human\_in\_the\_loop.py

```python
from __future__ import annotations

from textwrap import dedent

from pydantic_ai import Agent
from pydantic_ai.ui.ag_ui.app import AGUIApp

agent = Agent(
    'openai:gpt-5-mini',
    instructions=dedent(
        """
        When planning tasks use tools only, without any other messages.
        IMPORTANT:
        - Use the `generate_task_steps` tool to display the suggested steps to the user
        - Never repeat the plan, or send a message detailing steps
        - If accepted, confirm the creation of the plan and the number of selected (enabled) steps only
        - If not accepted, ask the user for more information, DO NOT use the `generate_task_steps` tool again
        """
    ),
)

app = AGUIApp(agent)
```

### Predictive State Updates

Demonstrates how to use the predictive state updates feature to update the state of the UI based on agent responses, including user interaction via user confirmation.

If you've [run the example](#running-the-example), you can view it at [http://localhost:3000/pydantic-ai/feature/predictive\_state\_updates](http://localhost:3000/pydantic-ai/feature/predictive_state_updates).

#### Story Tools

-   `write_document` - AG-UI tool to write the document to a window
-   `document_predict_state` - Pydantic AI tool that enables document state prediction for the `write_document` tool

This also shows how to use custom instructions based on shared state information.

#### Story Example

Starting document text

```markdown
Bruce was a good dog,
```

Agent prompt

```text
Help me complete my story about bruce the dog, is should be no longer than a sentence.
```

#### Predictive State Updates - Code

predictive\_state\_updates.py

```python
from __future__ import annotations

from textwrap import dedent

from pydantic import BaseModel

from ag_ui.core import CustomEvent, EventType
from pydantic_ai import Agent, RunContext
from pydantic_ai.ui import StateDeps
from pydantic_ai.ui.ag_ui.app import AGUIApp


class DocumentState(BaseModel):
    """State for the document being written."""

    document: str = ''


agent = Agent('openai:gpt-5-mini', deps_type=StateDeps[DocumentState])


# Tools which return AG-UI events will be sent to the client as part of the
# event stream, single events and iterables of events are supported.
@agent.tool_plain
async def document_predict_state() -> list[CustomEvent]:
    """Enable document state prediction.

    Returns:
        CustomEvent containing the event to enable state prediction.
    """
    return [
        CustomEvent(
            type=EventType.CUSTOM,
            name='PredictState',
            value=[
                {
                    'state_key': 'document',
                    'tool': 'write_document',
                    'tool_argument': 'document',
                },
            ],
        ),
    ]


@agent.instructions()
async def story_instructions(ctx: RunContext[StateDeps[DocumentState]]) -> str:
    """Provide instructions for writing document if present.

    Args:
        ctx: The run context containing document state information.

    Returns:
        Instructions string for the document writing agent.
    """
    return dedent(
        f"""You are a helpful assistant for writing documents.

        Before you start writing, you MUST call the `document_predict_state`
        tool to enable state prediction.

        To present the document to the user for review, you MUST use the
        `write_document` tool.

        When you have written the document, DO NOT repeat it as a message.
        If accepted briefly summarize the changes you made, 2 sentences
        max, otherwise ask the user to clarify what they want to change.

        This is the current document:

        {ctx.deps.state.document}
        """
    )


app = AGUIApp(agent, deps=StateDeps(DocumentState()))
```

### Shared State

Demonstrates how to use the shared state between the UI and the agent.

State sent to the agent is detected by a function based instruction. This then validates the data using a custom pydantic model before using to create the instructions for the agent to follow and send to the client using a AG-UI tool.

If you've [run the example](#running-the-example), you can view it at [http://localhost:3000/pydantic-ai/feature/shared\_state](http://localhost:3000/pydantic-ai/feature/shared_state).

#### Recipe Tools

-   `display_recipe` - AG-UI tool to display the recipe in a graphical format

#### Recipe Example

1.  Customise the basic settings of your recipe
2.  Click `Improve with AI`

#### Shared State - Code

shared\_state.py

```python
from __future__ import annotations

from enum import Enum
from textwrap import dedent

from pydantic import BaseModel, Field

from ag_ui.core import EventType, StateSnapshotEvent
from pydantic_ai import Agent, RunContext
from pydantic_ai.ui import StateDeps
from pydantic_ai.ui.ag_ui.app import AGUIApp


class SkillLevel(str, Enum):
    """The level of skill required for the recipe."""

    BEGINNER = 'Beginner'
    INTERMEDIATE = 'Intermediate'
    ADVANCED = 'Advanced'


class SpecialPreferences(str, Enum):
    """Special preferences for the recipe."""

    HIGH_PROTEIN = 'High Protein'
    LOW_CARB = 'Low Carb'
    SPICY = 'Spicy'
    BUDGET_FRIENDLY = 'Budget-Friendly'
    ONE_POT_MEAL = 'One-Pot Meal'
    VEGETARIAN = 'Vegetarian'
    VEGAN = 'Vegan'


class CookingTime(str, Enum):
    """The cooking time of the recipe."""

    FIVE_MIN = '5 min'
    FIFTEEN_MIN = '15 min'
    THIRTY_MIN = '30 min'
    FORTY_FIVE_MIN = '45 min'
    SIXTY_PLUS_MIN = '60+ min'


class Ingredient(BaseModel):
    """A class representing an ingredient in a recipe."""

    icon: str = Field(
        default='ingredient',
        description="The icon emoji (not emoji code like '\x1f35e', but the actual emoji like 🥕) of the ingredient",
    )
    name: str
    amount: str


class Recipe(BaseModel):
    """A class representing a recipe."""

    skill_level: SkillLevel = Field(
        default=SkillLevel.BEGINNER,
        description='The skill level required for the recipe',
    )
    special_preferences: list[SpecialPreferences] = Field(
        default_factory=list[SpecialPreferences],
        description='Any special preferences for the recipe',
    )
    cooking_time: CookingTime = Field(
        default=CookingTime.FIVE_MIN, description='The cooking time of the recipe'
    )
    ingredients: list[Ingredient] = Field(
        default_factory=list[Ingredient],
        description='Ingredients for the recipe',
    )
    instructions: list[str] = Field(
        default_factory=list[str], description='Instructions for the recipe'
    )


class RecipeSnapshot(BaseModel):
    """A class representing the state of the recipe."""

    recipe: Recipe = Field(
        default_factory=Recipe, description='The current state of the recipe'
    )


agent = Agent('openai:gpt-5-mini', deps_type=StateDeps[RecipeSnapshot])


@agent.tool_plain
async def display_recipe(recipe: Recipe) -> StateSnapshotEvent:
    """Display the recipe to the user.

    Args:
        recipe: The recipe to display.

    Returns:
        StateSnapshotEvent containing the recipe snapshot.
    """
    return StateSnapshotEvent(
        type=EventType.STATE_SNAPSHOT,
        snapshot={'recipe': recipe},
    )


@agent.instructions
async def recipe_instructions(ctx: RunContext[StateDeps[RecipeSnapshot]]) -> str:
    """Instructions for the recipe generation agent.

    Args:
        ctx: The run context containing recipe state information.

    Returns:
        Instructions string for the recipe generation agent.
    """
    return dedent(
        f"""
        You are a helpful assistant for creating recipes.

        IMPORTANT:
        - Create a complete recipe using the existing ingredients
        - Append new ingredients to the existing ones
        - Use the `display_recipe` tool to present the recipe to the user
        - Do NOT repeat the recipe in the message, use the tool instead
        - Do NOT run the `display_recipe` tool multiple times in a row

        Once you have created the updated recipe and displayed it to the user,
        summarise the changes in one sentence, don't describe the recipe in
        detail or send it as a message to the user.

        The current state of the recipe is:

        {ctx.deps.state.recipe.model_dump_json(indent=2)}
        """,
    )


app = AGUIApp(agent, deps=StateDeps(RecipeSnapshot()))
```

### Tool Based Generative UI

Demonstrates customised rendering for tool output with used confirmation.

If you've [run the example](#running-the-example), you can view it at [http://localhost:3000/pydantic-ai/feature/tool\_based\_generative\_ui](http://localhost:3000/pydantic-ai/feature/tool_based_generative_ui).

#### Haiku Tools

-   `generate_haiku` - AG-UI tool to display a haiku in English and Japanese

#### Haiku Prompt

```text
Generate a haiku about formula 1
```

#### Tool Based Generative UI - Code

tool\_based\_generative\_ui.py

```python
from __future__ import annotations

from pydantic_ai import Agent
from pydantic_ai.ui.ag_ui.app import AGUIApp

agent = Agent('openai:gpt-5-mini')
app = AGUIApp(agent)
```

---

# [Flight Booking](https://pydantic.dev/docs/ai/examples/complex-workflows/flight-booking/)

# Flight Booking

Example of a multi-agent flow where one agent delegates work to another, then hands off control to a third agent.

Demonstrates:

-   [agent delegation](/docs/ai/guides/multi-agent-applications#agent-delegation)
-   [programmatic agent hand-off](/docs/ai/guides/multi-agent-applications#programmatic-agent-hand-off)
-   [usage limits](/docs/ai/core-concepts/agent#usage-limits)

In this scenario, a group of agents work together to find the best flight for a user.

The control flow for this example can be summarised as follows:

```mermaid
graph TD
  START --> search_agent("search agent")
  search_agent --> extraction_agent("extraction agent")
  extraction_agent --> search_agent
  search_agent --> human_confirm("human confirm")
  human_confirm --> search_agent
  search_agent --> FAILED
  human_confirm --> find_seat_function("find seat function")
  find_seat_function --> human_seat_choice("human seat choice")
  human_seat_choice --> find_seat_agent("find seat agent")
  find_seat_agent --> find_seat_function
  find_seat_function --> buy_flights("buy flights")
  buy_flights --> SUCCESS
```

## Running the Example

With [dependencies installed and environment variables set](/docs/ai/examples/setup#usage), run:

-   [pip](#tab-panel-144)
-   [uv](#tab-panel-145)

Terminal

```bash
python -m pydantic_ai_examples.flight_booking
```

Terminal

```bash
uv run -m pydantic_ai_examples.flight_booking
```

## Example Code

flight\_booking.py

```python
import datetime
from dataclasses import dataclass
from typing import Literal

import logfire
from pydantic import BaseModel, Field
from rich.prompt import Prompt

from pydantic_ai import (
    Agent,
    ModelMessage,
    ModelRetry,
    RunContext,
    RunUsage,
    UsageLimits,
)

# 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured
logfire.configure(send_to_logfire='if-token-present')
logfire.instrument_pydantic_ai()


class FlightDetails(BaseModel):
    """Details of the most suitable flight."""

    flight_number: str
    price: int
    origin: str = Field(description='Three-letter airport code')
    destination: str = Field(description='Three-letter airport code')
    date: datetime.date


class NoFlightFound(BaseModel):
    """When no valid flight is found."""


@dataclass
class Deps:
    web_page_text: str
    req_origin: str
    req_destination: str
    req_date: datetime.date


# This agent is responsible for controlling the flow of the conversation.
search_agent = Agent[Deps, FlightDetails | NoFlightFound](
    'openai:gpt-5.2',
    output_type=FlightDetails | NoFlightFound,  # type: ignore
    tool_retries=4,
    output_retries=4,
    system_prompt=(
        'Your job is to find the cheapest flight for the user on the given date. '
    ),
)


# This agent is responsible for extracting flight details from web page text.
extraction_agent = Agent(
    'openai:gpt-5.2',
    output_type=list[FlightDetails],
    system_prompt='Extract all the flight details from the given text.',
)


@search_agent.tool
async def extract_flights(ctx: RunContext[Deps]) -> list[FlightDetails]:
    """Get details of all flights."""
    # we pass the usage to the search agent so requests within this agent are counted
    result = await extraction_agent.run(ctx.deps.web_page_text, usage=ctx.usage)
    logfire.info('found {flight_count} flights', flight_count=len(result.output))
    return result.output


@search_agent.output_validator
async def validate_output(
    ctx: RunContext[Deps], output: FlightDetails | NoFlightFound
) -> FlightDetails | NoFlightFound:
    """Procedural validation that the flight meets the constraints."""
    if isinstance(output, NoFlightFound):
        return output

    errors: list[str] = []
    if output.origin != ctx.deps.req_origin:
        errors.append(
            f'Flight should have origin {ctx.deps.req_origin}, not {output.origin}'
        )
    if output.destination != ctx.deps.req_destination:
        errors.append(
            f'Flight should have destination {ctx.deps.req_destination}, not {output.destination}'
        )
    if output.date != ctx.deps.req_date:
        errors.append(f'Flight should be on {ctx.deps.req_date}, not {output.date}')

    if errors:
        raise ModelRetry('\n'.join(errors))
    else:
        return output


class SeatPreference(BaseModel):
    row: int = Field(ge=1, le=30)
    seat: Literal['A', 'B', 'C', 'D', 'E', 'F']


class Failed(BaseModel):
    """Unable to extract a seat selection."""


# This agent is responsible for extracting the user's seat selection
seat_preference_agent = Agent[None, SeatPreference | Failed](
    'openai:gpt-5.2',
    output_type=SeatPreference | Failed,
    system_prompt=(
        "Extract the user's seat preference. "
        'Seats A and F are window seats. '
        'Row 1 is the front row and has extra leg room. '
        'Rows 14, and 20 also have extra leg room. '
    ),
)


# in reality this would be downloaded from a booking site,
# potentially using another agent to navigate the site
flights_web_page = """
1. Flight SFO-AK123
- Price: $350
- Origin: San Francisco International Airport (SFO)
- Destination: Ted Stevens Anchorage International Airport (ANC)
- Date: January 10, 2025

2. Flight SFO-AK456
- Price: $370
- Origin: San Francisco International Airport (SFO)
- Destination: Fairbanks International Airport (FAI)
- Date: January 10, 2025

3. Flight SFO-AK789
- Price: $400
- Origin: San Francisco International Airport (SFO)
- Destination: Juneau International Airport (JNU)
- Date: January 20, 2025

4. Flight NYC-LA101
- Price: $250
- Origin: San Francisco International Airport (SFO)
- Destination: Ted Stevens Anchorage International Airport (ANC)
- Date: January 10, 2025

5. Flight CHI-MIA202
- Price: $200
- Origin: Chicago O'Hare International Airport (ORD)
- Destination: Miami International Airport (MIA)
- Date: January 12, 2025

6. Flight BOS-SEA303
- Price: $120
- Origin: Boston Logan International Airport (BOS)
- Destination: Ted Stevens Anchorage International Airport (ANC)
- Date: January 12, 2025

7. Flight DFW-DEN404
- Price: $150
- Origin: Dallas/Fort Worth International Airport (DFW)
- Destination: Denver International Airport (DEN)
- Date: January 10, 2025

8. Flight ATL-HOU505
- Price: $180
- Origin: Hartsfield-Jackson Atlanta International Airport (ATL)
- Destination: George Bush Intercontinental Airport (IAH)
- Date: January 10, 2025
"""

# restrict how many requests this app can make to the LLM
usage_limits = UsageLimits(request_limit=15)


async def main():
    deps = Deps(
        web_page_text=flights_web_page,
        req_origin='SFO',
        req_destination='ANC',
        req_date=datetime.date(2025, 1, 10),
    )
    message_history: list[ModelMessage] | None = None
    usage: RunUsage = RunUsage()
    # run the agent until a satisfactory flight is found
    while True:
        result = await search_agent.run(
            f'Find me a flight from {deps.req_origin} to {deps.req_destination} on {deps.req_date}',
            deps=deps,
            usage=usage,
            message_history=message_history,
            usage_limits=usage_limits,
        )
        if isinstance(result.output, NoFlightFound):
            print('No flight found')
            break
        else:
            flight = result.output
            print(f'Flight found: {flight}')
            answer = Prompt.ask(
                'Do you want to buy this flight, or keep searching? (buy/*search)',
                choices=['buy', 'search', ''],
                show_choices=False,
            )
            if answer == 'buy':
                seat = await find_seat(usage)
                await buy_tickets(flight, seat)
                break
            else:
                message_history = result.all_messages(
                    output_tool_return_content='Please suggest another flight'
                )


async def find_seat(usage: RunUsage) -> SeatPreference:
    message_history: list[ModelMessage] | None = None
    while True:
        answer = Prompt.ask('What seat would you like?')

        result = await seat_preference_agent.run(
            answer,
            message_history=message_history,
            usage=usage,
            usage_limits=usage_limits,
        )
        if isinstance(result.output, SeatPreference):
            return result.output
        else:
            print('Could not understand seat preference. Please try again.')
            message_history = result.all_messages()


async def buy_tickets(flight_details: FlightDetails, seat: SeatPreference):
    print(f'Purchasing flight {flight_details=!r} {seat=!r}...')


if __name__ == '__main__':
    import asyncio

    asyncio.run(main())
```

---

# [Question Graph](https://pydantic.dev/docs/ai/examples/complex-workflows/question-graph/)

# Question Graph

Example of a graph for asking and evaluating questions.

Demonstrates:

-   [`pydantic_graph`](/docs/ai/graph/)

## Running the Example

With [dependencies installed and environment variables set](/docs/ai/examples/setup#usage), run:

-   [pip](#tab-panel-142)
-   [uv](#tab-panel-143)

Terminal

```bash
python -m pydantic_ai_examples.question_graph
```

Terminal

```bash
uv run -m pydantic_ai_examples.question_graph
```

## Example Code

question\_graph.py

```python
from __future__ import annotations as _annotations

from dataclasses import dataclass, field
from pathlib import Path

import logfire
from groq import BaseModel

from pydantic_ai import Agent, ModelMessage, format_as_xml
from pydantic_graph import (
    BaseNode,
    End,
    Graph,
    GraphRunContext,
)
from pydantic_graph.persistence.file import FileStatePersistence

# 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured
logfire.configure(send_to_logfire='if-token-present')
logfire.instrument_pydantic_ai()

ask_agent = Agent('openai:gpt-5.2', output_type=str)


@dataclass
class QuestionState:
    question: str | None = None
    ask_agent_messages: list[ModelMessage] = field(default_factory=list[ModelMessage])
    evaluate_agent_messages: list[ModelMessage] = field(
        default_factory=list[ModelMessage]
    )


@dataclass
class Ask(BaseNode[QuestionState]):
    async def run(self, ctx: GraphRunContext[QuestionState]) -> Answer:
        result = await ask_agent.run(
            'Ask a simple question with a single correct answer.',
            message_history=ctx.state.ask_agent_messages,
        )
        ctx.state.ask_agent_messages += result.all_messages()
        ctx.state.question = result.output
        return Answer(result.output)


@dataclass
class Answer(BaseNode[QuestionState]):
    question: str

    async def run(self, ctx: GraphRunContext[QuestionState]) -> Evaluate:
        answer = input(f'{self.question}: ')
        return Evaluate(answer)


class EvaluationOutput(BaseModel, use_attribute_docstrings=True):
    correct: bool
    """Whether the answer is correct."""
    comment: str
    """Comment on the answer, reprimand the user if the answer is wrong."""


evaluate_agent = Agent(
    'openai:gpt-5.2',
    output_type=EvaluationOutput,
    system_prompt='Given a question and answer, evaluate if the answer is correct.',
)


@dataclass
class Evaluate(BaseNode[QuestionState, None, str]):
    answer: str

    async def run(
        self,
        ctx: GraphRunContext[QuestionState],
    ) -> End[str] | Reprimand:
        assert ctx.state.question is not None
        result = await evaluate_agent.run(
            format_as_xml({'question': ctx.state.question, 'answer': self.answer}),
            message_history=ctx.state.evaluate_agent_messages,
        )
        ctx.state.evaluate_agent_messages += result.all_messages()
        if result.output.correct:
            return End(result.output.comment)
        else:
            return Reprimand(result.output.comment)


@dataclass
class Reprimand(BaseNode[QuestionState]):
    comment: str

    async def run(self, ctx: GraphRunContext[QuestionState]) -> Ask:
        print(f'Comment: {self.comment}')
        ctx.state.question = None
        return Ask()


question_graph = Graph(
    nodes=(Ask, Answer, Evaluate, Reprimand), state_type=QuestionState
)


async def run_as_continuous():
    state = QuestionState()
    node = Ask()
    end = await question_graph.run(node, state=state)
    print('END:', end.output)


async def run_as_cli(answer: str | None):
    persistence = FileStatePersistence(Path('question_graph.json'))
    persistence.set_graph_types(question_graph)

    if snapshot := await persistence.load_next():
        state = snapshot.state
        assert answer is not None, (
            'answer required, usage "uv run -m pydantic_ai_examples.question_graph cli <answer>"'
        )
        node = Evaluate(answer)
    else:
        state = QuestionState()
        node = Ask()
    # debug(state, node)

    async with question_graph.iter(node, state=state, persistence=persistence) as run:
        while True:
            node = await run.next()
            if isinstance(node, End):
                print('END:', node.data)
                history = await persistence.load_all()
                print('history:', '\n'.join(str(e.node) for e in history), sep='\n')
                print('Finished!')
                break
            elif isinstance(node, Answer):
                print(node.question)
                break
            # otherwise just continue


if __name__ == '__main__':
    import asyncio
    import sys

    try:
        sub_command = sys.argv[1]
        assert sub_command in ('continuous', 'cli', 'mermaid')
    except (IndexError, AssertionError):
        print(
            'Usage:\n'
            '  uv run -m pydantic_ai_examples.question_graph mermaid\n'
            'or:\n'
            '  uv run -m pydantic_ai_examples.question_graph continuous\n'
            'or:\n'
            '  uv run -m pydantic_ai_examples.question_graph cli [answer]',
            file=sys.stderr,
        )
        sys.exit(1)

    if sub_command == 'mermaid':
        print(question_graph.mermaid_code(start_node=Ask))
    elif sub_command == 'continuous':
        asyncio.run(run_as_continuous())
    else:
        a = sys.argv[2] if len(sys.argv) > 2 else None
        asyncio.run(run_as_cli(a))
```

The mermaid diagram generated in this example looks like this:

```mermaid
---
title: question_graph
---
stateDiagram-v2
  [*] --> Ask
  Ask --> Answer: ask the question
  Answer --> Evaluate: answer the question
  Evaluate --> Congratulate
  Evaluate --> Castigate
  Congratulate --> [*]: success
  Castigate --> Ask: try again
```

---

# [Bank Support](https://pydantic.dev/docs/ai/examples/conversational-agents/bank-support/)

# Bank Support

Small but complete example of using Pydantic AI to build a support agent for a bank.

Demonstrates:

-   [dynamic system prompt](/docs/ai/core-concepts/agent#system-prompts)
-   [structured `output_type`](/docs/ai/core-concepts/output#structured-output)
-   [tools](/docs/ai/tools-toolsets/tools)

## Running the Example

With [dependencies installed and environment variables set](/docs/ai/examples/setup#usage), run:

-   [pip](#tab-panel-146)
-   [uv](#tab-panel-147)

Terminal

```bash
python -m pydantic_ai_examples.bank_support
```

Terminal

```bash
uv run -m pydantic_ai_examples.bank_support
```

(or `PYDANTIC_AI_MODEL=gemini-3-flash-preview ...`)

## Example Code

bank\_support.py

```python
import sqlite3
from dataclasses import dataclass

from pydantic import BaseModel

from pydantic_ai import Agent, RunContext


@dataclass
class DatabaseConn:
    """A wrapper over the SQLite connection."""

    sqlite_conn: sqlite3.Connection

    async def customer_name(self, *, id: int) -> str | None:
        res = cur.execute('SELECT name FROM customers WHERE id=?', (id,))
        row = res.fetchone()
        if row:
            return row[0]
        return None

    async def customer_balance(self, *, id: int) -> float:
        res = cur.execute('SELECT balance FROM customers WHERE id=?', (id,))
        row = res.fetchone()
        if row:
            return row[0]
        else:
            raise ValueError('Customer not found')


@dataclass
class SupportDependencies:
    customer_id: int
    db: DatabaseConn


class SupportOutput(BaseModel):
    support_advice: str
    """Advice returned to the customer"""
    block_card: bool
    """Whether to block their card or not"""
    risk: int
    """Risk level of query"""


support_agent = Agent(
    'openai:gpt-5.2',
    deps_type=SupportDependencies,
    output_type=SupportOutput,
    instructions=(
        'You are a support agent in our bank, give the '
        'customer support and judge the risk level of their query. '
        "Reply using the customer's name."
    ),
)


@support_agent.instructions
async def add_customer_name(ctx: RunContext[SupportDependencies]) -> str:
    customer_name = await ctx.deps.db.customer_name(id=ctx.deps.customer_id)
    return f"The customer's name is {customer_name!r}"


@support_agent.tool
async def customer_balance(ctx: RunContext[SupportDependencies]) -> str:
    """Returns the customer's current account balance."""
    balance = await ctx.deps.db.customer_balance(
        id=ctx.deps.customer_id,
    )
    return f'${balance:.2f}'


if __name__ == '__main__':
    with sqlite3.connect(':memory:') as con:
        cur = con.cursor()
        cur.execute('CREATE TABLE customers(id, name, balance)')
        cur.execute("""
            INSERT INTO customers VALUES
                (123, 'John', 123.45)
        """)
        con.commit()

        deps = SupportDependencies(customer_id=123, db=DatabaseConn(sqlite_conn=con))
        result = support_agent.run_sync('What is my balance?', deps=deps)
        print(result.output)
        """
        support_advice='Hello John, your current account balance, including pending transactions, is $123.45.' block_card=False risk=1
        """

        result = support_agent.run_sync('I just lost my card!', deps=deps)
        print(result.output)
        """
        support_advice="I'm sorry to hear that, John. We are temporarily blocking your card to prevent unauthorized transactions." block_card=True risk=8
        """
```

---

# [Chat App with FastAPI](https://pydantic.dev/docs/ai/examples/conversational-agents/chat-app/)

# Chat App with FastAPI

Simple chat app example build with FastAPI.

Demonstrates:

-   [reusing chat history](/docs/ai/core-concepts/message-history)
-   [serializing messages](/docs/ai/core-concepts/message-history#accessing-messages-from-results)
-   [streaming responses](/docs/ai/core-concepts/output#streamed-results)

This demonstrates storing chat history between requests and using it to give the model context for new responses.

Most of the complex logic here is between `chat_app.py` which streams the response to the browser, and `chat_app.ts` which renders messages in the browser.

## Running the Example

With [dependencies installed and environment variables set](/docs/ai/examples/setup#usage), run:

-   [pip](#tab-panel-148)
-   [uv](#tab-panel-149)

Terminal

```bash
python -m pydantic_ai_examples.chat_app
```

Terminal

```bash
uv run -m pydantic_ai_examples.chat_app
```

Then open the app at [localhost:8000](http://localhost:8000).

![Example conversation](/docs/ai/img/chat-app-example.png)

## Example Code

Python code that runs the chat app:

chat\_app.py

```python
from __future__ import annotations as _annotations

import asyncio
import json
import sqlite3
from collections.abc import AsyncIterator, Callable
from concurrent.futures.thread import ThreadPoolExecutor
from contextlib import asynccontextmanager
from dataclasses import dataclass
from datetime import datetime, timezone
from functools import partial
from pathlib import Path
from typing import Annotated, Any, Literal, TypeVar

import fastapi
import logfire
from fastapi import Depends, Request
from fastapi.responses import FileResponse, Response, StreamingResponse
from typing_extensions import LiteralString, ParamSpec, TypedDict

from pydantic_ai import (
    Agent,
    ModelMessage,
    ModelMessagesTypeAdapter,
    ModelRequest,
    ModelResponse,
    TextPart,
    UnexpectedModelBehavior,
    UserPromptPart,
)

# 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured
logfire.configure(send_to_logfire='if-token-present')
logfire.instrument_pydantic_ai()

agent = Agent('openai:gpt-5.2')
THIS_DIR = Path(__file__).parent


@asynccontextmanager
async def lifespan(_app: fastapi.FastAPI):
    async with Database.connect() as db:
        yield {'db': db}


app = fastapi.FastAPI(lifespan=lifespan)
logfire.instrument_fastapi(app)


@app.get('/')
async def index() -> FileResponse:
    return FileResponse((THIS_DIR / 'chat_app.html'), media_type='text/html')


@app.get('/chat_app.ts')
async def main_ts() -> FileResponse:
    """Get the raw typescript code, it's compiled in the browser, forgive me."""
    return FileResponse((THIS_DIR / 'chat_app.ts'), media_type='text/plain')


async def get_db(request: Request) -> Database:
    return request.state.db


@app.get('/chat/')
async def get_chat(database: Database = Depends(get_db)) -> Response:
    msgs = await database.get_messages()
    return Response(
        b'\n'.join(json.dumps(to_chat_message(m)).encode('utf-8') for m in msgs),
        media_type='text/plain',
    )


class ChatMessage(TypedDict):
    """Format of messages sent to the browser."""

    role: Literal['user', 'model']
    timestamp: str
    content: str


def to_chat_message(m: ModelMessage) -> ChatMessage:
    first_part = m.parts[0]
    if isinstance(m, ModelRequest):
        if isinstance(first_part, UserPromptPart):
            assert isinstance(first_part.content, str)
            return {
                'role': 'user',
                'timestamp': first_part.timestamp.isoformat(),
                'content': first_part.content,
            }
    elif isinstance(m, ModelResponse):
        if isinstance(first_part, TextPart):
            return {
                'role': 'model',
                'timestamp': m.timestamp.isoformat(),
                'content': first_part.content,
            }
    raise UnexpectedModelBehavior(f'Unexpected message type for chat app: {m}')


@app.post('/chat/')
async def post_chat(
    prompt: Annotated[str, fastapi.Form()], database: Database = Depends(get_db)
) -> StreamingResponse:
    async def stream_messages():
        """Streams new line delimited JSON `Message`s to the client."""
        # stream the user prompt so that can be displayed straight away
        yield (
            json.dumps(
                {
                    'role': 'user',
                    'timestamp': datetime.now(tz=timezone.utc).isoformat(),
                    'content': prompt,
                }
            ).encode('utf-8')
            + b'\n'
        )
        # get the chat history so far to pass as context to the agent
        messages = await database.get_messages()
        # run the agent with the user prompt and the chat history
        async with agent.run_stream(prompt, message_history=messages) as result:
            async for text in result.stream_output(debounce_by=0.01):
                # text here is a `str` and the frontend wants
                # JSON encoded ModelResponse, so we create one
                m = ModelResponse(parts=[TextPart(text)], timestamp=result.timestamp())
                yield json.dumps(to_chat_message(m)).encode('utf-8') + b'\n'

        # add new messages (e.g. the user prompt and the agent response in this case) to the database
        await database.add_messages(result.new_messages_json())

    return StreamingResponse(stream_messages(), media_type='text/plain')


P = ParamSpec('P')
R = TypeVar('R')


@dataclass
class Database:
    """Rudimentary database to store chat messages in SQLite.

    The SQLite standard library package is synchronous, so we
    use a thread pool executor to run queries asynchronously.
    """

    con: sqlite3.Connection
    _loop: asyncio.AbstractEventLoop
    _executor: ThreadPoolExecutor

    @classmethod
    @asynccontextmanager
    async def connect(
        cls, file: Path = THIS_DIR / '.chat_app_messages.sqlite'
    ) -> AsyncIterator[Database]:
        with logfire.span('connect to DB'):
            loop = asyncio.get_event_loop()
            executor = ThreadPoolExecutor(max_workers=1)
            con = await loop.run_in_executor(executor, cls._connect, file)
            slf = cls(con, loop, executor)
        try:
            yield slf
        finally:
            await slf._asyncify(con.close)

    @staticmethod
    def _connect(file: Path) -> sqlite3.Connection:
        con = sqlite3.connect(str(file))
        con = logfire.instrument_sqlite3(con)
        cur = con.cursor()
        cur.execute(
            'CREATE TABLE IF NOT EXISTS messages (id INT PRIMARY KEY, message_list TEXT);'
        )
        con.commit()
        return con

    async def add_messages(self, messages: bytes):
        await self._asyncify(
            self._execute,
            'INSERT INTO messages (message_list) VALUES (?);',
            messages,
            commit=True,
        )
        await self._asyncify(self.con.commit)

    async def get_messages(self) -> list[ModelMessage]:
        c = await self._asyncify(
            self._execute, 'SELECT message_list FROM messages order by id'
        )
        rows = await self._asyncify(c.fetchall)
        messages: list[ModelMessage] = []
        for row in rows:
            messages.extend(ModelMessagesTypeAdapter.validate_json(row[0]))
        return messages

    def _execute(
        self, sql: LiteralString, *args: Any, commit: bool = False
    ) -> sqlite3.Cursor:
        cur = self.con.cursor()
        cur.execute(sql, args)
        if commit:
            self.con.commit()
        return cur

    async def _asyncify(
        self, func: Callable[P, R], *args: P.args, **kwargs: P.kwargs
    ) -> R:
        return await self._loop.run_in_executor(  # type: ignore
            self._executor,
            partial(func, **kwargs),
            *args,  # type: ignore
        )


if __name__ == '__main__':
    import uvicorn

    uvicorn.run(
        'pydantic_ai_examples.chat_app:app', reload=True, reload_dirs=[str(THIS_DIR)]
    )
```

Simple HTML page to render the app:

chat\_app.html

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Chat App</title>
  <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/css/bootstrap.min.css" rel="stylesheet">
  <style>
    main {
      max-width: 700px;
    }
    #conversation .user::before {
      content: 'You asked: ';
      font-weight: bold;
      display: block;
    }
    #conversation .model::before {
      content: 'AI Response: ';
      font-weight: bold;
      display: block;
    }
    #spinner {
      opacity: 0;
      transition: opacity 500ms ease-in;
      width: 30px;
      height: 30px;
      border: 3px solid #222;
      border-bottom-color: transparent;
      border-radius: 50%;
      animation: rotation 1s linear infinite;
    }
    @keyframes rotation {
      0% { transform: rotate(0deg); }
      100% { transform: rotate(360deg); }
    }
    #spinner.active {
      opacity: 1;
    }
  </style>
</head>
<body>
  <main class="border rounded mx-auto my-5 p-4">
    <h1>Chat App</h1>
    <p>Ask me anything...</p>
    <div id="conversation" class="px-2"></div>
    <div class="d-flex justify-content-center mb-3">
      <div id="spinner"></div>
    </div>
    <form method="post">
      <input id="prompt-input" name="prompt" class="form-control"/>
      <div class="d-flex justify-content-end">
        <button class="btn btn-primary mt-2">Send</button>
      </div>
    </form>
    <div id="error" class="d-none text-danger">
      Error occurred, check the browser developer console for more information.
    </div>
  </main>
</body>
</html>
<script src="https://cdnjs.cloudflare.com/ajax/libs/typescript/5.6.3/typescript.min.js" integrity="sha512-TvPkf1JgpB7FBf8dpYVD+2FXCy+Wn4sHIIWyTQijjOOt8Z20BAbwm0Si991w2k++oXFHp5NlOfWYud/R1sJUNA==" crossorigin="anonymous" referrerpolicy="no-referrer"></script>
<script type="module">
  // to let me write TypeScript, without adding the burden of npm we do a dirty, non-production-ready hack
  // and transpile the TypeScript code in the browser
  // this is (arguably) A neat demo trick, but not suitable for production!
  async function loadTs() {
    const response = await fetch('/chat_app.ts');
    const tsCode = await response.text();
    const jsCode = window.ts.transpile(tsCode, { target: "es2015" });
    let script = document.createElement('script');
    script.type = 'module';
    script.text = jsCode;
    document.body.appendChild(script);
  }

  loadTs().catch((e) => {
    console.error(e);
    document.getElementById('error').classList.remove('d-none');
    document.getElementById('spinner').classList.remove('active');
  });
</script>
```

TypeScript to handle rendering the messages, to keep this simple (and at the risk of offending frontend developers) the typescript code is passed to the browser as plain text and transpiled in the browser.

chat\_app.ts

```ts
// BIG FAT WARNING: to avoid the complexity of npm, this typescript is compiled in the browser
// there's currently no static type checking

import { marked } from 'https://cdnjs.cloudflare.com/ajax/libs/marked/15.0.0/lib/marked.esm.js'
const convElement = document.getElementById('conversation')

const promptInput = document.getElementById('prompt-input') as HTMLInputElement
const spinner = document.getElementById('spinner')

// stream the response and render messages as each chunk is received
// data is sent as newline-delimited JSON
async function onFetchResponse(response: Response): Promise<void> {
  let text = ''
  let decoder = new TextDecoder()
  if (response.ok) {
    const reader = response.body.getReader()
    while (true) {
      const {done, value} = await reader.read()
      if (done) {
        break
      }
      text += decoder.decode(value)
      addMessages(text)
      spinner.classList.remove('active')
    }
    addMessages(text)
    promptInput.disabled = false
    promptInput.focus()
  } else {
    const text = await response.text()
    console.error(`Unexpected response: ${response.status}`, {response, text})
    throw new Error(`Unexpected response: ${response.status}`)
  }
}

// The format of messages, this matches pydantic-ai both for brevity and understanding
// in production, you might not want to keep this format all the way to the frontend
interface Message {
  role: string
  content: string
  timestamp: string
}

// take raw response text and render messages into the `#conversation` element
// Message timestamp is assumed to be a unique identifier of a message, and is used to deduplicate
// hence you can send data about the same message multiple times, and it will be updated
// instead of creating a new message elements
function addMessages(responseText: string) {
  const lines = responseText.split('\n')
  const messages: Message[] = lines.filter(line => line.length > 1).map(j => JSON.parse(j))
  for (const message of messages) {
    // we use the timestamp as a crude element id
    const {timestamp, role, content} = message
    const id = `msg-${timestamp}`
    let msgDiv = document.getElementById(id)
    if (!msgDiv) {
      msgDiv = document.createElement('div')
      msgDiv.id = id
      msgDiv.title = `${role} at ${timestamp}`
      msgDiv.classList.add('border-top', 'pt-2', role)
      convElement.appendChild(msgDiv)
    }
    msgDiv.innerHTML = marked.parse(content)
  }
  window.scrollTo({ top: document.body.scrollHeight, behavior: 'smooth' })
}

function onError(error: any) {
  console.error(error)
  document.getElementById('error').classList.remove('d-none')
  document.getElementById('spinner').classList.remove('active')
}

async function onSubmit(e: SubmitEvent): Promise<void> {
  e.preventDefault()
  spinner.classList.add('active')
  const body = new FormData(e.target as HTMLFormElement)

  promptInput.value = ''
  promptInput.disabled = true

  const response = await fetch('/chat/', {method: 'POST', body})
  await onFetchResponse(response)
}

// call onSubmit when the form is submitted (e.g. user clicks the send button or hits Enter)
document.querySelector('form').addEventListener('submit', (e) => onSubmit(e).catch(onError))

// load messages on page load
fetch('/chat/').then(onFetchResponse).catch(onError)
```

---

# [Data Analyst](https://pydantic.dev/docs/ai/examples/data-analytics/data-analyst/)

# Data Analyst

Sometimes in an agent workflow, the agent does not need to know the exact tool output, but still needs to process the tool output in some ways. This is especially common in data analytics: the agent needs to know that the result of a query tool is a `DataFrame` with certain named columns, but not necessarily the content of every single row.

With Pydantic AI, you can use a [dependencies object](/docs/ai/core-concepts/dependencies) to store the result from one tool and use it in another tool.

In this example, we'll build an agent that analyzes the [Rotten Tomatoes movie review dataset from Cornell](https://huggingface.co/datasets/cornell-movie-review-data/rotten_tomatoes).

Demonstrates:

-   [agent dependencies](/docs/ai/core-concepts/dependencies)

## Running the Example

With [dependencies installed and environment variables set](/docs/ai/examples/setup#usage), run:

-   [pip](#tab-panel-154)
-   [uv](#tab-panel-155)

Terminal

```bash
python -m pydantic_ai_examples.data_analyst
```

Terminal

```bash
uv run -m pydantic_ai_examples.data_analyst
```

Output (debug):

> Based on my analysis of the Cornell Movie Review dataset (rotten\_tomatoes), there are **4,265 negative comments** in the training split. These are the reviews labeled as 'neg' (represented by 0 in the dataset).

## Example Code

data\_analyst.py

```python
from dataclasses import dataclass, field

import datasets
import duckdb
import pandas as pd

from pydantic_ai import Agent, ModelRetry, RunContext


@dataclass
class AnalystAgentDeps:
    output: dict[str, pd.DataFrame] = field(default_factory=dict[str, pd.DataFrame])

    def store(self, value: pd.DataFrame) -> str:
        """Store the output in deps and return the reference such as Out[1] to be used by the LLM."""
        ref = f'Out[{len(self.output) + 1}]'
        self.output[ref] = value
        return ref

    def get(self, ref: str) -> pd.DataFrame:
        if ref not in self.output:
            raise ModelRetry(
                f'Error: {ref} is not a valid variable reference. Check the previous messages and try again.'
            )
        return self.output[ref]


analyst_agent = Agent(
    'openai:gpt-5.2',
    deps_type=AnalystAgentDeps,
    instructions='You are a data analyst and your job is to analyze the data according to the user request.',
)


@analyst_agent.tool
def load_dataset(
    ctx: RunContext[AnalystAgentDeps],
    path: str,
    split: str = 'train',
) -> str:
    """Load the `split` of dataset `dataset_name` from huggingface.

    Args:
        ctx: Pydantic AI agent RunContext
        path: name of the dataset in the form of `<user_name>/<dataset_name>`
        split: load the split of the dataset (default: "train")
    """
    # begin load data from hf
    builder = datasets.load_dataset_builder(path)  # pyright: ignore[reportUnknownMemberType]
    splits: dict[str, datasets.SplitInfo] = builder.info.splits or {}
    if split not in splits:
        raise ModelRetry(
            f'{split} is not valid for dataset {path}. Valid splits are {",".join(splits.keys())}'
        )

    builder.download_and_prepare()  # pyright: ignore[reportUnknownMemberType]
    dataset = builder.as_dataset(split=split)
    assert isinstance(dataset, datasets.Dataset)
    dataframe = dataset.to_pandas()
    assert isinstance(dataframe, pd.DataFrame)
    # end load data from hf

    # store the dataframe in the deps and get a ref like "Out[1]"
    ref = ctx.deps.store(dataframe)
    # construct a summary of the loaded dataset
    output = [
        f'Loaded the dataset as `{ref}`.',
        f'Description: {dataset.info.description}'
        if dataset.info.description
        else None,
        f'Features: {dataset.info.features!r}' if dataset.info.features else None,
    ]
    return '\n'.join(filter(None, output))


@analyst_agent.tool
def run_duckdb(ctx: RunContext[AnalystAgentDeps], dataset: str, sql: str) -> str:
    """Run DuckDB SQL query on the DataFrame.

    Note that the virtual table name used in DuckDB SQL must be `dataset`.

    Args:
        ctx: Pydantic AI agent RunContext
        dataset: reference string to the DataFrame
        sql: the query to be executed using DuckDB
    """
    data = ctx.deps.get(dataset)
    result = duckdb.query_df(df=data, virtual_table_name='dataset', sql_query=sql)
    # pass the result as ref (because DuckDB SQL can select many rows, creating another huge dataframe)
    ref = ctx.deps.store(result.df())
    return f'Executed SQL, result is `{ref}`'


@analyst_agent.tool
def display(ctx: RunContext[AnalystAgentDeps], name: str) -> str:
    """Display at most 5 rows of the dataframe."""
    dataset = ctx.deps.get(name)
    return dataset.head().to_string()  # pyright: ignore[reportUnknownMemberType]


if __name__ == '__main__':
    deps = AnalystAgentDeps()
    result = analyst_agent.run_sync(
        user_prompt='Count how many negative comments are there in the dataset `cornell-movie-review-data/rotten_tomatoes`',
        deps=deps,
    )
    print(result.output)
```

## Appendix

### Choosing a Model

This example requires using a model that understands DuckDB SQL. You can check with `clai`:

Terminal

```sh
> clai -m bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0
clai - Pydantic AI CLI v0.0.1.dev920+41dd069 with bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0
clai ➤ do you understand duckdb sql?
# DuckDB SQL

Yes, I understand DuckDB SQL. DuckDB is an in-process analytical SQL database
that uses syntax similar to PostgreSQL. It specializes in analytical queries
and is designed for high-performance analysis of structured data.

Some key features of DuckDB SQL include:

 • OLAP (Online Analytical Processing) optimized
 • Columnar-vectorized query execution
 • Standard SQL support with PostgreSQL compatibility
 • Support for complex analytical queries
 • Efficient handling of CSV/Parquet/JSON files

I can help you with DuckDB SQL queries, schema design, optimization, or other
DuckDB-related questions.
```

---

# [RAG](https://pydantic.dev/docs/ai/examples/data-analytics/rag/)

# RAG

RAG search example. This demo allows you to ask question of the [logfire](https://pydantic.dev/logfire) documentation.

Demonstrates:

-   [tools](/docs/ai/tools-toolsets/tools)
-   [agent dependencies](/docs/ai/core-concepts/dependencies)
-   RAG search

This is done by creating a database containing each section of the markdown documentation, then registering the search tool with the Pydantic AI agent.

Logic for extracting sections from markdown files and a JSON file with that data is available in [this gist](https://gist.github.com/samuelcolvin/4b5bb9bb163b1122ff17e29e48c10992).

[PostgreSQL with pgvector](https://github.com/pgvector/pgvector) is used as the search database, the easiest way to download and run pgvector is using Docker:

Terminal

```bash
mkdir postgres-data
docker run --rm \
  -e POSTGRES_PASSWORD=postgres \
  -p 54320:5432 \
  -v `pwd`/postgres-data:/var/lib/postgresql/data \
  pgvector/pgvector:pg17
```

As with the [SQL gen](/docs/ai/examples/sql-gen) example, we run postgres on port `54320` to avoid conflicts with any other postgres instances you may have running. We also mount the PostgreSQL `data` directory locally to persist the data if you need to stop and restart the container.

With that running and [dependencies installed and environment variables set](/docs/ai/examples/setup#usage), we can build the search database with (**WARNING**: this requires the `OPENAI_API_KEY` env variable and will calling the OpenAI embedding API around 300 times to generate embeddings for each section of the documentation):

-   [pip](#tab-panel-156)
-   [uv](#tab-panel-157)

Terminal

```bash
python -m pydantic_ai_examples.rag build
```

Terminal

```bash
uv run -m pydantic_ai_examples.rag build
```

(Note building the database doesn't use Pydantic AI right now, instead it uses the OpenAI SDK directly.)

You can then ask the agent a question with:

-   [pip](#tab-panel-158)
-   [uv](#tab-panel-159)

Terminal

```bash
python -m pydantic_ai_examples.rag search "How do I configure logfire to work with FastAPI?"
```

Terminal

```bash
uv run -m pydantic_ai_examples.rag search "How do I configure logfire to work with FastAPI?"
```

## Example Code

rag.py

```python
from __future__ import annotations as _annotations

import asyncio
import re
import sys
import unicodedata
from contextlib import asynccontextmanager
from dataclasses import dataclass

import asyncpg
import httpx
import logfire
import pydantic_core
from anyio import create_task_group
from openai import AsyncOpenAI
from pydantic import TypeAdapter
from typing_extensions import AsyncGenerator

from pydantic_ai import Agent, RunContext

# 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured
logfire.configure(send_to_logfire='if-token-present')
logfire.instrument_asyncpg()
logfire.instrument_pydantic_ai()


@dataclass
class Deps:
    openai: AsyncOpenAI
    pool: asyncpg.Pool


agent = Agent('openai:gpt-5.2', deps_type=Deps)


@agent.tool
async def retrieve(context: RunContext[Deps], search_query: str) -> str:
    """Retrieve documentation sections based on a search query.

    Args:
        context: The call context.
        search_query: The search query.
    """
    with logfire.span(
        'create embedding for {search_query=}', search_query=search_query
    ):
        embedding = await context.deps.openai.embeddings.create(
            input=search_query,
            model='text-embedding-3-small',
        )

    assert len(embedding.data) == 1, (
        f'Expected 1 embedding, got {len(embedding.data)}, doc query: {search_query!r}'
    )
    embedding = embedding.data[0].embedding
    embedding_json = pydantic_core.to_json(embedding).decode()
    rows = await context.deps.pool.fetch(
        'SELECT url, title, content FROM doc_sections ORDER BY embedding <-> $1 LIMIT 8',
        embedding_json,
    )
    return '\n\n'.join(
        f'# {row["title"]}\nDocumentation URL:{row["url"]}\n\n{row["content"]}\n'
        for row in rows
    )


async def run_agent(question: str):
    """Entry point to run the agent and perform RAG based question answering."""
    openai = AsyncOpenAI()
    logfire.instrument_openai(openai)

    logfire.info('Asking "{question}"', question=question)

    async with database_connect(False) as pool:
        deps = Deps(openai=openai, pool=pool)
        answer = await agent.run(question, deps=deps)
    print(answer.output)


#######################################################
# The rest of this file is dedicated to preparing the #
# search database, and some utilities.                #
#######################################################

# JSON document from
# https://gist.github.com/samuelcolvin/4b5bb9bb163b1122ff17e29e48c10992
DOCS_JSON = (
    'https://gist.githubusercontent.com/'
    'samuelcolvin/4b5bb9bb163b1122ff17e29e48c10992/raw/'
    '80c5925c42f1442c24963aaf5eb1a324d47afe95/logfire_docs.json'
)


async def build_search_db():
    """Build the search database."""
    async with httpx.AsyncClient() as client:
        response = await client.get(DOCS_JSON)
        response.raise_for_status()
    sections = sections_ta.validate_json(response.content)

    openai = AsyncOpenAI()
    logfire.instrument_openai(openai)

    async with database_connect(True) as pool:
        with logfire.span('create schema'):
            async with pool.acquire() as conn:
                async with conn.transaction():
                    await conn.execute(DB_SCHEMA)

        sem = asyncio.Semaphore(10)
        async with create_task_group() as tg:
            for section in sections:
                tg.start_soon(insert_doc_section, sem, openai, pool, section)


async def insert_doc_section(
    sem: asyncio.Semaphore,
    openai: AsyncOpenAI,
    pool: asyncpg.Pool,
    section: DocsSection,
) -> None:
    async with sem:
        url = section.url()
        exists = await pool.fetchval('SELECT 1 FROM doc_sections WHERE url = $1', url)
        if exists:
            logfire.info('Skipping {url=}', url=url)
            return

        with logfire.span('create embedding for {url=}', url=url):
            embedding = await openai.embeddings.create(
                input=section.embedding_content(),
                model='text-embedding-3-small',
            )
        assert len(embedding.data) == 1, (
            f'Expected 1 embedding, got {len(embedding.data)}, doc section: {section}'
        )
        embedding = embedding.data[0].embedding
        embedding_json = pydantic_core.to_json(embedding).decode()
        await pool.execute(
            'INSERT INTO doc_sections (url, title, content, embedding) VALUES ($1, $2, $3, $4)',
            url,
            section.title,
            section.content,
            embedding_json,
        )


@dataclass
class DocsSection:
    id: int
    parent: int | None
    path: str
    level: int
    title: str
    content: str

    def url(self) -> str:
        url_path = re.sub(r'\.md$', '', self.path)
        return (
            f'https://logfire.pydantic.dev/docs/{url_path}/#{slugify(self.title, "-")}'
        )

    def embedding_content(self) -> str:
        return '\n\n'.join((f'path: {self.path}', f'title: {self.title}', self.content))


sections_ta = TypeAdapter(list[DocsSection])


# pyright: reportUnknownMemberType=false
# pyright: reportUnknownVariableType=false
@asynccontextmanager
async def database_connect(
    create_db: bool = False,
) -> AsyncGenerator[asyncpg.Pool, None]:
    server_dsn, database = (
        'postgresql://postgres:postgres@localhost:54320',
        'pydantic_ai_rag',
    )
    if create_db:
        with logfire.span('check and create DB'):
            conn = await asyncpg.connect(server_dsn)
            try:
                db_exists = await conn.fetchval(
                    'SELECT 1 FROM pg_database WHERE datname = $1', database
                )
                if not db_exists:
                    await conn.execute(f'CREATE DATABASE {database}')
            finally:
                await conn.close()

    pool = await asyncpg.create_pool(f'{server_dsn}/{database}')
    try:
        yield pool
    finally:
        await pool.close()


DB_SCHEMA = """
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS doc_sections (
    id serial PRIMARY KEY,
    url text NOT NULL UNIQUE,
    title text NOT NULL,
    content text NOT NULL,
    -- text-embedding-3-small returns a vector of 1536 floats
    embedding vector(1536) NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_doc_sections_embedding ON doc_sections USING hnsw (embedding vector_l2_ops);
"""


def slugify(value: str, separator: str, unicode: bool = False) -> str:
    """Slugify a string, to make it URL friendly."""
    # Taken unchanged from https://github.com/Python-Markdown/markdown/blob/3.7/markdown/extensions/toc.py#L38
    if not unicode:
        # Replace Extended Latin characters with ASCII, i.e. `žlutý` => `zluty`
        value = unicodedata.normalize('NFKD', value)
        value = value.encode('ascii', 'ignore').decode('ascii')
    value = re.sub(r'[^\w\s-]', '', value).strip().lower()
    return re.sub(rf'[{separator}\s]+', separator, value)


if __name__ == '__main__':
    action = sys.argv[1] if len(sys.argv) > 1 else None
    if action == 'build':
        asyncio.run(build_search_db())
    elif action == 'search':
        if len(sys.argv) == 3:
            q = sys.argv[2]
        else:
            q = 'How do I configure logfire to work with FastAPI?'
        asyncio.run(run_agent(q))
    else:
        print(
            'uv run --extra examples -m pydantic_ai_examples.rag build|search',
            file=sys.stderr,
        )
        sys.exit(1)
```

---

# [SQL Generation](https://pydantic.dev/docs/ai/examples/data-analytics/sql-gen/)

# SQL Generation

Example demonstrating how to use Pydantic AI to generate SQL queries based on user input.

Demonstrates:

-   [dynamic system prompt](/docs/ai/core-concepts/agent#system-prompts)
-   [structured `output_type`](/docs/ai/core-concepts/output#structured-output)
-   [output validation](/docs/ai/core-concepts/output#output-validator-functions)
-   [agent dependencies](/docs/ai/core-concepts/dependencies)

## Running the Example

The resulting SQL is validated by running it as an `EXPLAIN` query on PostgreSQL. To run the example, you first need to run PostgreSQL, e.g. via Docker:

Terminal

```bash
docker run --rm -e POSTGRES_PASSWORD=postgres -p 54320:5432 postgres
```

_(we run postgres on port `54320` to avoid conflicts with any other postgres instances you may have running)_

With [dependencies installed and environment variables set](/docs/ai/examples/setup#usage), run:

-   [pip](#tab-panel-150)
-   [uv](#tab-panel-151)

Terminal

```bash
python -m pydantic_ai_examples.sql_gen
```

Terminal

```bash
uv run -m pydantic_ai_examples.sql_gen
```

or to use a custom prompt:

-   [pip](#tab-panel-152)
-   [uv](#tab-panel-153)

Terminal

```bash
python -m pydantic_ai_examples.sql_gen "find me errors"
```

Terminal

```bash
uv run -m pydantic_ai_examples.sql_gen "find me errors"
```

This model uses `gemini-3-flash-preview` by default since Gemini is good at single shot queries of this kind.

## Example Code

sql\_gen.py

```py
import asyncio
import sys
from collections.abc import AsyncGenerator
from contextlib import asynccontextmanager
from dataclasses import dataclass
from datetime import date
from typing import Annotated, Any, TypeAlias

import asyncpg
import logfire
from annotated_types import MinLen
from devtools import debug
from pydantic import BaseModel, Field

from pydantic_ai import Agent, ModelRetry, RunContext, format_as_xml

# 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured
logfire.configure(send_to_logfire='if-token-present')
logfire.instrument_asyncpg()
logfire.instrument_pydantic_ai()

DB_SCHEMA = """
CREATE TABLE records (
    created_at timestamptz,
    start_timestamp timestamptz,
    end_timestamp timestamptz,
    trace_id text,
    span_id text,
    parent_span_id text,
    level log_level,
    span_name text,
    message text,
    attributes_json_schema text,
    attributes jsonb,
    tags text[],
    is_exception boolean,
    otel_status_message text,
    service_name text
);
"""
SQL_EXAMPLES = [
    {
        'request': 'show me records where foobar is false',
        'response': "SELECT * FROM records WHERE attributes->>'foobar' = false",
    },
    {
        'request': 'show me records where attributes include the key "foobar"',
        'response': "SELECT * FROM records WHERE attributes ? 'foobar'",
    },
    {
        'request': 'show me records from yesterday',
        'response': "SELECT * FROM records WHERE start_timestamp::date > CURRENT_TIMESTAMP - INTERVAL '1 day'",
    },
    {
        'request': 'show me error records with the tag "foobar"',
        'response': "SELECT * FROM records WHERE level = 'error' and 'foobar' = ANY(tags)",
    },
]


@dataclass
class Deps:
    conn: asyncpg.Connection


class Success(BaseModel):
    """Response when SQL could be successfully generated."""

    sql_query: Annotated[str, MinLen(1)]
    explanation: str = Field(
        '', description='Explanation of the SQL query, as markdown'
    )


class InvalidRequest(BaseModel):
    """Response the user input didn't include enough information to generate SQL."""

    error_message: str


Response: TypeAlias = Success | InvalidRequest
agent = Agent[Deps, Response](
    'google-gla:gemini-3-flash-preview',
    # Type ignore while we wait for PEP-0747, nonetheless unions will work fine everywhere else
    output_type=Response,  # type: ignore
    deps_type=Deps,
)


@agent.system_prompt
async def system_prompt() -> str:
    return f"""\
Given the following PostgreSQL table of records, your job is to
write a SQL query that suits the user's request.

Database schema:

{DB_SCHEMA}

today's date = {date.today()}

{format_as_xml(SQL_EXAMPLES)}
"""


@agent.output_validator
async def validate_output(ctx: RunContext[Deps], output: Response) -> Response:
    if isinstance(output, InvalidRequest):
        return output

    # gemini often adds extraneous backslashes to SQL
    output.sql_query = output.sql_query.replace('\\', '')
    if not output.sql_query.upper().startswith('SELECT'):
        raise ModelRetry('Please create a SELECT query')

    try:
        await ctx.deps.conn.execute(f'EXPLAIN {output.sql_query}')
    except asyncpg.exceptions.PostgresError as e:
        raise ModelRetry(f'Invalid query: {e}') from e
    else:
        return output


async def main():
    if len(sys.argv) == 1:
        prompt = 'show me logs from yesterday, with level "error"'
    else:
        prompt = sys.argv[1]

    async with database_connect(
        'postgresql://postgres:postgres@localhost:54320', 'pydantic_ai_sql_gen'
    ) as conn:
        deps = Deps(conn)
        result = await agent.run(prompt, deps=deps)
    debug(result.output)


# pyright: reportUnknownMemberType=false
# pyright: reportUnknownVariableType=false
@asynccontextmanager
async def database_connect(server_dsn: str, database: str) -> AsyncGenerator[Any, None]:
    with logfire.span('check and create DB'):
        conn = await asyncpg.connect(server_dsn)
        try:
            db_exists = await conn.fetchval(
                'SELECT 1 FROM pg_database WHERE datname = $1', database
            )
            if not db_exists:
                await conn.execute(f'CREATE DATABASE {database}')
        finally:
            await conn.close()

    conn = await asyncpg.connect(f'{server_dsn}/{database}')
    try:
        with logfire.span('create schema'):
            async with conn.transaction():
                if not db_exists:
                    await conn.execute(
                        "CREATE TYPE log_level AS ENUM ('debug', 'info', 'warning', 'error', 'critical')"
                    )
                    await conn.execute(DB_SCHEMA)
        yield conn
    finally:
        await conn.close()


if __name__ == '__main__':
    asyncio.run(main())
```

---

# [Pydantic Model](https://pydantic.dev/docs/ai/examples/getting-started/pydantic-model/)

# Pydantic Model

Simple example of using Pydantic AI to construct a Pydantic model from a text input.

Demonstrates:

-   [structured `output_type`](/docs/ai/core-concepts/output#structured-output)

## Running the Example

With [dependencies installed and environment variables set](/docs/ai/examples/setup#usage), run:

-   [pip](#tab-panel-160)
-   [uv](#tab-panel-161)

Terminal

```bash
python -m pydantic_ai_examples.pydantic_model
```

Terminal

```bash
uv run -m pydantic_ai_examples.pydantic_model
```

This examples uses `openai:gpt-5` by default, but it works well with other models, e.g. you can run it with Gemini using:

-   [pip](#tab-panel-162)
-   [uv](#tab-panel-163)

Terminal

```bash
PYDANTIC_AI_MODEL=gemini-3-pro-preview python -m pydantic_ai_examples.pydantic_model
```

Terminal

```bash
PYDANTIC_AI_MODEL=gemini-3-pro-preview uv run -m pydantic_ai_examples.pydantic_model
```

(or `PYDANTIC_AI_MODEL=gemini-3-flash-preview ...`)

## Example Code

pydantic\_model.py

```py
import os

import logfire
from pydantic import BaseModel

from pydantic_ai import Agent

# 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured
logfire.configure(send_to_logfire='if-token-present')
logfire.instrument_pydantic_ai()


class MyModel(BaseModel):
    city: str
    country: str


model = os.getenv('PYDANTIC_AI_MODEL', 'openai:gpt-5.2')
print(f'Using model: {model}')
agent = Agent(model, output_type=MyModel)

if __name__ == '__main__':
    result = agent.run_sync('The windy city in the US of A.')
    print(result.output)
    print(result.usage())
```

---

# [Weather Agent](https://pydantic.dev/docs/ai/examples/getting-started/weather-agent/)

# Weather Agent

Example of Pydantic AI with multiple tools which the LLM needs to call in turn to answer a question.

Demonstrates:

-   [tools](/docs/ai/tools-toolsets/tools)
-   [agent dependencies](/docs/ai/core-concepts/dependencies)
-   [streaming text responses](/docs/ai/core-concepts/output#streaming-text)
-   Building a [Gradio](https://www.gradio.app/) UI for the agent

In this case the idea is a "weather" agent -- the user can ask for the weather in multiple locations, the agent will use the `get_lat_lng` tool to get the latitude and longitude of the locations, then use the `get_weather` tool to get the weather for those locations.

## Running the Example

To run this example properly, you might want to add two extra API keys **(Note if either key is missing, the code will fall back to dummy data, so they're not required)**:

-   A weather API key from [tomorrow.io](https://www.tomorrow.io/weather-api/) set via `WEATHER_API_KEY`
-   A geocoding API key from [geocode.maps.co](https://geocode.maps.co/) set via `GEO_API_KEY`

With [dependencies installed and environment variables set](/docs/ai/examples/setup#usage), run:

-   [pip](#tab-panel-164)
-   [uv](#tab-panel-165)

Terminal

```bash
python -m pydantic_ai_examples.weather_agent
```

Terminal

```bash
uv run -m pydantic_ai_examples.weather_agent
```

## Example Code

weather\_agent.py

```python
from __future__ import annotations as _annotations

import asyncio
from dataclasses import dataclass
from typing import Any

import logfire
from httpx import AsyncClient
from pydantic import BaseModel

from pydantic_ai import Agent, RunContext

# 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured
logfire.configure(send_to_logfire='if-token-present')
logfire.instrument_pydantic_ai()


@dataclass
class Deps:
    client: AsyncClient


weather_agent = Agent(
    'openai:gpt-5-mini',
    # 'Be concise, reply with one sentence.' is enough for some models (like openai) to use
    # the below tools appropriately, but others like anthropic and gemini require a bit more direction.
    instructions='Be concise, reply with one sentence.',
    deps_type=Deps,
    tool_retries=2,
    output_retries=2,
)


class LatLng(BaseModel):
    lat: float
    lng: float


@weather_agent.tool
async def get_lat_lng(ctx: RunContext[Deps], location_description: str) -> LatLng:
    """Get the latitude and longitude of a location.

    Args:
        ctx: The context.
        location_description: A description of a location.
    """
    # NOTE: the response here will be random, and is not related to the location description.
    r = await ctx.deps.client.get(
        'https://demo-endpoints.pydantic.workers.dev/latlng',
        params={'location': location_description},
    )
    r.raise_for_status()
    return LatLng.model_validate_json(r.content)


@weather_agent.tool
async def get_weather(ctx: RunContext[Deps], lat: float, lng: float) -> dict[str, Any]:
    """Get the weather at a location.

    Args:
        ctx: The context.
        lat: Latitude of the location.
        lng: Longitude of the location.
    """
    # NOTE: the responses here will be random, and are not related to the lat and lng.
    temp_response, descr_response = await asyncio.gather(
        ctx.deps.client.get(
            'https://demo-endpoints.pydantic.workers.dev/number',
            params={'min': 10, 'max': 30},
        ),
        ctx.deps.client.get(
            'https://demo-endpoints.pydantic.workers.dev/weather',
            params={'lat': lat, 'lng': lng},
        ),
    )
    temp_response.raise_for_status()
    descr_response.raise_for_status()
    return {
        'temperature': f'{temp_response.text} °C',
        'description': descr_response.text,
    }


async def main():
    async with AsyncClient() as client:
        logfire.instrument_httpx(client, capture_all=True)
        deps = Deps(client=client)
        result = await weather_agent.run(
            'What is the weather like in London and in Wiltshire?', deps=deps
        )
        print('Response:', result.output)


if __name__ == '__main__':
    asyncio.run(main())
```

## Running the UI

You can build multi-turn chat applications for your agent with [Gradio](https://www.gradio.app/), a framework for building AI web applications entirely in python. Gradio comes with built-in chat components and agent support so the entire UI will be implemented in a single python file!

Here's what the UI looks like for the weather agent:

Terminal

```bash
pip install gradio>=6.7.0
python/uv-run -m pydantic_ai_examples.weather_agent_gradio
```

## UI Code

weather\_agent\_gradio.py

```py
from __future__ import annotations as _annotations

import json

from httpx import AsyncClient
from pydantic import BaseModel

from pydantic_ai import ToolCallPart, ToolReturnPart
from pydantic_ai_examples.weather_agent import Deps, weather_agent

try:
    import gradio as gr
except ImportError as e:
    raise ImportError(
        'Please install gradio with `pip install gradio`. You must use python>=3.10.'
    ) from e

TOOL_TO_DISPLAY_NAME = {'get_lat_lng': 'Geocoding API', 'get_weather': 'Weather API'}

client = AsyncClient()
deps = Deps(client=client)


async def stream_from_agent(prompt: str, chatbot: list[dict], past_messages: list):
    chatbot.append({'role': 'user', 'content': prompt})
    yield gr.Textbox(interactive=False, value=''), chatbot, gr.skip()
    async with weather_agent.run_stream(
        prompt, deps=deps, message_history=past_messages
    ) as result:
        for message in result.new_messages():
            for call in message.parts:
                if isinstance(call, ToolCallPart):
                    call_args = call.args_as_json_str()
                    metadata = {
                        'title': f'🛠️ Using {TOOL_TO_DISPLAY_NAME[call.tool_name]}',
                    }
                    if call.tool_call_id is not None:
                        metadata['id'] = call.tool_call_id

                    gr_message = {
                        'role': 'assistant',
                        'content': 'Parameters: ' + call_args,
                        'metadata': metadata,
                    }
                    chatbot.append(gr_message)
                if isinstance(call, ToolReturnPart):
                    for gr_message in chatbot:
                        if (
                            gr_message.get('metadata', {}).get('id', '')
                            == call.tool_call_id
                        ):
                            if isinstance(call.content, BaseModel):
                                json_content = call.content.model_dump_json()
                            else:
                                json_content = json.dumps(call.content)
                            gr_message['content'] += f'\nOutput: {json_content}'
                yield gr.skip(), chatbot, gr.skip()
        chatbot.append({'role': 'assistant', 'content': ''})
        async for message in result.stream_text():
            chatbot[-1]['content'] = message
            yield gr.skip(), chatbot, gr.skip()
        past_messages = result.all_messages()

        yield gr.Textbox(interactive=True), gr.skip(), past_messages


async def handle_retry(chatbot, past_messages: list, retry_data: gr.RetryData):
    new_history = chatbot[: retry_data.index]
    previous_prompt = chatbot[retry_data.index]['content']
    past_messages = past_messages[: retry_data.index]
    async for update in stream_from_agent(previous_prompt, new_history, past_messages):
        yield update


def undo(chatbot, past_messages: list, undo_data: gr.UndoData):
    new_history = chatbot[: undo_data.index]
    past_messages = past_messages[: undo_data.index]
    return chatbot[undo_data.index]['content'], new_history, past_messages


def select_data(message: gr.SelectData) -> str:
    return message.value['text']


with gr.Blocks() as demo:
    gr.HTML(
        """
<div style="display: flex; justify-content: center; align-items: center; gap: 2rem; padding: 1rem; width: 100%">
    <img src="https://ai.pydantic.dev/img/logo-white.svg" style="max-width: 200px; height: auto">
    <div>
        <h1 style="margin: 0 0 1rem 0">Weather Assistant</h1>
        <h3 style="margin: 0 0 0.5rem 0">
            This assistant answer your weather questions.
        </h3>
    </div>
</div>
"""
    )
    past_messages = gr.State([])
    chatbot = gr.Chatbot(
        label='Packing Assistant',
        avatar_images=(None, 'https://ai.pydantic.dev/img/logo-white.svg'),
        examples=[
            {'text': 'What is the weather like in Miami?'},
            {'text': 'What is the weather like in London?'},
        ],
    )
    with gr.Row():
        prompt = gr.Textbox(
            lines=1,
            show_label=False,
            placeholder='What is the weather like in New York City?',
        )
    generation = prompt.submit(
        stream_from_agent,
        inputs=[prompt, chatbot, past_messages],
        outputs=[prompt, chatbot, past_messages],
    )
    chatbot.example_select(select_data, None, [prompt])
    chatbot.retry(
        handle_retry, [chatbot, past_messages], [prompt, chatbot, past_messages]
    )
    chatbot.undo(undo, [chatbot, past_messages], [prompt, chatbot, past_messages])


if __name__ == '__main__':
    demo.launch()
```

---

# [Setup](https://pydantic.dev/docs/ai/examples/setup/)

# Setup

Here we include some examples of how to use Pydantic AI and what it can do.

## Usage

These examples are distributed with `pydantic-ai` so you can run them either by cloning the [pydantic-ai repo](https://github.com/pydantic/pydantic-ai) or by simply installing `pydantic-ai` from PyPI with `pip` or `uv`.

### Installing required dependencies

Either way you'll need to install extra dependencies to run some examples, you just need to install the `examples` optional dependency group.

If you've installed `pydantic-ai` via pip/uv, you can install the extra dependencies with:

-   [pip](#tab-panel-10)
-   [uv](#tab-panel-11)

Terminal

```bash
pip install "pydantic-ai[examples]"
```

Terminal

```bash
uv add "pydantic-ai[examples]"
```

If you clone the repo, you should instead use `uv sync --extra examples` to install extra dependencies.

### Setting model environment variables

These examples will need you to set up authentication with one or more of the LLMs, see the [model configuration](/docs/ai/models/overview) docs for details on how to do this.

TL;DR: in most cases you'll need to set one of the following environment variables:

-   [OpenAI](#tab-panel-12)
-   [Google Gemini](#tab-panel-13)

Terminal

```bash
export OPENAI_API_KEY=your-api-key
```

Terminal

```bash
export GEMINI_API_KEY=your-api-key
```

### Running Examples

To run the examples (this will work whether you installed `pydantic_ai`, or cloned the repo), run:

-   [pip](#tab-panel-14)
-   [uv](#tab-panel-15)

Terminal

```bash
python -m pydantic_ai_examples.<example_module_name>
```

Terminal

```bash
uv run -m pydantic_ai_examples.<example_module_name>
```

For example, to run the very simple [`pydantic_model`](/docs/ai/examples/pydantic-model) example:

-   [pip](#tab-panel-16)
-   [uv](#tab-panel-17)

Terminal

```bash
python -m pydantic_ai_examples.pydantic_model
```

Terminal

```bash
uv run -m pydantic_ai_examples.pydantic_model
```

If you like one-liners and you're using uv, you can run a pydantic-ai example with zero setup:

Terminal

```bash
OPENAI_API_KEY='your-api-key' \
  uv run --with "pydantic-ai[examples]" \
  -m pydantic_ai_examples.pydantic_model
```

* * *

You'll probably want to edit examples in addition to just running them. You can copy the examples to a new directory with:

-   [pip](#tab-panel-18)
-   [uv](#tab-panel-19)

Terminal

```bash
python -m pydantic_ai_examples --copy-to examples/
```

Terminal

```bash
uv run -m pydantic_ai_examples --copy-to examples/
```

---

# [Slack Lead Qualifier with Modal](https://pydantic.dev/docs/ai/examples/slack-lead-qualifier/)

# Slack Lead Qualifier with Modal

In this example, we're going to build an agentic app that:

-   automatically researches each new member that joins a company's public Slack community to see how good of a fit they are for the company's commercial product,
-   sends this analysis into a (private) Slack channel, and
-   sends a daily summary of the top 5 leads from the previous 24 hours into a (different) Slack channel.

We'll be deploying the app on [Modal](https://modal.com), as it lets you use Python to define an app with web endpoints, scheduled functions, and background functions, and deploy them with a CLI, without needing to set up or manage any infrastructure. It's a great way to lower the barrier for people in your organization to start building and deploying AI agents to make their jobs easier.

We also add [Pydantic Logfire](https://pydantic.dev/logfire) to get observability into the app and agent as they're running in response to webhooks and the schedule

## Screenshots

This is what the analysis sent into Slack will look like:

![Slack message](/docs/ai/img/slack-lead-qualifier-slack.png)

This is what the corresponding trace in [Logfire](https://pydantic.dev/logfire) will look like:

![Logfire trace](/docs/ai/img/slack-lead-qualifier-logfire.png)

All of these entries can be clicked on to get more details about what happened at that step, including the full conversation with the LLM and HTTP requests and responses.

## Prerequisites

If you just want to see the code without actually going through the effort of setting up the bits necessary to run it, feel free to [jump ahead](#the-code).

### Slack app

You need to have a Slack workspace and the necessary permissions to create apps.

2.  Create a new Slack app using the instructions at [https://docs.slack.dev/quickstart](https://docs.slack.dev/quickstart).
    
    1.  In step 2, "Requesting scopes", request the following scopes:
        -   [`users.read`](https://docs.slack.dev/reference/scopes/users.read)
        -   [`users.read.email`](https://docs.slack.dev/reference/scopes/users.read.email)
        -   [`users.profile.read`](https://docs.slack.dev/reference/scopes/users.profile.read)
    2.  In step 3, "Installing and authorizing the app", note down the Access Token as we're going to need to store it as a Secret in Modal.
    3.  You can skip steps 4 and 5. We're going to need to subscribe to the `team_join` event, but at this point you don't have a webhook URL yet.
3.  Create the channels the app will post into, and add the Slack app to them:
    
    -   `#new-slack-leads`
    -   `#daily-slack-leads-summary`
    
    These names are hard-coded in the example. If you want to use different channels, you can clone the repo and change them in `examples/pydantic_ai_examples/slack_lead_qualifier/functions.py`.
    

### Logfire Write Token

1.  If you don't have a Logfire account yet, create one on [https://logfire-us.pydantic.dev/](https://logfire-us.pydantic.dev/).
2.  Create a new project named, for example, `slack-lead-qualifier`.
3.  Generate a new Write Token and note it down, as we're going to need to store it as a Secret in Modal.

### OpenAI API Key

1.  If you don't have an OpenAI account yet, create one on [https://platform.openai.com/](https://platform.openai.com/).
2.  Create a new API Key in Settings and note it down, as we're going to need to store it as a Secret in Modal.

### Modal account

1.  If you don't have a Modal account yet, create one on [https://modal.com/signup](https://modal.com/signup).
2.  Create 3 Secrets of type "Custom" on [https://modal.com/secrets](https://modal.com/secrets):
    -   Name: `slack`, key: `SLACK_API_KEY`, value: the Slack Access Token you generated earlier
    -   Name: `logfire`, key: `LOGFIRE_TOKEN`, value: the Logfire Write Token you generated earlier
    -   Name: `openai`, key: `OPENAI_API_KEY`, value: the OpenAI API Key you generated earlier

## Usage

1.  Make sure you have the [dependencies installed](/docs/ai/examples/setup#usage).
    
2.  Authenticate with Modal:
    
    Terminal
    
    ```bash
    python/uv-run -m modal setup
    ```
    
3.  Run the example as an [ephemeral Modal app](https://modal.com/docs/guide/apps#ephemeral-apps), meaning it will only run until you quit it using Ctrl+C:
    
    Terminal
    
    ```bash
    python/uv-run -m modal serve -m pydantic_ai_examples.slack_lead_qualifier.modal
    ```
    
4.  Note down the URL after `Created web function web_app =>`, this is your webhook endpoint URL.
    
5.  Go back to [https://docs.slack.dev/quickstart](https://docs.slack.dev/quickstart) and follow step 4, "Configuring the app for event listening", to subscribe to the `team_join` event with the webhook endpoint URL you noted down as the Request URL.
    

Now when someone new (possibly you with a throwaway email) joins the Slack workspace, you'll see the webhook event being processed in the terminal where you ran `modal serve` and in the Logfire Live view, and after waiting a few seconds you should see the result appear in the `#new-slack-leads` Slack channel!

Faking a Slack signup

You can also fake a Slack signup event and try out the agent like this, with any name or email you please:

Terminal

```bash
curl -X POST <webhook endpoint URL> \
-H "Content-Type: application/json" \
-d '{
    "type": "event_callback",
    "event": {
        "type": "team_join",
        "user": {
            "profile": {
                "email": "samuel@pydantic.dev",
                "first_name": "Samuel",
                "last_name": "Colvin",
                "display_name": "Samuel Colvin"
            }
        }
    }
}'
```

Deploying to production

If you'd like to deploy this app into your Modal workspace in a persistent fashion, you can use this command:

Terminal

```bash
python/uv-run -m modal deploy -m pydantic_ai_examples.slack_lead_qualifier.modal
```

You'll likely want to [download the code](https://github.com/pydantic/pydantic-ai/tree/main/examples/pydantic_ai_examples/slack_lead_qualifier) first, put it in a new repo, and then do [continuous deployment](https://modal.com/docs/guide/continuous-deployment#github-actions) using GitHub Actions.

Don't forget to update the Slack event request URL to the new persistent URL! You'll also want to modify the [instructions for the agent](#agent) to your own situation.

## The code

We're going to start with the basics, and then gradually build up into the full app.

### Models

#### `Profile`

First, we define a [Pydantic](https://docs.pydantic.dev) model that represents a Slack user profile. These are the fields we get from the [`team_join`](https://docs.slack.dev/reference/events/team_join) event that's sent to the webhook endpoint that we'll define in a bit.

models.py

```py
...

class Profile(BaseModel):
    first_name: str | None = None
    last_name: str | None = None
    display_name: str | None = None
    email: str

...
```

We also define a `Profile.as_prompt()` helper method that uses [`format_as_xml`](/docs/ai/api/pydantic-ai/format_prompt/#pydantic_ai.format_prompt.format_as_xml) to turn the profile into a string that can be sent to the model.

models.py

```py
...

from pydantic_ai import format_as_xml

...

class Profile(BaseModel):

...

    def as_prompt(self) -> str:
        return format_as_xml(self, root_tag='profile')

...
```

#### `Analysis`

The second model we'll need represents the result of the analysis that the agent will perform. We include docstrings to provide additional context to the model on what these fields should contain.

models.py

```py
...

class Analysis(BaseModel):
    profile: Profile
    organization_name: str
    organization_domain: str
    job_title: str
    relevance: Annotated[int, Ge(1), Le(5)]
    """Estimated fit for Pydantic Logfire: 1 = low, 5 = high"""
    summary: str
    """One-sentence welcome note summarising who they are and how we might help"""

...
```

We also define a `Analysis.as_slack_blocks()` helper method that turns the analysis into some [Slack blocks](https://api.slack.com/reference/block-kit/blocks) that can be sent to the Slack API to post a new message.

models.py

```py
...

class Analysis(BaseModel):

...

    def as_slack_blocks(self, include_relevance: bool = False) -> list[dict[str, Any]]:
        profile = self.profile
        relevance = f'({self.relevance}/5)' if include_relevance else ''
        return [
            {
                'type': 'markdown',
                'text': f'[{profile.display_name}](mailto:{profile.email}), {self.job_title} at [**{self.organization_name}**](https://{self.organization_domain}) {relevance}',
            },
            {
                'type': 'markdown',
                'text': self.summary,
            },
        ]

...
```

### Agent

Now it's time to get into Pydantic AI and define the agent that will do the actual analysis!

We specify the model we'll use (`openai:gpt-5`), provide [instructions](/docs/ai/core-concepts/agent#instructions), give the agent access to the [DuckDuckGo search tool](/docs/ai/tools-toolsets/common-tools#duckduckgo-search-tool), and tell it to output either an `Analysis` or `None` using the [Native Output](/docs/ai/core-concepts/output#native-output) structured output mode.

The real meat of the app is in the instructions that tell the agent how to evaluate each new Slack member. If you plan to use this app yourself, you'll of course want to modify them to your own situation.

agent.py

```python
...

from pydantic_ai import Agent, NativeOutput
from pydantic_ai.common_tools.duckduckgo import duckduckgo_search_tool

...

agent = Agent(
    'openai:gpt-5.2',
    instructions=dedent(
        """
        When a new person joins our public Slack, please put together a brief snapshot so we can be most useful to them.

        **What to include**

        1. **Who they are:**  Any details about their professional role or projects (e.g. LinkedIn, GitHub, company bio).
        2. **Where they work:**  Name of the organisation and its domain.
        3. **How we can help:**  On a scale of 1-5, estimate how likely they are to benefit from **Pydantic Logfire**
           (our paid observability tool) based on factors such as company size, product maturity, or AI usage.
           *1 = probably not relevant, 5 = very strong fit.*

        **Our products (for context only)**
        • **Pydantic Validation** - Python data-validation (open source)
        • **Pydantic AI** - Python agent framework (open source)
        • **Pydantic Logfire** - Observability for traces, logs & metrics with first-class AI support (commercial)

        **How to research**

        • Use the provided DuckDuckGo search tool to research the person and the organization they work for, based on the email domain or what you find on e.g. LinkedIn and GitHub.
        • If you can't find enough to form a reasonable view, return **None**.
        """
    ),
    tools=[duckduckgo_search_tool()],
    output_type=NativeOutput([Analysis, NoneType]),
)

...
```

#### `analyze_profile`

We also define a `analyze_profile` helper function that takes a `Profile`, runs the agent, and returns an `Analysis` (or `None`), and instrument it using [Logfire](/docs/ai/integrations/logfire).

agent.py

```py
...

@logfire.instrument('Analyze profile')
async def analyze_profile(profile: Profile) -> Analysis | None:
    result = await agent.run(profile.as_prompt())
    return result.output

...
```

### Analysis store

The next building block we'll need is a place to store all the analyses that have been done so that we can look them up when we send the daily summary.

Fortunately, Modal provides us with a convenient way to store some data that can be read back in a subsequent Modal run (webhook or scheduled): [`modal.Dict`](https://modal.com/docs/reference/modal.Dict).

We define some convenience methods to easily add, list, and clear analyses.

store.py

```py
...

import modal

...

class AnalysisStore:
    @classmethod
    @logfire.instrument('Add analysis to store')
    async def add(cls, analysis: Analysis):
        await cls._get_store().put.aio(analysis.profile.email, analysis.model_dump())

    @classmethod
    @logfire.instrument('List analyses from store')
    async def list(cls) -> list[Analysis]:
        return [
            Analysis.model_validate(analysis)
            async for analysis in cls._get_store().values.aio()
        ]

    @classmethod
    @logfire.instrument('Clear analyses from store')
    async def clear(cls):
        await cls._get_store().clear.aio()

    @classmethod
    def _get_store(cls) -> modal.Dict:
        return modal.Dict.from_name('analyses', create_if_missing=True)  # type: ignore

...
```

Note

Note that `# type: ignore` on the last line -- unfortunately `modal` does not fully define its types, so we need this to stop our static type checker `pyright`, which we run over all Pydantic AI code including examples, from complaining.

### Send Slack message

Next, we'll need a way to actually send a Slack message, so we define a simple function that uses Slack's [`chat.postMessage`](https://api.slack.com/methods/chat.postMessage) API.

slack.py

```py
...

API_KEY = os.getenv('SLACK_API_KEY')
assert API_KEY, 'SLACK_API_KEY is not set'


@logfire.instrument('Send Slack message')
async def send_slack_message(channel: str, blocks: list[dict[str, Any]]):
    client = httpx.AsyncClient()
    response = await client.post(
        'https://slack.com/api/chat.postMessage',
        json={
            'channel': channel,
            'blocks': blocks,
        },
        headers={
            'Authorization': f'Bearer {API_KEY}',
        },
        timeout=5,
    )
    response.raise_for_status()
    result = response.json()
    if not result.get('ok', False):
        error = result.get('error', 'Unknown error')
        raise Exception(f'Failed to send to Slack: {error}')

...
```

### Features

Now we can start putting these building blocks together to implement the actual features we want!

#### `process_slack_member`

This function takes a [`Profile`](#profile), [analyzes](#analyze_profile) it using the agent, adds it to the [`AnalysisStore`](#analysis-store), and [sends](#send-slack-message) the analysis into the `#new-slack-leads` channel.

functions.py

```py
...

from .agent import analyze_profile
from .models import Profile

from .slack import send_slack_message
from .store import AnalysisStore

...

NEW_LEAD_CHANNEL = '#new-slack-leads'

...

@logfire.instrument('Process Slack member')
async def process_slack_member(profile: Profile):
    analysis = await analyze_profile(profile)
    logfire.info('Analysis', analysis=analysis)

    if analysis is None:
        return

    await AnalysisStore().add(analysis)

    await send_slack_message(
        NEW_LEAD_CHANNEL,
        [
            {
                'type': 'header',
                'text': {
                    'type': 'plain_text',
                    'text': f'New Slack member with score {analysis.relevance}/5',
                },
            },
            {
                'type': 'divider',
            },
            *analysis.as_slack_blocks(),
        ],
    )

...
```

#### `send_daily_summary`

This function list all of the analyses in the [`AnalysisStore`](#analysis-store), takes the top 5 by relevance, [sends](#send-slack-message) them into the `#daily-slack-leads-summary` channel, and clears the `AnalysisStore` so that the next daily run won't process these analyses again.

functions.py

```py
...

from .slack import send_slack_message
from .store import AnalysisStore

...

DAILY_SUMMARY_CHANNEL = '#daily-slack-leads-summary'

...

@logfire.instrument('Send daily summary')
async def send_daily_summary():
    analyses = await AnalysisStore().list()
    logfire.info('Analyses', analyses=analyses)

    if len(analyses) == 0:
        return

    sorted_analyses = sorted(analyses, key=lambda x: x.relevance, reverse=True)
    top_analyses = sorted_analyses[:5]

    blocks = [
        {
            'type': 'header',
            'text': {
                'type': 'plain_text',
                'text': f'Top {len(top_analyses)} new Slack members from the last 24 hours',
            },
        },
    ]

    for analysis in top_analyses:
        blocks.extend(
            [
                {
                    'type': 'divider',
                },
                *analysis.as_slack_blocks(include_relevance=True),
            ]
        )

    await send_slack_message(
        DAILY_SUMMARY_CHANNEL,
        blocks,
    )

    await AnalysisStore().clear()

...
```

### Web app

As it stands, neither of these functions are actually being called from anywhere.

Let's implement a [FastAPI](https://fastapi.tiangolo.com/) endpoint to handle the `team_join` Slack webhook (also known as the [Slack Events API](https://docs.slack.dev/apis/events-api)) and call the [`process_slack_member`](#process_slack_member) function we just defined. We also instrument FastAPI using Logfire for good measure.

app.py

```py
...

app = FastAPI()
logfire.instrument_fastapi(app, capture_headers=True)


@app.post('/')
async def process_webhook(payload: dict[str, Any]) -> dict[str, Any]:
    if payload['type'] == 'url_verification':
        return {'challenge': payload['challenge']}
    elif (
        payload['type'] == 'event_callback' and payload['event']['type'] == 'team_join'
    ):
        profile = Profile.model_validate(payload['event']['user']['profile'])

        process_slack_member(profile)
        return {'status': 'OK'}

    raise HTTPException(status_code=status.HTTP_422_UNPROCESSABLE_ENTITY)

...
```

#### `process_slack_member` with Modal

I was a little sneaky there -- we're not actually calling the [`process_slack_member`](#process_slack_member) function we defined in `functions.py` directly, as Slack requires webhooks to respond within 3 seconds, and we need a bit more time than that to talk to the LLM, do some web searches, and send the Slack message.

Instead, we're calling the following function defined alongside the app, which uses Modal's [`modal.Function.spawn`](https://modal.com/docs/reference/modal.Function#spawn) feature to run a function in the background. (If you're curious what the Modal side of this function looks like, you can [jump ahead](#backgrounded-process_slack_member).)

Because `modal.py` (which we'll see in the next section) imports `app.py`, we import from `modal.py` inside the function definition because doing so at the top level would have resulted in a circular import error.

We also pass along the current Logfire context to get [Distributed Tracing](https://logfire.pydantic.dev/docs/how-to-guides/distributed-tracing/), meaning that the background function execution will show up nested under the webhook request trace, so that we have everything related to that request in one place.

app.py

```py
...

def process_slack_member(profile: Profile):
    from .modal import process_slack_member as _process_slack_member

    _process_slack_member.spawn(
        profile.model_dump(), logfire_ctx=get_context()
    )

...
```

### Modal app

Now let's see how easy Modal makes it to deploy all of this.

#### Set up Modal

The first thing we do is define the Modal app, by specifying the base image to use (Debian with Python 3.13), all the Python packages it needs, and all of the secrets defined in the Modal interface that need to be made available during runtime.

modal.py

```py
...

import modal

image = modal.Image.debian_slim(python_version='3.13').pip_install(
    'pydantic',
    'pydantic_ai_slim[openai,duckduckgo]',
    'logfire[httpx,fastapi]',
    'fastapi[standard]',
    'httpx',
)
app = modal.App(
    name='slack-lead-qualifier',
    image=image,
    secrets=[
        modal.Secret.from_name('logfire'),
        modal.Secret.from_name('openai'),
        modal.Secret.from_name('slack'),
    ],
)

...
```

#### Set up Logfire

Next, we define a function to set up Logfire instrumentation for Pydantic AI and HTTPX.

We cannot do this at the top level of the file, as the requested packages (like `logfire`) will only be available within functions running on Modal (like the ones we'll define next). This file, `modal.py`, runs on your local machine and only has access to the `modal` package.

modal.py

```py
...

def setup_logfire():
    import logfire

    logfire.configure(service_name=app.name)
    logfire.instrument_pydantic_ai()
    logfire.instrument_httpx(capture_all=True)

...
```

#### Web app

To deploy a [web endpoint](https://modal.com/docs/guide/webhooks) on Modal, we simply define a function that returns an ASGI app (like FastAPI) and decorate it with `@app.function()` and `@modal.asgi_app()`.

This `web_app` function will be run on Modal, so inside the function we can call the `setup_logfire` function that requires the `logfire` package, and import `app.py` which uses the other requested packages.

By default, Modal spins up a container to handle a function call (like a web request) on-demand, meaning there's a little bit of startup time to each request. However, Slack requires webhooks to respond within 3 seconds, so we specify `min_containers=1` to keep the web endpoint running and ready to answer requests at all times. This is a bit annoying and wasteful, but fortunately [Modal's pricing](https://modal.com/pricing) is pretty reasonable, you get $30 free monthly compute, and they offer up to $50k in free credits for startup and academic researchers.

modal.py

```py
...

@app.function(min_containers=1)
@modal.asgi_app()  # type: ignore
def web_app():
    setup_logfire()

    from .app import app as _app

    return _app

...
```

Note

Note that `# type: ignore` on the `@modal.asgi_app()` line -- unfortunately `modal` does not fully define its types, so we need this to stop our static type checker `pyright`, which we run over all Pydantic AI code including examples, from complaining.

#### Scheduled `send_daily_summary`

To define a [scheduled function](https://modal.com/docs/guide/cron), we can use the `@app.function()` decorator with a `schedule` argument. This Modal function will call our imported [`send_daily_summary`](#send_daily_summary) function every day at 8 am UTC.

modal.py

```py
...

@app.function(schedule=modal.Cron('0 8 * * *'))  # Every day at 8am UTC
async def send_daily_summary():
    setup_logfire()

    from .functions import send_daily_summary as _send_daily_summary

    await _send_daily_summary()

...
```

#### Backgrounded `process_slack_member`

Finally, we define a Modal function that wraps our [`process_slack_member`](#process_slack_member) function, so that it can run in the background.

As you'll remember from when we [spawned this function from the web app](#process_slack_member-with-modal), we passed along the Logfire context to get [Distributed Tracing](https://logfire.pydantic.dev/docs/how-to-guides/distributed-tracing/), so we need to attach it here.

modal.py

```py
...

@app.function()
async def process_slack_member(profile_raw: dict[str, Any], logfire_ctx: Any):
    setup_logfire()

    from logfire.propagate import attach_context

    from .functions import process_slack_member as _process_slack_member
    from .models import Profile

    with attach_context(logfire_ctx):
        profile = Profile.model_validate(profile_raw)
        await _process_slack_member(profile)

...
```

## Conclusion

And that's it! Now, assuming you've met the [prerequisites](#prerequisites), you can run or deploy the app using the commands under [usage](#usage).

---

# [Stream Markdown](https://pydantic.dev/docs/ai/examples/streaming/stream-markdown/)

# Stream Markdown

This example shows how to stream markdown from an agent, using the [`rich`](https://github.com/Textualize/rich) library to highlight the output in the terminal.

It'll run the example with both OpenAI and Google Gemini models if the required environment variables are set.

Demonstrates:

-   [streaming text responses](/docs/ai/core-concepts/output#streaming-text)

## Running the Example

With [dependencies installed and environment variables set](/docs/ai/examples/setup#usage), run:

-   [pip](#tab-panel-166)
-   [uv](#tab-panel-167)

Terminal

```bash
python -m pydantic_ai_examples.stream_markdown
```

Terminal

```bash
uv run -m pydantic_ai_examples.stream_markdown
```

## Example Code

stream\_markdown.py

```python
import asyncio
import os

import logfire
from rich.console import Console, ConsoleOptions, RenderResult
from rich.live import Live
from rich.markdown import CodeBlock, Markdown
from rich.syntax import Syntax
from rich.text import Text

from pydantic_ai import Agent
from pydantic_ai.models import KnownModelName

# 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured
logfire.configure(send_to_logfire='if-token-present')
logfire.instrument_pydantic_ai()

agent = Agent()

# models to try, and the appropriate env var
models: list[tuple[KnownModelName, str]] = [
    ('google-gla:gemini-3-flash-preview', 'GEMINI_API_KEY'),
    ('openai:gpt-5-mini', 'OPENAI_API_KEY'),
    ('groq:llama-3.3-70b-versatile', 'GROQ_API_KEY'),
]


async def main():
    prettier_code_blocks()
    console = Console()
    prompt = 'Show me a short example of using Pydantic.'
    console.log(f'Asking: {prompt}...', style='cyan')
    for model, env_var in models:
        if env_var in os.environ:
            console.log(f'Using model: {model}')
            with Live('', console=console, vertical_overflow='visible') as live:
                async with agent.run_stream(prompt, model=model) as result:
                    async for message in result.stream_output():
                        live.update(Markdown(message))
            console.log(result.usage())
        else:
            console.log(f'{model} requires {env_var} to be set.')


def prettier_code_blocks():
    """Make rich code blocks prettier and easier to copy.

    From https://github.com/samuelcolvin/aicli/blob/v0.8.0/samuelcolvin_aicli.py#L22
    """

    class SimpleCodeBlock(CodeBlock):
        def __rich_console__(
            self, console: Console, options: ConsoleOptions
        ) -> RenderResult:
            code = str(self.text).rstrip()
            yield Text(self.lexer_name, style='dim')
            yield Syntax(
                code,
                self.lexer_name,
                theme=self.theme,
                background_color='default',
                word_wrap=True,
            )
            yield Text(f'/{self.lexer_name}', style='dim')

    Markdown.elements['fence'] = SimpleCodeBlock


if __name__ == '__main__':
    asyncio.run(main())
```

---

# [Stream Whales](https://pydantic.dev/docs/ai/examples/streaming/stream-whales/)

# Stream Whales

Information about whales -- an example of streamed structured response validation.

Demonstrates:

-   [streaming structured output](/docs/ai/core-concepts/output#streaming-structured-output)

This script streams structured responses about whales, validates the data and displays it as a dynamic table using [`rich`](https://github.com/Textualize/rich) as the data is received.

## Running the Example

With [dependencies installed and environment variables set](/docs/ai/examples/setup#usage), run:

-   [pip](#tab-panel-168)
-   [uv](#tab-panel-169)

Terminal

```bash
python -m pydantic_ai_examples.stream_whales
```

Terminal

```bash
uv run -m pydantic_ai_examples.stream_whales
```

Should give an output like this:

## Example Code

stream\_whales.py

```python
from typing import Annotated

import logfire
from pydantic import Field
from rich.console import Console
from rich.live import Live
from rich.table import Table
from typing_extensions import NotRequired, TypedDict

from pydantic_ai import Agent

# 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured
logfire.configure(send_to_logfire='if-token-present')
logfire.instrument_pydantic_ai()


class Whale(TypedDict):
    name: str
    length: Annotated[
        float, Field(description='Average length of an adult whale in meters.')
    ]
    weight: NotRequired[
        Annotated[
            float,
            Field(description='Average weight of an adult whale in kilograms.', ge=50),
        ]
    ]
    ocean: NotRequired[str]
    description: NotRequired[Annotated[str, Field(description='Short Description')]]


agent = Agent('openai:gpt-5.2', output_type=list[Whale])


async def main():
    console = Console()
    with Live('\n' * 36, console=console) as live:
        console.print('Requesting data...', style='cyan')
        async with agent.run_stream(
            'Generate me details of 5 species of Whale.'
        ) as result:
            console.print('Response:', style='green')

            async for whales in result.stream_output(debounce_by=0.01):
                table = Table(
                    title='Species of Whale',
                    caption='Streaming Structured responses from OpenAI',
                    width=120,
                )
                table.add_column('ID', justify='right')
                table.add_column('Name')
                table.add_column('Avg. Length (m)', justify='right')
                table.add_column('Avg. Weight (kg)', justify='right')
                table.add_column('Ocean')
                table.add_column('Description', justify='right')

                for wid, whale in enumerate(whales, start=1):
                    table.add_row(
                        str(wid),
                        whale['name'],
                        f'{whale["length"]:0.0f}',
                        f'{w:0.0f}' if (w := whale.get('weight')) else '...',
                        whale.get('ocean') or '...',
                        whale.get('description') or '...',
                    )
                live.update(table)


if __name__ == '__main__':
    import asyncio

    asyncio.run(main())
```

---

# [Decisions](https://pydantic.dev/docs/ai/graph/beta/decisions/)

# Decisions

Decision nodes enable conditional branching in your graph based on the type or value of data flowing through it.

A decision node evaluates incoming data and routes it to different branches based on:

-   Type matching (using `isinstance`)
-   Literal value matching
-   Custom predicate functions

The first matching branch is taken, similar to pattern matching or `if-elif-else` chains.

## Creating Decisions

Use [`g.decision()`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder.decision) to create a decision node, then add branches with [`g.match()`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder.match):

simple\_decision.py

```python
from dataclasses import dataclass
from typing import Literal

from pydantic_graph.beta import GraphBuilder, StepContext, TypeExpression


@dataclass
class DecisionState:
    path_taken: str | None = None


async def main():
    g = GraphBuilder(state_type=DecisionState, output_type=str)

    @g.step
    async def choose_path(ctx: StepContext[DecisionState, None, None]) -> Literal['left', 'right']:
        return 'left'

    @g.step
    async def left_path(ctx: StepContext[DecisionState, None, object]) -> str:
        ctx.state.path_taken = 'left'
        return 'Went left'

    @g.step
    async def right_path(ctx: StepContext[DecisionState, None, object]) -> str:
        ctx.state.path_taken = 'right'
        return 'Went right'

    g.add(
        g.edge_from(g.start_node).to(choose_path),
        g.edge_from(choose_path).to(
            g.decision()
            .branch(g.match(TypeExpression[Literal['left']]).to(left_path))
            .branch(g.match(TypeExpression[Literal['right']]).to(right_path))
        ),
        g.edge_from(left_path, right_path).to(g.end_node),
    )

    graph = g.build()
    state = DecisionState()
    result = await graph.run(state=state)
    print(result)
    #> Went left
    print(state.path_taken)
    #> left
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

## Type Matching

Match by type using regular Python types:

type\_matching.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext


@dataclass
class DecisionState:
    pass


async def main():
    g = GraphBuilder(state_type=DecisionState, output_type=str)

    @g.step
    async def return_int(ctx: StepContext[DecisionState, None, None]) -> int:
        return 42

    @g.step
    async def handle_int(ctx: StepContext[DecisionState, None, int]) -> str:
        return f'Got int: {ctx.inputs}'

    @g.step
    async def handle_str(ctx: StepContext[DecisionState, None, str]) -> str:
        return f'Got str: {ctx.inputs}'

    g.add(
        g.edge_from(g.start_node).to(return_int),
        g.edge_from(return_int).to(
            g.decision()
            .branch(g.match(int).to(handle_int))
            .branch(g.match(str).to(handle_str))
        ),
        g.edge_from(handle_int, handle_str).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run(state=DecisionState())
    print(result)
    #> Got int: 42
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

### Matching Union Types

For more complex type expressions like unions, you need to use [`TypeExpression`](/docs/ai/api/pydantic_graph/beta/#pydantic_graph.beta.TypeExpression) because Python's type system doesn't allow union types to be used directly as runtime values:

union\_type\_matching.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext, TypeExpression


@dataclass
class DecisionState:
    pass


async def main():
    g = GraphBuilder(state_type=DecisionState, output_type=str)

    @g.step
    async def return_value(ctx: StepContext[DecisionState, None, None]) -> int | str:
        """Returns either an int or a str."""
        return 42

    @g.step
    async def handle_number(ctx: StepContext[DecisionState, None, int | float]) -> str:
        return f'Got number: {ctx.inputs}'

    @g.step
    async def handle_text(ctx: StepContext[DecisionState, None, str]) -> str:
        return f'Got text: {ctx.inputs}'

    g.add(
        g.edge_from(g.start_node).to(return_value),
        g.edge_from(return_value).to(
            g.decision()
            # Use TypeExpression for union types
            .branch(g.match(TypeExpression[int | float]).to(handle_number))
            .branch(g.match(str).to(handle_text))
        ),
        g.edge_from(handle_number, handle_text).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run(state=DecisionState())
    print(result)
    #> Got number: 42
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

Note

[`TypeExpression`](/docs/ai/api/pydantic_graph/beta/#pydantic_graph.beta.TypeExpression) is only necessary for complex type expressions like unions (`int | str`), `Literal`, and other type forms that aren't valid as runtime `type` objects. For simple types like `int`, `str`, or custom classes, you can pass them directly to `g.match()`.

The `TypeForm` class introduced in [PEP 747](https://peps.python.org/pep-0747/) should eventually eliminate the need for this workaround.

## Custom Matchers

Provide custom matching logic with the `matches` parameter:

custom\_matcher.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext, TypeExpression


@dataclass
class DecisionState:
    pass


async def main():
    g = GraphBuilder(state_type=DecisionState, output_type=str)

    @g.step
    async def return_number(ctx: StepContext[DecisionState, None, None]) -> int:
        return 7

    @g.step
    async def even_path(ctx: StepContext[DecisionState, None, int]) -> str:
        return f'{ctx.inputs} is even'

    @g.step
    async def odd_path(ctx: StepContext[DecisionState, None, int]) -> str:
        return f'{ctx.inputs} is odd'

    g.add(
        g.edge_from(g.start_node).to(return_number),
        g.edge_from(return_number).to(
            g.decision()
            .branch(g.match(TypeExpression[int], matches=lambda x: x % 2 == 0).to(even_path))
            .branch(g.match(TypeExpression[int], matches=lambda x: x % 2 == 1).to(odd_path))
        ),
        g.edge_from(even_path, odd_path).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run(state=DecisionState())
    print(result)
    #> 7 is odd
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

## Branch Priority

Branches are evaluated in the order they're added. The first matching branch is taken:

branch\_priority.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext, TypeExpression


@dataclass
class DecisionState:
    pass


async def main():
    g = GraphBuilder(state_type=DecisionState, output_type=str)

    @g.step
    async def return_value(ctx: StepContext[DecisionState, None, None]) -> int:
        return 10

    @g.step
    async def branch_a(ctx: StepContext[DecisionState, None, int]) -> str:
        return 'Branch A'

    @g.step
    async def branch_b(ctx: StepContext[DecisionState, None, int]) -> str:
        return 'Branch B'

    g.add(
        g.edge_from(g.start_node).to(return_value),
        g.edge_from(return_value).to(
            g.decision()
            .branch(g.match(TypeExpression[int], matches=lambda x: x >= 5).to(branch_a))
            .branch(g.match(TypeExpression[int], matches=lambda x: x >= 0).to(branch_b))
        ),
        g.edge_from(branch_a, branch_b).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run(state=DecisionState())
    print(result)
    #> Branch A
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

Both branches could match `10`, but Branch A is first, so it's taken.

## Catch-All Branches

Use `object` or `Any` to create a catch-all branch:

catch\_all.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext, TypeExpression


@dataclass
class DecisionState:
    pass


async def main():
    g = GraphBuilder(state_type=DecisionState, output_type=str)

    @g.step
    async def return_value(ctx: StepContext[DecisionState, None, None]) -> int:
        return 100

    @g.step
    async def catch_all(ctx: StepContext[DecisionState, None, object]) -> str:
        return f'Caught: {ctx.inputs}'

    g.add(
        g.edge_from(g.start_node).to(return_value),
        g.edge_from(return_value).to(g.decision().branch(g.match(TypeExpression[object]).to(catch_all))),
        g.edge_from(catch_all).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run(state=DecisionState())
    print(result)
    #> Caught: 100
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

## Nested Decisions

Decisions can be nested for complex conditional logic:

nested\_decisions.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext, TypeExpression


@dataclass
class DecisionState:
    pass


async def main():
    g = GraphBuilder(state_type=DecisionState, output_type=str)

    @g.step
    async def get_number(ctx: StepContext[DecisionState, None, None]) -> int:
        return 15

    @g.step
    async def is_positive(ctx: StepContext[DecisionState, None, int]) -> int:
        return ctx.inputs

    @g.step
    async def is_negative(ctx: StepContext[DecisionState, None, int]) -> str:
        return 'Negative'

    @g.step
    async def small_positive(ctx: StepContext[DecisionState, None, int]) -> str:
        return 'Small positive'

    @g.step
    async def large_positive(ctx: StepContext[DecisionState, None, int]) -> str:
        return 'Large positive'

    g.add(
        g.edge_from(g.start_node).to(get_number),
        g.edge_from(get_number).to(
            g.decision()
            .branch(g.match(TypeExpression[int], matches=lambda x: x > 0).to(is_positive))
            .branch(g.match(TypeExpression[int], matches=lambda x: x <= 0).to(is_negative))
        ),
        g.edge_from(is_positive).to(
            g.decision()
            .branch(g.match(TypeExpression[int], matches=lambda x: x < 10).to(small_positive))
            .branch(g.match(TypeExpression[int], matches=lambda x: x >= 10).to(large_positive))
        ),
        g.edge_from(is_negative, small_positive, large_positive).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run(state=DecisionState())
    print(result)
    #> Large positive
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

## Branching with Labels

Add labels to branches for documentation and diagram generation:

labeled\_branches.py

```python
from dataclasses import dataclass
from typing import Literal

from pydantic_graph.beta import GraphBuilder, StepContext, TypeExpression


@dataclass
class DecisionState:
    pass


async def main():
    g = GraphBuilder(state_type=DecisionState, output_type=str)

    @g.step
    async def choose(ctx: StepContext[DecisionState, None, None]) -> Literal['a', 'b']:
        return 'a'

    @g.step
    async def path_a(ctx: StepContext[DecisionState, None, object]) -> str:
        return 'Path A'

    @g.step
    async def path_b(ctx: StepContext[DecisionState, None, object]) -> str:
        return 'Path B'

    g.add(
        g.edge_from(g.start_node).to(choose),
        g.edge_from(choose).to(
            g.decision()
            .branch(g.match(TypeExpression[Literal['a']]).label('Take path A').to(path_a))
            .branch(g.match(TypeExpression[Literal['b']]).label('Take path B').to(path_b))
        ),
        g.edge_from(path_a, path_b).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run(state=DecisionState())
    print(result)
    #> Path A
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

## Next Steps

-   Learn about [parallel execution](/docs/ai/graph/beta/parallel) with broadcasting and mapping
-   Understand [join nodes](/docs/ai/graph/beta/joins) for aggregating parallel results
-   See the [API reference](/docs/ai/api/pydantic_graph/beta_decision/#pydantic_graph.beta.decision) for complete decision documentation

---

# [Getting Started](https://pydantic.dev/docs/ai/graph/beta/)

# Getting Started

Beta API

This is the new beta graph API. It provides enhanced capabilities for parallel execution, conditional branching, and complex workflows.

The original graph API is still available (and compatible of interop with the new beta API) and is documented in the [main graph documentation](/docs/ai/graph/).

The beta graph API in `pydantic-graph` provides a powerful builder pattern for constructing parallel execution graphs with:

-   **Step nodes** for executing async functions
-   **Decision nodes** for conditional branching
-   **Spread operations** for parallel processing of iterables
-   **Broadcast operations** for sending the same data to multiple parallel paths
-   **Join nodes and Reducers** for aggregating results from parallel execution

This API is designed for advanced workflows where you want declarative control over parallelism, routing, and data aggregation.

## Installation

The beta graph API is included with `pydantic-graph`:

Terminal

```bash
pip install pydantic-graph
```

Or as part of `pydantic-ai`:

Terminal

```bash
pip install pydantic-ai
```

## Quick Start

Here's a simple example to get you started:

simple\_counter.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext


@dataclass
class CounterState:
    """State for tracking a counter value."""

    value: int = 0


async def main():
    # Create a graph builder with state and output types
    g = GraphBuilder(state_type=CounterState, output_type=int)

    # Define steps using the decorator
    @g.step
    async def increment(ctx: StepContext[CounterState, None, None]) -> int:
        """Increment the counter and return its value."""
        ctx.state.value += 1
        return ctx.state.value

    @g.step
    async def double_it(ctx: StepContext[CounterState, None, int]) -> int:
        """Double the input value."""
        return ctx.inputs * 2

    # Add edges connecting the nodes
    g.add(
        g.edge_from(g.start_node).to(increment),
        g.edge_from(increment).to(double_it),
        g.edge_from(double_it).to(g.end_node),
    )

    # Build and run the graph
    graph = g.build()
    state = CounterState()
    result = await graph.run(state=state)
    print(f'Result: {result}')
    #> Result: 2
    print(f'Final state: {state.value}')
    #> Final state: 1
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

## Key Concepts

### GraphBuilder

The [`GraphBuilder`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder) is the main entry point for constructing graphs. It's generic over:

-   `StateT` - The type of mutable state shared across all nodes
-   `DepsT` - The type of dependencies injected into nodes
-   `InputT` - The type of initial input to the graph
-   `OutputT` - The type of final output from the graph

### Steps

Steps are async functions decorated with [`@g.step`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder.step) that define the actual work to be done in each node. They receive a [`StepContext`](/docs/ai/api/pydantic_graph/beta_step/#pydantic_graph.beta.step.StepContext) with access to:

-   `ctx.state` - The mutable graph state
-   `ctx.deps` - Injected dependencies
-   `ctx.inputs` - Input data for this step

### Edges

Edges define the connections between nodes. The builder provides multiple ways to create edges:

-   [`g.add()`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder.add) - Add one or more edge paths
-   [`g.add_edge()`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder.add_edge) - Add a simple edge between two nodes
-   [`g.edge_from()`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder.edge_from) - Start building a complex edge path

### Start and End Nodes

Every graph has:

-   [`g.start_node`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder.start_node) - The entry point receiving initial inputs
-   [`g.end_node`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder.end_node) - The exit point producing final outputs

## A More Complex Example

Here's an example showcasing parallel execution with a map operation:

parallel\_processing.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext
from pydantic_graph.beta.join import reduce_list_append


@dataclass
class ProcessingState:
    """State for tracking processing metrics."""

    items_processed: int = 0


async def main():
    g = GraphBuilder(
        state_type=ProcessingState,
        input_type=list[int],
        output_type=list[int],
    )

    @g.step
    async def square(ctx: StepContext[ProcessingState, None, int]) -> int:
        """Square a number and track that we processed it."""
        ctx.state.items_processed += 1
        return ctx.inputs * ctx.inputs

    # Create a join to collect results
    collect_results = g.join(reduce_list_append, initial_factory=list[int])

    # Build the graph with map operation
    g.add(
        g.edge_from(g.start_node).map().to(square),
        g.edge_from(square).to(collect_results),
        g.edge_from(collect_results).to(g.end_node),
    )

    graph = g.build()
    state = ProcessingState()
    result = await graph.run(state=state, inputs=[1, 2, 3, 4, 5])

    print(f'Results: {sorted(result)}')
    #> Results: [1, 4, 9, 16, 25]
    print(f'Items processed: {state.items_processed}')
    #> Items processed: 5
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

In this example:

1.  The start node receives a list of integers
2.  The `.map()` operation fans out each item to a separate parallel execution of the `square` step
3.  All results are collected back together using [`reduce_list_append`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.reduce_list_append)
4.  The joined results flow to the end node

## Next Steps

Explore the detailed documentation for each feature:

-   [**Steps**](/docs/ai/graph/beta/steps) - Learn about step nodes and execution contexts
-   [**Joins**](/docs/ai/graph/beta/joins) - Understand join nodes and reducer patterns
-   [**Decisions**](/docs/ai/graph/beta/decisions) - Implement conditional branching
-   [**Parallel Execution**](/docs/ai/graph/beta/parallel) - Master broadcasting and mapping

## Advanced Execution Control

Beyond the basic [`graph.run()`](/docs/ai/api/pydantic_graph/beta_graph/#pydantic_graph.beta.graph.Graph.run) method, the beta API provides fine-grained control over graph execution.

### Step-by-Step Execution

Use [`graph.iter()`](/docs/ai/api/pydantic_graph/beta_graph/#pydantic_graph.beta.graph.Graph.iter) to execute the graph one step at a time:

step\_by\_step.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext


@dataclass
class CounterState:
    value: int = 0


async def main():
    g = GraphBuilder(state_type=CounterState, output_type=int)

    @g.step
    async def increment(ctx: StepContext[CounterState, None, None]) -> int:
        ctx.state.value += 1
        return ctx.state.value

    @g.step
    async def double_it(ctx: StepContext[CounterState, None, int]) -> int:
        return ctx.inputs * 2

    g.add(
        g.edge_from(g.start_node).to(increment),
        g.edge_from(increment).to(double_it),
        g.edge_from(double_it).to(g.end_node),
    )

    graph = g.build()
    state = CounterState()

    # Use iter() for step-by-step execution
    async with graph.iter(state=state) as graph_run:
        print(f'Initial state: {state.value}')
        #> Initial state: 0

        # Advance execution step by step
        async for event in graph_run:
            print(f'{state.value=} | {event=}')
            #> state.value=0 | event=[GraphTask(node_id='increment', inputs=None)]
            #> state.value=1 | event=[GraphTask(node_id='double_it', inputs=1)]
            #> state.value=1 | event=[GraphTask(node_id='__end__', inputs=2)]
            #> state.value=1 | event=EndMarker(_value=2)
            if graph_run.output is not None:
                print(f'Final output: {graph_run.output}')
                #> Final output: 2
                break
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

The [`GraphRun`](/docs/ai/api/pydantic_graph/beta_graph/#pydantic_graph.beta.graph.GraphRun) object provides:

-   **Async iteration**: Iterate through execution events
-   **`next_task` property**: Inspect upcoming tasks
-   **`output` property**: Check if the graph has completed and get the final output
-   **`next()` method**: Manually advance execution with optional value injection

### Visualizing Graphs

Generate Mermaid diagrams of your graph structure using [`graph.render()`](/docs/ai/api/pydantic_graph/beta_graph/#pydantic_graph.beta.graph.Graph.render):

visualize\_graph.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext


@dataclass
class SimpleState:
    pass


g = GraphBuilder(state_type=SimpleState, output_type=str)

@g.step
async def step_a(ctx: StepContext[SimpleState, None, None]) -> int:
    return 10

@g.step
async def step_b(ctx: StepContext[SimpleState, None, int]) -> str:
    return f'Result: {ctx.inputs}'

g.add(
    g.edge_from(g.start_node).to(step_a),
    g.edge_from(step_a).to(step_b),
    g.edge_from(step_b).to(g.end_node),
)

graph = g.build()

# Generate a Mermaid diagram
mermaid_diagram = graph.render(title='My Graph', direction='LR')
print(mermaid_diagram)
"""
---
title: My Graph
---
stateDiagram-v2
  direction LR
  step_a
  step_b

  [*] --> step_a
  step_a --> step_b
  step_b --> [*]
"""
```

The rendered diagram can be displayed in documentation, notebooks, or any tool that supports Mermaid syntax.

## Comparison with Original API

The original graph API (documented in the [main graph page](/docs/ai/graph/)) uses a class-based approach with [`BaseNode`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode) subclasses. The beta API uses a builder pattern with decorated functions, which provides:

**Advantages:**

-   More concise syntax for simple workflows
-   Explicit control over parallelism with map/broadcast
-   Built-in reducers for common aggregation patterns
-   Easier to visualize complex data flows

**Trade-offs:**

-   Requires understanding of builder patterns
-   Less object-oriented, more functional style

Both APIs are fully supported and can even be integrated together when needed.

## Persistence and Resumability

No Native Persistence

Unlike the [original Graph API](/docs/ai/graph/#state-persistence), the beta graph API does not include built-in state persistence. This is due to the [complexity of achieving consistent snapshotting with parallel execution](https://github.com/pydantic/pydantic-ai/issues/530#issuecomment-3504609992).

For workflows that need to preserve progress across failures, restarts, or long-running operations, use one of the supported [durable execution](/docs/ai/integrations/durable_execution/overview) solutions.

---

# [Joins & Reducers](https://pydantic.dev/docs/ai/graph/beta/joins/)

# Joins & Reducers

Join nodes synchronize and aggregate data from parallel execution paths. They use **Reducers** to combine multiple inputs into a single output.

When you use [parallel execution](/docs/ai/graph/beta/parallel) (broadcasting or mapping), you often need to collect and combine the results. Join nodes serve this purpose by:

1.  Waiting for all parallel tasks to complete
2.  Aggregating their outputs using a [`ReducerFunction`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.ReducerFunction)
3.  Passing the aggregated result to the next node

## Creating Joins

Create a join using `GraphBuilder.join` with a reducer function and initial value or factory:

basic\_join.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext
from pydantic_graph.beta.join import reduce_list_append


@dataclass
class SimpleState:
    pass


g = GraphBuilder(state_type=SimpleState, output_type=list[int])

@g.step
async def generate_numbers(ctx: StepContext[SimpleState, None, None]) -> list[int]:
    return [1, 2, 3, 4, 5]

@g.step
async def square(ctx: StepContext[SimpleState, None, int]) -> int:
    return ctx.inputs * ctx.inputs

# Create a join to collect all squared values
collect = g.join(reduce_list_append, initial_factory=list[int])

g.add(
    g.edge_from(g.start_node).to(generate_numbers),
    g.edge_from(generate_numbers).map().to(square),
    g.edge_from(square).to(collect),
    g.edge_from(collect).to(g.end_node),
)

graph = g.build()

async def main():
    result = await graph.run(state=SimpleState())
    print(sorted(result))
    #> [1, 4, 9, 16, 25]
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

## Built-in Reducers

Pydantic Graph provides several common reducer types out of the box:

### `reduce_list_append`

[`reduce_list_append`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.reduce_list_append) collects all inputs into a list:

list\_reducer.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext
from pydantic_graph.beta.join import reduce_list_append


@dataclass
class SimpleState:
    pass


async def main():
    g = GraphBuilder(state_type=SimpleState, output_type=list[str])

    @g.step
    async def generate(ctx: StepContext[SimpleState, None, None]) -> list[int]:
        return [10, 20, 30]

    @g.step
    async def to_string(ctx: StepContext[SimpleState, None, int]) -> str:
        return f'value-{ctx.inputs}'

    collect = g.join(reduce_list_append, initial_factory=list[str])

    g.add(
        g.edge_from(g.start_node).to(generate),
        g.edge_from(generate).map().to(to_string),
        g.edge_from(to_string).to(collect),
        g.edge_from(collect).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run(state=SimpleState())
    print(sorted(result))
    #> ['value-10', 'value-20', 'value-30']
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

### `reduce_list_extend`

[`reduce_list_extend`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.reduce_list_extend) extends a list with an iterable of items:

list\_extend\_reducer.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext
from pydantic_graph.beta.join import reduce_list_extend


@dataclass
class SimpleState:
    pass


async def main():
    g = GraphBuilder(state_type=SimpleState, output_type=list[int])

    @g.step
    async def generate(ctx: StepContext[SimpleState, None, None]) -> list[int]:
        return [1, 2, 3]

    @g.step
    async def create_range(ctx: StepContext[SimpleState, None, int]) -> list[int]:
        """Create a range from 0 to the input value."""
        return list(range(ctx.inputs))

    collect = g.join(reduce_list_extend, initial_factory=list[int])

    g.add(
        g.edge_from(g.start_node).to(generate),
        g.edge_from(generate).map().to(create_range),
        g.edge_from(create_range).to(collect),
        g.edge_from(collect).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run(state=SimpleState())
    print(sorted(result))
    #> [0, 0, 0, 1, 1, 2]
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

### `reduce_dict_update`

[`reduce_dict_update`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.reduce_dict_update) merges dictionaries together:

dict\_reducer.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext
from pydantic_graph.beta.join import reduce_dict_update


@dataclass
class SimpleState:
    pass


async def main():
    g = GraphBuilder(state_type=SimpleState, output_type=dict[str, int])

    @g.step
    async def generate_keys(ctx: StepContext[SimpleState, None, None]) -> list[str]:
        return ['apple', 'banana', 'cherry']

    @g.step
    async def create_entry(ctx: StepContext[SimpleState, None, str]) -> dict[str, int]:
        return {ctx.inputs: len(ctx.inputs)}

    merge = g.join(reduce_dict_update, initial_factory=dict[str, int])

    g.add(
        g.edge_from(g.start_node).to(generate_keys),
        g.edge_from(generate_keys).map().to(create_entry),
        g.edge_from(create_entry).to(merge),
        g.edge_from(merge).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run(state=SimpleState())
    result = {k: result[k] for k in sorted(result)}  # force deterministic ordering
    print(result)
    #> {'apple': 5, 'banana': 6, 'cherry': 6}
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

### `reduce_null`

[`reduce_null`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.reduce_null) discards all inputs and returns `None`. Useful when you only care about side effects:

null\_reducer.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext
from pydantic_graph.beta.join import reduce_null


@dataclass
class CounterState:
    total: int = 0


async def main():
    g = GraphBuilder(state_type=CounterState, output_type=int)

    @g.step
    async def generate(ctx: StepContext[CounterState, None, None]) -> list[int]:
        return [1, 2, 3, 4, 5]

    @g.step
    async def accumulate(ctx: StepContext[CounterState, None, int]) -> int:
        ctx.state.total += ctx.inputs
        return ctx.inputs

    # We don't care about the outputs, only the side effect on state
    ignore = g.join(reduce_null, initial=None)

    @g.step
    async def get_total(ctx: StepContext[CounterState, None, None]) -> int:
        return ctx.state.total

    g.add(
        g.edge_from(g.start_node).to(generate),
        g.edge_from(generate).map().to(accumulate),
        g.edge_from(accumulate).to(ignore),
        g.edge_from(ignore).to(get_total),
        g.edge_from(get_total).to(g.end_node),
    )

    graph = g.build()
    state = CounterState()
    result = await graph.run(state=state)
    print(result)
    #> 15
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

### `reduce_sum`

[`reduce_sum`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.reduce_sum) sums numeric values:

sum\_reducer.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext
from pydantic_graph.beta.join import reduce_sum


@dataclass
class SimpleState:
    pass


async def main():
    g = GraphBuilder(state_type=SimpleState, output_type=int)

    @g.step
    async def generate(ctx: StepContext[SimpleState, None, None]) -> list[int]:
        return [10, 20, 30, 40]

    @g.step
    async def identity(ctx: StepContext[SimpleState, None, int]) -> int:
        return ctx.inputs

    sum_join = g.join(reduce_sum, initial=0)

    g.add(
        g.edge_from(g.start_node).to(generate),
        g.edge_from(generate).map().to(identity),
        g.edge_from(identity).to(sum_join),
        g.edge_from(sum_join).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run(state=SimpleState())
    print(result)
    #> 100
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

### `ReduceFirstValue`

[`ReduceFirstValue`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.ReduceFirstValue) returns the first value it receives and cancels all other parallel tasks. This is useful for "race" scenarios where you want the first successful result:

first\_value\_reducer.py

```python
import asyncio
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext
from pydantic_graph.beta.join import ReduceFirstValue


@dataclass
class SimpleState:
    tasks_completed: int = 0


async def main():
    g = GraphBuilder(state_type=SimpleState, output_type=str)

    @g.step
    async def generate(ctx: StepContext[SimpleState, None, None]) -> list[int]:
        return [1, 12, 13, 14, 15]

    @g.step
    async def slow_process(ctx: StepContext[SimpleState, None, int]) -> str:
        """Simulate variable processing times."""
        # Simulate different delays
        await asyncio.sleep(ctx.inputs * 0.1)
        ctx.state.tasks_completed += 1
        return f'Result from task {ctx.inputs}'

    # Use ReduceFirstValue to get the first result and cancel the rest
    first_result = g.join(ReduceFirstValue[str](), initial=None, node_id='first_result')

    g.add(
        g.edge_from(g.start_node).to(generate),
        g.edge_from(generate).map().to(slow_process),
        g.edge_from(slow_process).to(first_result),
        g.edge_from(first_result).to(g.end_node),
    )

    graph = g.build()
    state = SimpleState()
    result = await graph.run(state=state)

    print(result)
    #> Result from task 1
    print(f'Tasks completed: {state.tasks_completed}')
    #> Tasks completed: 1
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

## Custom Reducers

Create custom reducers by defining a [`ReducerFunction`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.ReducerFunction):

custom\_reducer.py

```python

from pydantic_graph.beta import GraphBuilder, StepContext


def reduce_sum(current: int, inputs: int) -> int:
    """A reducer that sums numbers."""
    return current + inputs


async def main():
    g = GraphBuilder(output_type=int)

    @g.step
    async def generate(ctx: StepContext[None, None, None]) -> list[int]:
        return [5, 10, 15, 20]

    @g.step
    async def identity(ctx: StepContext[None, None, int]) -> int:
        return ctx.inputs

    sum_join = g.join(reduce_sum, initial=0)

    g.add(
        g.edge_from(g.start_node).to(generate),
        g.edge_from(generate).map().to(identity),
        g.edge_from(identity).to(sum_join),
        g.edge_from(sum_join).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run()
    print(result)
    #> 50
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

## Reducers with State Access

Reducers can access and modify the graph state:

stateful\_reducer.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext
from pydantic_graph.beta.join import ReducerContext


@dataclass
class MetricsState:
    total_count: int = 0
    total_sum: int = 0


@dataclass
class ReducedMetrics:
    count: int = 0
    sum: int = 0


def reduce_metrics_sum(ctx: ReducerContext[MetricsState, None], current: ReducedMetrics, inputs: int) -> ReducedMetrics:
    ctx.state.total_count += 1
    ctx.state.total_sum += inputs
    return ReducedMetrics(count=current.count + 1, sum=current.sum + inputs)

def reduce_metrics_max(current: ReducedMetrics, inputs: ReducedMetrics) -> ReducedMetrics:
    return ReducedMetrics(count=max(current.count, inputs.count), sum=max(current.sum, inputs.sum))


async def main():
    g = GraphBuilder(state_type=MetricsState, output_type=dict[str, int])

    @g.step
    async def generate(ctx: StepContext[object, None, None]) -> list[int]:
        return [1, 3, 5, 7, 9, 10, 20, 30, 40]

    @g.step
    async def process_even(ctx: StepContext[MetricsState, None, int]) -> int:
        return ctx.inputs * 2

    @g.step
    async def process_odd(ctx: StepContext[MetricsState, None, int]) -> int:
        return ctx.inputs * 3

    metrics_even = g.join(reduce_metrics_sum, initial_factory=ReducedMetrics, node_id='metrics_even')
    metrics_odd = g.join(reduce_metrics_sum, initial_factory=ReducedMetrics, node_id='metrics_odd')
    metrics_max = g.join(reduce_metrics_max, initial_factory=ReducedMetrics, node_id='metrics_max')

    g.add(
        g.edge_from(g.start_node).to(generate),
        # Send even and odd numbers to their respective `process` steps
        g.edge_from(generate).map().to(
            g.decision()
            .branch(g.match(int, matches=lambda x: x % 2 == 0).label('even').to(process_even))
            .branch(g.match(int, matches=lambda x: x % 2 == 1).label('odd').to(process_odd))
        ),
        # Reduce metrics for even and odd numbers separately
        g.edge_from(process_even).to(metrics_even),
        g.edge_from(process_odd).to(metrics_odd),
        # Aggregate the max values for each field
        g.edge_from(metrics_even).to(metrics_max),
        g.edge_from(metrics_odd).to(metrics_max),
        # Finish the graph run with the final reduced value
        g.edge_from(metrics_max).to(g.end_node),
    )

    graph = g.build()
    state = MetricsState()
    result = await graph.run(state=state)

    print(f'Result: {result}')
    #> Result: ReducedMetrics(count=5, sum=200)
    print(f'State total_count: {state.total_count}')
    #> State total_count: 9
    print(f'State total_sum: {state.total_sum}')
    #> State total_sum: 275
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

### Canceling Sibling Tasks

Reducers with access to [`ReducerContext`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.ReducerContext) can call [`ctx.cancel_sibling_tasks()`](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join.ReducerContext.cancel_sibling_tasks) to cancel all other parallel tasks in the same fork. This is useful for early termination when you've found what you need:

cancel\_siblings.py

```python
import asyncio
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext
from pydantic_graph.beta.join import ReducerContext


@dataclass
class SearchState:
    searches_completed: int = 0


def reduce_find_match(ctx: ReducerContext[SearchState, None], current: str | None, inputs: str) -> str | None:
    """Return the first input that contains 'target' and cancel remaining tasks."""
    if current is not None:
        # We already found a match, ignore subsequent inputs
        return current
    if 'target' in inputs:
        # Found a match! Cancel all other parallel tasks
        ctx.cancel_sibling_tasks()
        return inputs
    return None


async def main():
    g = GraphBuilder(state_type=SearchState, output_type=str | None)

    @g.step
    async def generate_searches(ctx: StepContext[SearchState, None, None]) -> list[str]:
        return ['item1', 'item2', 'target_item', 'item4', 'item5']

    @g.step
    async def search(ctx: StepContext[SearchState, None, str]) -> str:
        """Simulate a slow search operation."""
        # make the search artificially slower for 'item4' and 'item5'
        search_duration = 0.1 if ctx.inputs not in {'item4', 'item5'} else 1.0
        await asyncio.sleep(search_duration)
        ctx.state.searches_completed += 1
        return ctx.inputs

    find_match = g.join(reduce_find_match, initial=None)

    g.add(
        g.edge_from(g.start_node).to(generate_searches),
        g.edge_from(generate_searches).map().to(search),
        g.edge_from(search).to(find_match),
        g.edge_from(find_match).to(g.end_node),
    )

    graph = g.build()
    state = SearchState()
    result = await graph.run(state=state)

    print(f'Found: {result}')
    #> Found: target_item
    print(f'Searches completed: {state.searches_completed}')
    #> Searches completed: 3
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

Note that only 3 searches completed instead of all 5, because the reducer canceled the remaining tasks after finding a match.

## Multiple Joins

A graph can have multiple independent joins:

multiple\_joins.py

```python
from dataclasses import dataclass, field

from pydantic_graph.beta import GraphBuilder, StepContext
from pydantic_graph.beta.join import reduce_list_append


@dataclass
class MultiState:
    results: dict[str, list[int]] = field(default_factory=dict)


async def main():
    g = GraphBuilder(state_type=MultiState, output_type=dict[str, list[int]])

    @g.step
    async def source_a(ctx: StepContext[MultiState, None, None]) -> list[int]:
        return [1, 2, 3]

    @g.step
    async def source_b(ctx: StepContext[MultiState, None, None]) -> list[int]:
        return [10, 20]

    @g.step
    async def process_a(ctx: StepContext[MultiState, None, int]) -> int:
        return ctx.inputs * 2

    @g.step
    async def process_b(ctx: StepContext[MultiState, None, int]) -> int:
        return ctx.inputs * 3

    join_a = g.join(reduce_list_append, initial_factory=list[int], node_id='join_a')
    join_b = g.join(reduce_list_append, initial_factory=list[int], node_id='join_b')

    @g.step
    async def store_a(ctx: StepContext[MultiState, None, list[int]]) -> None:
        ctx.state.results['a'] = ctx.inputs

    @g.step
    async def store_b(ctx: StepContext[MultiState, None, list[int]]) -> None:
        ctx.state.results['b'] = ctx.inputs

    @g.step
    async def combine(ctx: StepContext[MultiState, None, None]) -> dict[str, list[int]]:
        return ctx.state.results

    g.add(
        g.edge_from(g.start_node).to(source_a, source_b),
        g.edge_from(source_a).map().to(process_a),
        g.edge_from(source_b).map().to(process_b),
        g.edge_from(process_a).to(join_a),
        g.edge_from(process_b).to(join_b),
        g.edge_from(join_a).to(store_a),
        g.edge_from(join_b).to(store_b),
        g.edge_from(store_a, store_b).to(combine),
        g.edge_from(combine).to(g.end_node),
    )

    graph = g.build()
    state = MultiState()
    result = await graph.run(state=state)

    print(f"Group A: {sorted(result['a'])}")
    #> Group A: [2, 4, 6]
    print(f"Group B: {sorted(result['b'])}")
    #> Group B: [30, 60]
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

## Customizing Join Nodes

### Custom Node IDs

Like steps, joins can have custom IDs:

join\_custom\_id.py

```python
from pydantic_graph.beta.join import reduce_list_append

from basic_join import g

my_join = g.join(reduce_list_append, initial_factory=list[int], node_id='my_custom_join_id')
```

## How Joins Work

Internally, the graph tracks which "fork" each parallel task belongs to. A join:

1.  Identifies its parent fork (the fork that created the parallel paths)
2.  Waits for all tasks from that fork to reach the join
3.  Calls `reduce()` for each incoming value
4.  Calls `finalize()` once all values are received
5.  Passes the finalized result to downstream nodes

This ensures proper synchronization even with nested parallel operations.

## Next Steps

-   Learn about [parallel execution](/docs/ai/graph/beta/parallel) with broadcasting and mapping
-   Explore [conditional branching](/docs/ai/graph/beta/decisions) with decision nodes
-   See the [API reference](/docs/ai/api/pydantic_graph/beta_join/#pydantic_graph.beta.join) for complete reducer documentation

---

# [Parallel Execution](https://pydantic.dev/docs/ai/graph/beta/parallel/)

# Parallel Execution

The beta graph API provides two powerful mechanisms for parallel execution: **broadcasting** and **mapping**.

-   **Broadcasting** - Send the same data to multiple parallel paths
-   **Spreading** - Fan out items from an iterable to parallel paths

Both create "forks" in the execution graph that can later be synchronized with [join nodes](/docs/ai/graph/beta/joins).

## Broadcasting

Broadcasting sends identical data to multiple destinations simultaneously:

basic\_broadcast.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext
from pydantic_graph.beta.join import reduce_list_append


@dataclass
class SimpleState:
    pass


async def main():
    g = GraphBuilder(state_type=SimpleState, output_type=list[int])

    @g.step
    async def source(ctx: StepContext[SimpleState, None, None]) -> int:
        return 10

    @g.step
    async def add_one(ctx: StepContext[SimpleState, None, int]) -> int:
        return ctx.inputs + 1

    @g.step
    async def add_two(ctx: StepContext[SimpleState, None, int]) -> int:
        return ctx.inputs + 2

    @g.step
    async def add_three(ctx: StepContext[SimpleState, None, int]) -> int:
        return ctx.inputs + 3

    collect = g.join(reduce_list_append, initial_factory=list[int])

    # Broadcasting: send the value from source to all three steps
    g.add(
        g.edge_from(g.start_node).to(source),
        g.edge_from(source).to(add_one, add_two, add_three),
        g.edge_from(add_one, add_two, add_three).to(collect),
        g.edge_from(collect).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run(state=SimpleState())
    print(sorted(result))
    #> [11, 12, 13]
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

All three steps receive the same input value (`10`) and execute in parallel.

## Spreading

Spreading fans out elements from an iterable, processing each element in parallel:

basic\_map.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext
from pydantic_graph.beta.join import reduce_list_append


@dataclass
class SimpleState:
    pass


async def main():
    g = GraphBuilder(state_type=SimpleState, output_type=list[int])

    @g.step
    async def generate_list(ctx: StepContext[SimpleState, None, None]) -> list[int]:
        return [1, 2, 3, 4, 5]

    @g.step
    async def square(ctx: StepContext[SimpleState, None, int]) -> int:
        return ctx.inputs * ctx.inputs

    collect = g.join(reduce_list_append, initial_factory=list[int])

    # Spreading: each item in the list gets its own parallel execution
    g.add(
        g.edge_from(g.start_node).to(generate_list),
        g.edge_from(generate_list).map().to(square),
        g.edge_from(square).to(collect),
        g.edge_from(collect).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run(state=SimpleState())
    print(sorted(result))
    #> [1, 4, 9, 16, 25]
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

### Spreading AsyncIterables

The `.map()` operation also works with `AsyncIterable` values. When mapping over an async iterable, the graph creates parallel tasks dynamically as values are yielded. This is particularly useful for streaming data or processing data that's being generated on-the-fly:

async\_iterable\_map.py

```python
import asyncio
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext
from pydantic_graph.beta.join import reduce_list_append


@dataclass
class SimpleState:
    pass


async def main():
    g = GraphBuilder(state_type=SimpleState, output_type=list[int])

    @g.stream
    async def stream_numbers(ctx: StepContext[SimpleState, None, None]):
        """Stream numbers with delays to simulate real-time data."""
        for i in range(1, 4):
            await asyncio.sleep(0.05)  # Simulate delay
            yield i

    @g.step
    async def triple(ctx: StepContext[SimpleState, None, int]) -> int:
        return ctx.inputs * 3

    collect = g.join(reduce_list_append, initial_factory=list[int])

    g.add(
        g.edge_from(g.start_node).to(stream_numbers),
        # Map over the async iterable - tasks created as items are yielded
        g.edge_from(stream_numbers).map().to(triple),
        g.edge_from(triple).to(collect),
        g.edge_from(collect).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run(state=SimpleState())
    print(sorted(result))
    #> [3, 6, 9]
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

This allows for progressive processing where downstream steps can start working on early results while later results are still being generated.

### Using `add_mapping_edge()`

The convenience method [`add_mapping_edge()`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder.add_mapping_edge) provides a simpler syntax:

mapping\_convenience.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext
from pydantic_graph.beta.join import reduce_list_append


@dataclass
class SimpleState:
    pass


async def main():
    g = GraphBuilder(state_type=SimpleState, output_type=list[str])

    @g.step
    async def generate_numbers(ctx: StepContext[SimpleState, None, None]) -> list[int]:
        return [10, 20, 30]

    @g.step
    async def stringify(ctx: StepContext[SimpleState, None, int]) -> str:
        return f'Value: {ctx.inputs}'

    collect = g.join(reduce_list_append, initial_factory=list[str])

    g.add(g.edge_from(g.start_node).to(generate_numbers))
    g.add_mapping_edge(generate_numbers, stringify)
    g.add(
        g.edge_from(stringify).to(collect),
        g.edge_from(collect).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run(state=SimpleState())
    print(sorted(result))
    #> ['Value: 10', 'Value: 20', 'Value: 30']
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

## Empty Iterables

When mapping an empty iterable, you can specify a `downstream_join_id` to ensure the join still executes:

empty\_map.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext
from pydantic_graph.beta.join import reduce_list_append


@dataclass
class SimpleState:
    pass


async def main():
    g = GraphBuilder(state_type=SimpleState, output_type=list[int])

    @g.step
    async def generate_empty(ctx: StepContext[SimpleState, None, None]) -> list[int]:
        return []

    @g.step
    async def double(ctx: StepContext[SimpleState, None, int]) -> int:
        return ctx.inputs * 2

    collect = g.join(reduce_list_append, initial_factory=list[int])

    g.add(g.edge_from(g.start_node).to(generate_empty))
    g.add_mapping_edge(generate_empty, double, downstream_join_id=collect.id)
    g.add(
        g.edge_from(double).to(collect),
        g.edge_from(collect).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run(state=SimpleState())
    print(result)
    #> []
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

## Nested Parallel Operations

You can nest broadcasts and maps for complex parallel patterns:

### Spread then Broadcast

map\_then\_broadcast.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext
from pydantic_graph.beta.join import reduce_list_append


@dataclass
class SimpleState:
    pass


async def main():
    g = GraphBuilder(state_type=SimpleState, output_type=list[int])

    @g.step
    async def generate_list(ctx: StepContext[SimpleState, None, None]) -> list[int]:
        return [10, 20]

    @g.step
    async def add_one(ctx: StepContext[SimpleState, None, int]) -> int:
        return ctx.inputs + 1

    @g.step
    async def add_two(ctx: StepContext[SimpleState, None, int]) -> int:
        return ctx.inputs + 2

    collect = g.join(reduce_list_append, initial_factory=list[int])

    g.add(
        g.edge_from(g.start_node).to(generate_list),
        # Spread the list, then broadcast each item to both steps
        g.edge_from(generate_list).map().to(add_one, add_two),
        g.edge_from(add_one, add_two).to(collect),
        g.edge_from(collect).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run(state=SimpleState())
    print(sorted(result))
    #> [11, 12, 21, 22]
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

The result contains:

-   From 10: `10+1=11` and `10+2=12`
-   From 20: `20+1=21` and `20+2=22`

### Multiple Sequential Spreads

sequential\_maps.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext
from pydantic_graph.beta.join import reduce_list_append


@dataclass
class SimpleState:
    pass


async def main():
    g = GraphBuilder(state_type=SimpleState, output_type=list[str])

    @g.step
    async def generate_pairs(ctx: StepContext[SimpleState, None, None]) -> list[tuple[int, int]]:
        return [(1, 2), (3, 4)]

    @g.step
    async def unpack_pair(ctx: StepContext[SimpleState, None, tuple[int, int]]) -> list[int]:
        return [ctx.inputs[0], ctx.inputs[1]]

    @g.step
    async def stringify(ctx: StepContext[SimpleState, None, int]) -> str:
        return f'num:{ctx.inputs}'

    collect = g.join(reduce_list_append, initial_factory=list[str])

    g.add(
        g.edge_from(g.start_node).to(generate_pairs),
        # First map: one task per tuple
        g.edge_from(generate_pairs).map().to(unpack_pair),
        # Second map: one task per number in each tuple
        g.edge_from(unpack_pair).map().to(stringify),
        g.edge_from(stringify).to(collect),
        g.edge_from(collect).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run(state=SimpleState())
    print(sorted(result))
    #> ['num:1', 'num:2', 'num:3', 'num:4']
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

## Edge Labels

Add labels to parallel edges for better documentation:

labeled\_parallel.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext
from pydantic_graph.beta.join import reduce_list_append


@dataclass
class SimpleState:
    pass


async def main():
    g = GraphBuilder(state_type=SimpleState, output_type=list[str])

    @g.step
    async def generate(ctx: StepContext[SimpleState, None, None]) -> list[int]:
        return [1, 2, 3]

    @g.step
    async def process(ctx: StepContext[SimpleState, None, int]) -> str:
        return f'item-{ctx.inputs}'

    collect = g.join(reduce_list_append, initial_factory=list[str])

    g.add(g.edge_from(g.start_node).to(generate))
    g.add_mapping_edge(
        generate,
        process,
        pre_map_label='before map',
        post_map_label='after map',
    )
    g.add(
        g.edge_from(process).to(collect),
        g.edge_from(collect).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run(state=SimpleState())
    print(sorted(result))
    #> ['item-1', 'item-2', 'item-3']
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

## State Sharing in Parallel Execution

All parallel tasks share the same graph state. Be careful with mutations:

parallel\_state.py

```python
from dataclasses import dataclass, field

from pydantic_graph.beta import GraphBuilder, StepContext
from pydantic_graph.beta.join import reduce_list_append


@dataclass
class CounterState:
    values: list[int] = field(default_factory=list)


async def main():
    g = GraphBuilder(state_type=CounterState, output_type=list[int])

    @g.step
    async def generate(ctx: StepContext[CounterState, None, None]) -> list[int]:
        return [1, 2, 3]

    @g.step
    async def track_and_square(ctx: StepContext[CounterState, None, int]) -> int:
        # All parallel tasks mutate the same state
        ctx.state.values.append(ctx.inputs)
        return ctx.inputs * ctx.inputs

    collect = g.join(reduce_list_append, initial_factory=list[int])

    g.add(
        g.edge_from(g.start_node).to(generate),
        g.edge_from(generate).map().to(track_and_square),
        g.edge_from(track_and_square).to(collect),
        g.edge_from(collect).to(g.end_node),
    )

    graph = g.build()
    state = CounterState()
    result = await graph.run(state=state)

    print(f'Squared: {sorted(result)}')
    #> Squared: [1, 4, 9]
    print(f'Tracked: {sorted(state.values)}')
    #> Tracked: [1, 2, 3]
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

## Edge Transformations

You can transform data inline as it flows along edges using the `.transform()` method:

edge\_transform.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext


@dataclass
class SimpleState:
    pass


async def main():
    g = GraphBuilder(state_type=SimpleState, output_type=str)

    @g.step
    async def generate_number(ctx: StepContext[SimpleState, None, None]) -> int:
        return 42

    @g.step
    async def format_output(ctx: StepContext[SimpleState, None, str]) -> str:
        return f'The answer is: {ctx.inputs}'

    # Transform the number to a string inline
    g.add(
        g.edge_from(g.start_node).to(generate_number),
        g.edge_from(generate_number).transform(lambda ctx: str(ctx.inputs * 2)).to(format_output),
        g.edge_from(format_output).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run(state=SimpleState())
    print(result)
    #> The answer is: 84
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

The transform function receives a [`StepContext`](/docs/ai/api/pydantic_graph/beta_step/#pydantic_graph.beta.step.StepContext) with the current inputs and has access to state and dependencies. This is useful for:

-   Converting data types between incompatible steps
-   Extracting specific fields from complex objects
-   Applying simple computations without creating a full step
-   Adapting data formats during routing

Transforms can be chained and combined with other edge operations like `.map()` and `.label()`:

chained\_transforms.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext
from pydantic_graph.beta.join import reduce_list_append


@dataclass
class SimpleState:
    pass


async def main():
    g = GraphBuilder(state_type=SimpleState, output_type=list[str])

    @g.step
    async def generate_data(ctx: StepContext[SimpleState, None, None]) -> list[dict[str, int]]:
        return [{'value': 10}, {'value': 20}, {'value': 30}]

    @g.step
    async def process_number(ctx: StepContext[SimpleState, None, int]) -> str:
        return f'Processed: {ctx.inputs}'

    collect = g.join(reduce_list_append, initial_factory=list[str])

    g.add(
        g.edge_from(g.start_node).to(generate_data),
        # Transform to extract values, then map over them
        g.edge_from(generate_data)
        .transform(lambda ctx: [item['value'] for item in ctx.inputs])
        .label('Extract values')
        .map()
        .to(process_number),
        g.edge_from(process_number).to(collect),
        g.edge_from(collect).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run(state=SimpleState())
    print(sorted(result))
    #> ['Processed: 10', 'Processed: 20', 'Processed: 30']
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

## Next Steps

-   Learn about [join nodes](/docs/ai/graph/beta/joins) for aggregating parallel results
-   Explore [conditional branching](/docs/ai/graph/beta/decisions) with decision nodes
-   See the [steps documentation](/docs/ai/graph/beta/steps) for more on step execution

---

# [Steps](https://pydantic.dev/docs/ai/graph/beta/steps/)

# Steps

Steps are the fundamental units of work in a graph. They're async functions that receive a [`StepContext`](/docs/ai/api/pydantic_graph/beta_step/#pydantic_graph.beta.step.StepContext) and return a value.

## Creating Steps

Steps are created using the [`@g.step`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder.step) decorator on the [`GraphBuilder`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder):

basic\_step.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext


@dataclass
class MyState:
    counter: int = 0

g = GraphBuilder(state_type=MyState, output_type=int)

@g.step
async def increment(ctx: StepContext[MyState, None, None]) -> int:
    ctx.state.counter += 1
    return ctx.state.counter

g.add(
    g.edge_from(g.start_node).to(increment),
    g.edge_from(increment).to(g.end_node),
)

graph = g.build()

async def main():
    state = MyState()
    result = await graph.run(state=state)
    print(result)
    #> 1
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

## Step Context

Every step function receives a [`StepContext`](/docs/ai/api/pydantic_graph/beta_step/#pydantic_graph.beta.step.StepContext) as its first parameter. The context provides access to:

-   `ctx.state` - The mutable graph state (type: `StateT`)
-   `ctx.deps` - Injected dependencies (type: `DepsT`)
-   `ctx.inputs` - Input data for this step (type: `InputT`)

### Accessing State

State is shared across all steps in a graph and can be freely mutated:

state\_access.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext


@dataclass
class AppState:
    messages: list[str]


async def main():
    g = GraphBuilder(state_type=AppState, output_type=list[str])

    @g.step
    async def add_hello(ctx: StepContext[AppState, None, None]) -> None:
        ctx.state.messages.append('Hello')

    @g.step
    async def add_world(ctx: StepContext[AppState, None, None]) -> None:
        ctx.state.messages.append('World')

    @g.step
    async def get_messages(ctx: StepContext[AppState, None, None]) -> list[str]:
        return ctx.state.messages

    g.add(
        g.edge_from(g.start_node).to(add_hello),
        g.edge_from(add_hello).to(add_world),
        g.edge_from(add_world).to(get_messages),
        g.edge_from(get_messages).to(g.end_node),
    )

    graph = g.build()
    state = AppState(messages=[])
    result = await graph.run(state=state)
    print(result)
    #> ['Hello', 'World']
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

### Working with Inputs

Steps can receive and transform input data:

step\_inputs.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext


@dataclass
class SimpleState:
    pass


async def main():
    g = GraphBuilder(
        state_type=SimpleState,
        input_type=int,
        output_type=str,
    )

    @g.step
    async def double_it(ctx: StepContext[SimpleState, None, int]) -> int:
        """Double the input value."""
        return ctx.inputs * 2

    @g.step
    async def stringify(ctx: StepContext[SimpleState, None, int]) -> str:
        """Convert to a formatted string."""
        return f'Result: {ctx.inputs}'

    g.add(
        g.edge_from(g.start_node).to(double_it),
        g.edge_from(double_it).to(stringify),
        g.edge_from(stringify).to(g.end_node),
    )

    graph = g.build()
    result = await graph.run(state=SimpleState(), inputs=21)
    print(result)
    #> Result: 42
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

## Dependency Injection

Steps can access injected dependencies through `ctx.deps`:

dependencies.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext


@dataclass
class AppState:
    pass


@dataclass
class AppDeps:
    """Dependencies injected into the graph."""

    multiplier: int


async def main():
    g = GraphBuilder(
        state_type=AppState,
        deps_type=AppDeps,
        input_type=int,
        output_type=int,
    )

    @g.step
    async def multiply(ctx: StepContext[AppState, AppDeps, int]) -> int:
        """Multiply input by the injected multiplier."""
        return ctx.inputs * ctx.deps.multiplier

    g.add(
        g.edge_from(g.start_node).to(multiply),
        g.edge_from(multiply).to(g.end_node),
    )

    graph = g.build()
    deps = AppDeps(multiplier=10)
    result = await graph.run(state=AppState(), deps=deps, inputs=5)
    print(result)
    #> 50
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

## Customizing Steps

### Custom Node IDs

By default, step node IDs are inferred from the function name. You can override this:

custom\_id.py

```python
from pydantic_graph.beta import StepContext

from basic_step import MyState, g


@g.step(node_id='my_custom_id')
async def my_step(ctx: StepContext[MyState, None, None]) -> int:
    return 42

# The node ID is now 'my_custom_id' instead of 'my_step'
```

### Human-Readable Labels

Labels provide documentation for diagram generation:

labels.py

```python
from pydantic_graph.beta import StepContext

from basic_step import MyState, g


@g.step(label='Increment the counter')
async def increment(ctx: StepContext[MyState, None, None]) -> int:
    ctx.state.counter += 1
    return ctx.state.counter

# Access the label programmatically
print(increment.label)
#> Increment the counter
```

## Sequential Steps

Multiple steps can be chained sequentially:

sequential.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext


@dataclass
class MathState:
    operations: list[str]


async def main():
    g = GraphBuilder(
        state_type=MathState,
        input_type=int,
        output_type=int,
    )

    @g.step
    async def add_five(ctx: StepContext[MathState, None, int]) -> int:
        ctx.state.operations.append('add 5')
        return ctx.inputs + 5

    @g.step
    async def multiply_by_two(ctx: StepContext[MathState, None, int]) -> int:
        ctx.state.operations.append('multiply by 2')
        return ctx.inputs * 2

    @g.step
    async def subtract_three(ctx: StepContext[MathState, None, int]) -> int:
        ctx.state.operations.append('subtract 3')
        return ctx.inputs - 3

    # Connect steps sequentially
    g.add(
        g.edge_from(g.start_node).to(add_five),
        g.edge_from(add_five).to(multiply_by_two),
        g.edge_from(multiply_by_two).to(subtract_three),
        g.edge_from(subtract_three).to(g.end_node),
    )

    graph = g.build()
    state = MathState(operations=[])
    result = await graph.run(state=state, inputs=10)

    print(f'Result: {result}')
    #> Result: 27
    print(f'Operations: {state.operations}')
    #> Operations: ['add 5', 'multiply by 2', 'subtract 3']
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

The computation is: `(10 + 5) * 2 - 3 = 27`

## Streaming Steps

In addition to regular steps that return a single value, you can create streaming steps that yield multiple values over time using the [`@g.stream`](/docs/ai/api/pydantic_graph/beta_graph_builder/#pydantic_graph.beta.graph_builder.GraphBuilder.stream) decorator:

streaming\_step.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext
from pydantic_graph.beta.join import reduce_list_append


@dataclass
class SimpleState:
    pass


g = GraphBuilder(state_type=SimpleState, output_type=list[int])

@g.stream
async def generate_stream(ctx: StepContext[SimpleState, None, None]):
    """Stream numbers from 1 to 5."""
    for i in range(1, 6):
        yield i

@g.step
async def square(ctx: StepContext[SimpleState, None, int]) -> int:
    return ctx.inputs * ctx.inputs

collect = g.join(reduce_list_append, initial_factory=list[int])

g.add(
    g.edge_from(g.start_node).to(generate_stream),
    # The stream output is an AsyncIterable, so we can map over it
    g.edge_from(generate_stream).map().to(square),
    g.edge_from(square).to(collect),
    g.edge_from(collect).to(g.end_node),
)

graph = g.build()

async def main():
    result = await graph.run(state=SimpleState())
    print(sorted(result))
    #> [1, 4, 9, 16, 25]
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

### How Streaming Steps Work

Streaming steps return an `AsyncIterable` that yields values over time. When you use `.map()` on a streaming step's output, the graph processes each yielded value as it becomes available, creating parallel tasks dynamically. This is particularly useful for:

-   Processing data from APIs that stream responses
-   Handling real-time data feeds
-   Progressive processing of large datasets
-   Any scenario where you want to start processing results before all data is available

Like regular steps, streaming steps can also have custom node IDs and labels:

labeled\_stream.py

```python
from pydantic_graph.beta import StepContext

from streaming_step import SimpleState, g


@g.stream(node_id='my_stream', label='Generate numbers progressively')
async def labeled_stream(ctx: StepContext[SimpleState, None, None]):
    for i in range(10):
        yield i
```

## Edge Building Convenience Methods

The builder provides helper methods for common edge patterns:

### Simple Edges with `add_edge()`

add\_edge\_example.py

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext


@dataclass
class SimpleState:
    pass


async def main():
    g = GraphBuilder(state_type=SimpleState, output_type=int)

    @g.step
    async def step_a(ctx: StepContext[SimpleState, None, None]) -> int:
        return 10

    @g.step
    async def step_b(ctx: StepContext[SimpleState, None, int]) -> int:
        return ctx.inputs + 5

    # Using add_edge() for simple connections
    g.add_edge(g.start_node, step_a)
    g.add_edge(step_a, step_b, label='from a to b')
    g.add_edge(step_b, g.end_node)

    graph = g.build()
    result = await graph.run(state=SimpleState())
    print(result)
    #> 15
```

_(This example is complete, it can be run "as is" -- you'll need to add `import asyncio; asyncio.run(main())` to run `main`)_

## Type Safety

The beta graph API provides strong type checking through generics. Type parameters on [`StepContext`](/docs/ai/api/pydantic_graph/beta_step/#pydantic_graph.beta.step.StepContext) ensure:

-   State access is properly typed
-   Dependencies are correctly typed
-   Input/output types match across edges

```python
from dataclasses import dataclass

from pydantic_graph.beta import GraphBuilder, StepContext


@dataclass
class MyState:
    pass

g = GraphBuilder(state_type=MyState, output_type=str)

# Type checker will catch mismatches
@g.step
async def expects_int(ctx: StepContext[MyState, None, int]) -> str:
    return str(ctx.inputs)

@g.step
async def returns_str(ctx: StepContext[MyState, None, None]) -> str:
    return 'hello'

# This would be a type error - expects_int needs int input, but returns_str outputs str
# g.add(g.edge_from(returns_str).to(expects_int))  # Type error!
```

## Next Steps

-   Learn about [parallel execution](/docs/ai/graph/beta/parallel) with broadcasting and mapping
-   Understand [join nodes](/docs/ai/graph/beta/joins) for aggregating parallel results
-   Explore [conditional branching](/docs/ai/graph/beta/decisions) with decision nodes

---

# [Overview](https://pydantic.dev/docs/ai/graph/graph/)

# Overview

Don't use a nail gun unless you need a nail gun

If Pydantic AI [agents](/docs/ai/core-concepts/agent) are a hammer, and [multi-agent workflows](/docs/ai/guides/multi-agent-applications) are a sledgehammer, then graphs are a nail gun:

-   sure, nail guns look cooler than hammers
-   but nail guns take a lot more setup than hammers
-   and nail guns don't make you a better builder, they make you a builder with a nail gun
-   Lastly, (and at the risk of torturing this metaphor), if you're a fan of medieval tools like mallets and untyped Python, you probably won't like nail guns or our approach to graphs. (But then again, if you're not a fan of type hints in Python, you've probably already bounced off Pydantic AI to use one of the toy agent frameworks -- good luck, and feel free to borrow my sledgehammer when you realize you need it)

In short, graphs are a powerful tool, but they're not the right tool for every job. Please consider other [multi-agent approaches](/docs/ai/guides/multi-agent-applications) before proceeding.

If you're not confident a graph-based approach is a good idea, it might be unnecessary.

Graphs and finite state machines (FSMs) are a powerful abstraction to model, execute, control and visualize complex workflows.

Alongside Pydantic AI, we've developed `pydantic-graph` -- an async graph and state machine library for Python where nodes and edges are defined using type hints.

While this library is developed as part of Pydantic AI; it has no dependency on `pydantic-ai` and can be considered as a pure graph-based state machine library. You may find it useful whether or not you're using Pydantic AI or even building with GenAI.

`pydantic-graph` is designed for advanced users and makes heavy use of Python generics and type hints. It is not designed to be as beginner-friendly as Pydantic AI.

## Installation

`pydantic-graph` is a required dependency of `pydantic-ai`, and an optional dependency of `pydantic-ai-slim`, see [installation instructions](/docs/ai/overview/install#slim-install) for more information. You can also install it directly:

-   [pip](#tab-panel-22)
-   [uv](#tab-panel-23)

Terminal

```bash
pip install pydantic-graph
```

Terminal

```bash
uv add pydantic-graph
```

## Graph Types

`pydantic-graph` is made up of a few key components:

### GraphRunContext

[`GraphRunContext`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.GraphRunContext) -- The context for the graph run, similar to Pydantic AI's [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext). This holds the state of the graph and dependencies and is passed to nodes when they're run.

`GraphRunContext` is generic in the state type of the graph it's used in, [`StateT`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.StateT).

### End

[`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) -- return value to indicate the graph run should end.

`End` is generic in the graph return type of the graph it's used in, [`RunEndT`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.RunEndT).

### Nodes

Subclasses of [`BaseNode`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode) define nodes for execution in the graph.

Nodes, which are generally [`dataclass`es](https://docs.python.org/3/library/dataclasses.html#dataclasses.dataclass), generally consist of:

-   fields containing any parameters required/optional when calling the node
-   the business logic to execute the node, in the [`run`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode.run) method
-   return annotations of the [`run`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode.run) method, which are read by `pydantic-graph` to determine the outgoing edges of the node

Nodes are generic in:

-   **state**, which must have the same type as the state of graphs they're included in, [`StateT`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.StateT) has a default of `None`, so if you're not using state you can omit this generic parameter, see [stateful graphs](#stateful-graphs) for more information
-   **deps**, which must have the same type as the deps of the graph they're included in, [`DepsT`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.DepsT) has a default of `None`, so if you're not using deps you can omit this generic parameter, see [dependency injection](#dependency-injection) for more information
-   **graph return type** -- this only applies if the node returns [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End). [`RunEndT`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.RunEndT) has a default of [Never](https://docs.python.org/3/library/typing.html#typing.Never) so this generic parameter can be omitted if the node doesn't return `End`, but must be included if it does.

Here's an example of a start or intermediate node in a graph -- it can't end the run as it doesn't return [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End):

intermediate\_node.py

```py
from dataclasses import dataclass

from pydantic_graph import BaseNode, GraphRunContext


@dataclass
class MyNode(BaseNode[MyState]):  # (1)
  foo: int  # (2)

  async def run(
      self,
      ctx: GraphRunContext[MyState],  # (3)
  ) -> AnotherNode:  # (4)
      ...
      return AnotherNode()
```

State in this example is `MyState` (not shown), hence `BaseNode` is parameterized with `MyState`. This node can't end the run, so the `RunEndT` generic parameter is omitted and defaults to `Never`.

`MyNode` is a dataclass and has a single field `foo`, an `int`.

The `run` method takes a `GraphRunContext` parameter, again parameterized with state `MyState`.

The return type of the `run` method is `AnotherNode` (not shown), this is used to determine the outgoing edges of the node.

We could extend `MyNode` to optionally end the run if `foo` is divisible by 5:

intermediate\_or\_end\_node.py

```py
from dataclasses import dataclass

from pydantic_graph import BaseNode, End, GraphRunContext


@dataclass
class MyNode(BaseNode[MyState, None, int]):  # (1)
  foo: int

  async def run(
      self,
      ctx: GraphRunContext[MyState],
  ) -> AnotherNode | End[int]:  # (2)
      if self.foo % 5 == 0:
          return End(self.foo)
      else:
          return AnotherNode()
```

We parameterize the node with the return type (`int` in this case) as well as state. Because generic parameters are positional-only, we have to include `None` as the second parameter representing deps.

The return type of the `run` method is now a union of `AnotherNode` and `End[int]`, this allows the node to end the run if `foo` is divisible by 5.

### Graph

[`Graph`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph) -- this is the execution graph itself, made up of a set of [node classes](#nodes) (i.e., `BaseNode` subclasses).

`Graph` is generic in:

-   **state** the state type of the graph, [`StateT`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.StateT)
-   **deps** the deps type of the graph, [`DepsT`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.DepsT)
-   **graph return type** the return type of the graph run, [`RunEndT`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.RunEndT)

Here's an example of a simple graph:

graph\_example.py

```py
from __future__ import annotations

from dataclasses import dataclass

from pydantic_graph import BaseNode, End, Graph, GraphRunContext


@dataclass
class DivisibleBy5(BaseNode[None, None, int]):  # (1)
  foo: int

  async def run(
      self,
      ctx: GraphRunContext,
  ) -> Increment | End[int]:
      if self.foo % 5 == 0:
          return End(self.foo)
      else:
          return Increment(self.foo)


@dataclass
class Increment(BaseNode):  # (2)
  foo: int

  async def run(self, ctx: GraphRunContext) -> DivisibleBy5:
      return DivisibleBy5(self.foo + 1)


fives_graph = Graph(nodes=[DivisibleBy5, Increment])  # (3)
result = fives_graph.run_sync(DivisibleBy5(4))  # (4)
print(result.output)
#> 5
```

The `DivisibleBy5` node is parameterized with `None` for the state param and `None` for the deps param as this graph doesn't use state or deps, and `int` as it can end the run.

The `Increment` node doesn't return `End`, so the `RunEndT` generic parameter is omitted, state can also be omitted as the graph doesn't use state.

The graph is created with a sequence of nodes.

The graph is run synchronously with [`run_sync`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.run_sync). The initial node is `DivisibleBy5(4)`. Because the graph doesn't use external state or deps, we don't pass `state` or `deps`.

_(This example is complete, it can be run "as is")_

A [mermaid diagram](#mermaid-diagrams) for this graph can be generated with the following code:

graph\_example\_diagram.py

```py
from graph_example import DivisibleBy5, fives_graph

fives_graph.mermaid_code(start_node=DivisibleBy5)
```

```mermaid
---
title: fives_graph
---
stateDiagram-v2
  [*] --> DivisibleBy5
  DivisibleBy5 --> Increment
  DivisibleBy5 --> [*]
  Increment --> DivisibleBy5
```

In order to visualize a graph within a `jupyter-notebook`, `IPython.display` needs to be used:

jupyter\_display\_mermaid.py

```python
from graph_example import DivisibleBy5, fives_graph
from IPython.display import Image, display

display(Image(fives_graph.mermaid_image(start_node=DivisibleBy5)))
```

## Stateful Graphs

The "state" concept in `pydantic-graph` provides an optional way to access and mutate an object (often a `dataclass` or Pydantic model) as nodes run in a graph. If you think of Graphs as a production line, then your state is the engine being passed along the line and built up by each node as the graph is run.

`pydantic-graph` provides state persistence, with the state recorded after each node is run. (See [State Persistence](#state-persistence).)

Here's an example of a graph which represents a vending machine where the user may insert coins and select a product to purchase.

vending\_machine.py

```python
from __future__ import annotations

from dataclasses import dataclass

from rich.prompt import Prompt

from pydantic_graph import BaseNode, End, Graph, GraphRunContext


@dataclass
class MachineState:  # (1)
  user_balance: float = 0.0
  product: str | None = None


@dataclass
class InsertCoin(BaseNode[MachineState]):  # (3)
  async def run(self, ctx: GraphRunContext[MachineState]) -> CoinsInserted:  # (16)
      return CoinsInserted(float(Prompt.ask('Insert coins')))  # (4)


@dataclass
class CoinsInserted(BaseNode[MachineState]):
  amount: float  # (5)

  async def run(
      self, ctx: GraphRunContext[MachineState]
  ) -> SelectProduct | Purchase:  # (17)
      ctx.state.user_balance += self.amount  # (6)
      if ctx.state.product is not None:  # (7)
          return Purchase(ctx.state.product)
      else:
          return SelectProduct()


@dataclass
class SelectProduct(BaseNode[MachineState]):
  async def run(self, ctx: GraphRunContext[MachineState]) -> Purchase:
      return Purchase(Prompt.ask('Select product'))


PRODUCT_PRICES = {  # (2)
  'water': 1.25,
  'soda': 1.50,
  'crisps': 1.75,
  'chocolate': 2.00,
}


@dataclass
class Purchase(BaseNode[MachineState, None, None]):  # (18)
  product: str

  async def run(
      self, ctx: GraphRunContext[MachineState]
  ) -> End | InsertCoin | SelectProduct:
      if price := PRODUCT_PRICES.get(self.product):  # (8)
          ctx.state.product = self.product  # (9)
          if ctx.state.user_balance >= price:  # (10)
              ctx.state.user_balance -= price
              return End(None)
          else:
              diff = price - ctx.state.user_balance
              print(f'Not enough money for {self.product}, need {diff:0.2f} more')
              #> Not enough money for crisps, need 0.75 more
              return InsertCoin()  # (11)
      else:
          print(f'No such product: {self.product}, try again')
          return SelectProduct()  # (12)


vending_machine_graph = Graph(  # (13)
  nodes=[InsertCoin, CoinsInserted, SelectProduct, Purchase]
)


async def main():
  state = MachineState()  # (14)
  await vending_machine_graph.run(InsertCoin(), state=state)  # (15)
  print(f'purchase successful item={state.product} change={state.user_balance:0.2f}')
  #> purchase successful item=crisps change=0.25
```

The state of the vending machine is defined as a dataclass with the user's balance and the product they've selected, if any.

A dictionary of products mapped to prices.

The `InsertCoin` node, [`BaseNode`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode) is parameterized with `MachineState` as that's the state used in this graph.

The `InsertCoin` node prompts the user to insert coins. We keep things simple by just entering a monetary amount as a float. Before you start thinking this is a toy too since it's using [rich's `Prompt.ask`](https://rich.readthedocs.io/en/stable/reference/prompt.html#rich.prompt.PromptBase.ask) within nodes, see [below](#example-human-in-the-loop) for how control flow can be managed when nodes require external input.

The `CoinsInserted` node; again this is a [`dataclass`](https://docs.python.org/3/library/dataclasses.html#dataclasses.dataclass) with one field `amount`.

Update the user's balance with the amount inserted.

If the user has already selected a product, go to `Purchase`, otherwise go to `SelectProduct`.

In the `Purchase` node, look up the price of the product if the user entered a valid product.

If the user did enter a valid product, set the product in the state so we don't revisit `SelectProduct`.

If the balance is enough to purchase the product, adjust the balance to reflect the purchase and return [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) to end the graph. We're not using the run return type, so we call `End` with `None`.

If the balance is insufficient, go to `InsertCoin` to prompt the user to insert more coins.

If the product is invalid, go to `SelectProduct` to prompt the user to select a product again.

The graph is created by passing a list of nodes to [`Graph`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph). Order of nodes is not important, but it can affect how [diagrams](#mermaid-diagrams) are displayed.

Initialize the state. This will be passed to the graph run and mutated as the graph runs.

Run the graph with the initial state. Since the graph can be run from any node, we must pass the start node -- in this case, `InsertCoin`. [`Graph.run`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.run) returns a [`GraphRunResult`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.GraphRunResult) that provides the final data and a history of the run.

The return type of the node's [`run`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode.run) method is important as it is used to determine the outgoing edges of the node. This information in turn is used to render [mermaid diagrams](#mermaid-diagrams) and is enforced at runtime to detect misbehavior as soon as possible.

The return type of `CoinsInserted`'s [`run`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode.run) method is a union, meaning multiple outgoing edges are possible.

Unlike other nodes, `Purchase` can end the run, so the [`RunEndT`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.RunEndT) generic parameter must be set. In this case it's `None` since the graph run return type is `None`.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

A [mermaid diagram](#mermaid-diagrams) for this graph can be generated with the following code:

vending\_machine\_diagram.py

```py
from vending_machine import InsertCoin, vending_machine_graph

vending_machine_graph.mermaid_code(start_node=InsertCoin)
```

The diagram generated by the above code is:

```mermaid
---
title: vending_machine_graph
---
stateDiagram-v2
  [*] --> InsertCoin
  InsertCoin --> CoinsInserted
  CoinsInserted --> SelectProduct
  CoinsInserted --> Purchase
  SelectProduct --> Purchase
  Purchase --> InsertCoin
  Purchase --> SelectProduct
  Purchase --> [*]
```

See [below](#mermaid-diagrams) for more information on generating diagrams.

## GenAI Example

So far we haven't shown an example of a Graph that actually uses Pydantic AI or GenAI at all.

In this example, one agent generates a welcome email to a user and the other agent provides feedback on the email.

This graph has a very simple structure:

```mermaid
---
title: feedback_graph
---
stateDiagram-v2
  [*] --> WriteEmail
  WriteEmail --> Feedback
  Feedback --> WriteEmail
  Feedback --> [*]
```

genai\_email\_feedback.py

```python
from __future__ import annotations as _annotations

from dataclasses import dataclass, field

from pydantic import BaseModel, EmailStr

from pydantic_ai import Agent, ModelMessage, format_as_xml
from pydantic_graph import BaseNode, End, Graph, GraphRunContext


@dataclass
class User:
    name: str
    email: EmailStr
    interests: list[str]


@dataclass
class Email:
    subject: str
    body: str


@dataclass
class State:
    user: User
    write_agent_messages: list[ModelMessage] = field(default_factory=list)


email_writer_agent = Agent(
    'google-gla:gemini-3-pro-preview',
    output_type=Email,
    instructions='Write a welcome email to our tech blog.',
)


@dataclass
class WriteEmail(BaseNode[State]):
    email_feedback: str | None = None

    async def run(self, ctx: GraphRunContext[State]) -> Feedback:
        if self.email_feedback:
            prompt = (
                f'Rewrite the email for the user:\n'
                f'{format_as_xml(ctx.state.user)}\n'
                f'Feedback: {self.email_feedback}'
            )
        else:
            prompt = (
                f'Write a welcome email for the user:\n'
                f'{format_as_xml(ctx.state.user)}'
            )

        result = await email_writer_agent.run(
            prompt,
            message_history=ctx.state.write_agent_messages,
        )
        ctx.state.write_agent_messages += result.new_messages()
        return Feedback(result.output)


class EmailRequiresWrite(BaseModel):
    feedback: str


class EmailOk(BaseModel):
    pass


feedback_agent = Agent[None, EmailRequiresWrite | EmailOk](
    'openai:gpt-5.2',
    output_type=EmailRequiresWrite | EmailOk,  # type: ignore
    instructions=(
        'Review the email and provide feedback, email must reference the users specific interests.'
    ),
)


@dataclass
class Feedback(BaseNode[State, None, Email]):
    email: Email

    async def run(
        self,
        ctx: GraphRunContext[State],
    ) -> WriteEmail | End[Email]:
        prompt = format_as_xml({'user': ctx.state.user, 'email': self.email})
        result = await feedback_agent.run(prompt)
        if isinstance(result.output, EmailRequiresWrite):
            return WriteEmail(email_feedback=result.output.feedback)
        else:
            return End(self.email)


async def main():
    user = User(
        name='John Doe',
        email='john.joe@example.com',
        interests=['Haskel', 'Lisp', 'Fortran'],
    )
    state = State(user)
    feedback_graph = Graph(nodes=(WriteEmail, Feedback))
    result = await feedback_graph.run(WriteEmail(), state=state)
    print(result.output)
    """
    Email(
        subject='Welcome to our tech blog!',
        body='Hello John, Welcome to our tech blog! ...',
    )
    """
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

## Iterating Over a Graph

### Using `Graph.iter` for `async for` iteration

Sometimes you want direct control or insight into each node as the graph executes. The easiest way to do that is with the [`Graph.iter`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.iter) method, which returns a **context manager** that yields a [`GraphRun`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.GraphRun) object. The `GraphRun` is an async-iterable over the nodes of your graph, allowing you to record or modify them as they execute.

Here's an example:

count\_down.py

```python
from __future__ import annotations as _annotations

from dataclasses import dataclass
from pydantic_graph import Graph, BaseNode, End, GraphRunContext


@dataclass
class CountDownState:
  counter: int


@dataclass
class CountDown(BaseNode[CountDownState, None, int]):
  async def run(self, ctx: GraphRunContext[CountDownState]) -> CountDown | End[int]:
      if ctx.state.counter <= 0:
          return End(ctx.state.counter)
      ctx.state.counter -= 1
      return CountDown()


count_down_graph = Graph(nodes=[CountDown])


async def main():
  state = CountDownState(counter=3)
  async with count_down_graph.iter(CountDown(), state=state) as run:  # (1)
      async for node in run:  # (2)
          print('Node:', node)
          #> Node: CountDown()
          #> Node: CountDown()
          #> Node: CountDown()
          #> Node: CountDown()
          #> Node: End(data=0)
  print('Final output:', run.result.output)  # (3)
  #> Final output: 0
```

`Graph.iter(...)` returns a [`GraphRun`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.GraphRun).

Here, we step through each node as it is executed.

Once the graph returns an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End), the loop ends, and `run.result` becomes a [`GraphRunResult`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.GraphRunResult) containing the final outcome (`0` here).

### Using `GraphRun.next(node)` manually

Alternatively, you can drive iteration manually with the [`GraphRun.next`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.GraphRun.next) method, which allows you to pass in whichever node you want to run next. You can modify or selectively skip nodes this way.

Below is a contrived example that stops whenever the counter is at 2, ignoring any node runs beyond that:

count\_down\_next.py

```python
from pydantic_graph import End, FullStatePersistence
from count_down import CountDown, CountDownState, count_down_graph


async def main():
  state = CountDownState(counter=5)
  persistence = FullStatePersistence()  # (7)
  async with count_down_graph.iter(
      CountDown(), state=state, persistence=persistence
  ) as run:
      node = run.next_node  # (1)
      while not isinstance(node, End):  # (2)
          print('Node:', node)
          #> Node: CountDown()
          #> Node: CountDown()
          #> Node: CountDown()
          #> Node: CountDown()
          if state.counter == 2:
              break  # (3)
          node = await run.next(node)  # (4)

      print(run.result)  # (5)
      #> None

      for step in persistence.history:  # (6)
          print('History Step:', step.state, step.state)
          #> History Step: CountDownState(counter=5) CountDownState(counter=5)
          #> History Step: CountDownState(counter=4) CountDownState(counter=4)
          #> History Step: CountDownState(counter=3) CountDownState(counter=3)
          #> History Step: CountDownState(counter=2) CountDownState(counter=2)
```

We start by grabbing the first node that will be run in the agent's graph.

The agent run is finished once an `End` node has been produced; instances of `End` cannot be passed to `next`.

If the user decides to stop early, we break out of the loop. The graph run won't have a real final result in that case (`run.result` remains `None`).

At each step, we call `await run.next(node)` to run it and get the next node (or an `End`).

Because we did not continue the run until it finished, the `result` is not set.

The run's history is still populated with the steps we executed so far.

Use [`FullStatePersistence`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.in_mem.FullStatePersistence) so we can show the history of the run, see [State Persistence](#state-persistence) below for more information.

## State Persistence

One of the biggest benefits of finite state machine (FSM) graphs is how they simplify the handling of interrupted execution. This might happen for a variety of reasons:

-   the state machine logic might fundamentally need to be paused -- e.g. the returns workflow for an e-commerce order needs to wait for the item to be posted to the returns center or because execution of the next node needs input from a user so needs to wait for a new http request,
-   the execution takes so long that the entire graph can't reliably be executed in a single continuous run -- e.g. a deep research agent that might take hours to run,
-   you want to run multiple graph nodes in parallel in different processes / hardware instances (note: parallel node execution is not yet supported in `pydantic-graph`, see [#704](https://github.com/pydantic/pydantic-ai/issues/704)).

Trying to make a conventional control flow (i.e., boolean logic and nested function calls) implementation compatible with these usage scenarios generally results in brittle and over-complicated spaghetti code, with the logic required to interrupt and resume execution dominating the implementation.

To allow graph runs to be interrupted and resumed, `pydantic-graph` provides state persistence -- a system for snapshotting the state of a graph run before and after each node is run, allowing a graph run to be resumed from any point in the graph.

`pydantic-graph` includes three state persistence implementations:

-   [`SimpleStatePersistence`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.in_mem.SimpleStatePersistence) -- Simple in memory state persistence that just hold the latest snapshot. If no state persistence implementation is provided when running a graph, this is used by default.
-   [`FullStatePersistence`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.in_mem.FullStatePersistence) -- In memory state persistence that hold a list of snapshots.
-   [`FileStatePersistence`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.file.FileStatePersistence) -- File-based state persistence that saves snapshots to a JSON file.

In production applications, developers should implement their own state persistence by subclassing [`BaseStatePersistence`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.BaseStatePersistence) abstract base class, which might persist runs in a relational database like PostgresQL.

At a high level the role of `StatePersistence` implementations is to store and retrieve [`NodeSnapshot`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.NodeSnapshot) and [`EndSnapshot`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.EndSnapshot) objects.

[`graph.iter_from_persistence()`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.iter_from_persistence) may be used to run the graph based on the state stored in persistence.

We can run the `count_down_graph` from [above](#iterating-over-a-graph), using [`graph.iter_from_persistence()`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.iter_from_persistence) and [`FileStatePersistence`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.file.FileStatePersistence).

As you can see in this code, `run_node` requires no external application state (apart from state persistence) to be run, meaning graphs can easily be executed by distributed execution and queueing systems.

count\_down\_from\_persistence.py

```python
from pathlib import Path

from pydantic_graph import End
from pydantic_graph.persistence.file import FileStatePersistence

from count_down import CountDown, CountDownState, count_down_graph


async def main():
  run_id = 'run_abc123'
  persistence = FileStatePersistence(Path(f'count_down_{run_id}.json'))  # (1)
  state = CountDownState(counter=5)
  await count_down_graph.initialize(  # (2)
      CountDown(), state=state, persistence=persistence
  )

  done = False
  while not done:
      done = await run_node(run_id)


async def run_node(run_id: str) -> bool:  # (3)
  persistence = FileStatePersistence(Path(f'count_down_{run_id}.json'))
  async with count_down_graph.iter_from_persistence(persistence) as run:  # (4)
      node_or_end = await run.next()  # (5)

  print('Node:', node_or_end)
  #> Node: CountDown()
  #> Node: CountDown()
  #> Node: CountDown()
  #> Node: CountDown()
  #> Node: CountDown()
  #> Node: End(data=0)
  return isinstance(node_or_end, End)  # (6)
```

Create a [`FileStatePersistence`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.file.FileStatePersistence) to use to start the graph.

Call [`graph.initialize()`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.initialize) to set the initial graph state in the persistence object.

`run_node` is a pure function that doesn't need access to any other process state to run the next node of the graph, except the ID of the run.

Call [`graph.iter_from_persistence()`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.iter_from_persistence) create a [`GraphRun`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.GraphRun) object that will run the next node of the graph from the state stored in persistence. This will return either a node or an `End` object.

[`graph.run()`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.run) will return either a [node](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode) or an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) object.

Check if the node is an [`End`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.End) object, if it is, the graph run is complete.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

### Example: Human in the loop.

As noted above, state persistence allows graphs to be interrupted and resumed. One use case of this is to allow user input to continue.

In this example, an AI asks the user a question, the user provides an answer, the AI evaluates the answer and ends if the user got it right or asks another question if they got it wrong.

Instead of running the entire graph in a single process invocation, we run the graph by running the process repeatedly, optionally providing an answer to the question as a command line argument.

`ai_q_and_a_graph.py` -- `question_graph` definition

ai\_q\_and\_a\_graph.py

```python
from __future__ import annotations as _annotations

from typing import Annotated
from pydantic_graph import Edge
from dataclasses import dataclass, field
from pydantic import BaseModel
from pydantic_graph import (
    BaseNode,
    End,
    Graph,
    GraphRunContext,
)
from pydantic_ai import Agent, format_as_xml
from pydantic_ai import ModelMessage

ask_agent = Agent('openai:gpt-5.2', output_type=str, instrument=True)


@dataclass
class QuestionState:
    question: str | None = None
    ask_agent_messages: list[ModelMessage] = field(default_factory=list)
    evaluate_agent_messages: list[ModelMessage] = field(default_factory=list)


@dataclass
class Ask(BaseNode[QuestionState]):
    """Generate question using GPT-5."""
    docstring_notes = True
    async def run(
        self, ctx: GraphRunContext[QuestionState]
    ) -> Annotated[Answer, Edge(label='Ask the question')]:
        result = await ask_agent.run(
            'Ask a simple question with a single correct answer.',
            message_history=ctx.state.ask_agent_messages,
        )
        ctx.state.ask_agent_messages += result.new_messages()
        ctx.state.question = result.output
        return Answer(result.output)


@dataclass
class Answer(BaseNode[QuestionState]):
    question: str

    async def run(self, ctx: GraphRunContext[QuestionState]) -> Evaluate:
        answer = input(f'{self.question}: ')
        return Evaluate(answer)


class EvaluationResult(BaseModel, use_attribute_docstrings=True):
    correct: bool
    """Whether the answer is correct."""
    comment: str
    """Comment on the answer, reprimand the user if the answer is wrong."""


evaluate_agent = Agent(
    'openai:gpt-5.2',
    output_type=EvaluationResult,
    instructions='Given a question and answer, evaluate if the answer is correct.',
)


@dataclass
class Evaluate(BaseNode[QuestionState, None, str]):
    answer: str

    async def run(
        self,
        ctx: GraphRunContext[QuestionState],
    ) -> Annotated[End[str], Edge(label='success')] | Reprimand:
        assert ctx.state.question is not None
        result = await evaluate_agent.run(
            format_as_xml({'question': ctx.state.question, 'answer': self.answer}),
            message_history=ctx.state.evaluate_agent_messages,
        )
        ctx.state.evaluate_agent_messages += result.new_messages()
        if result.output.correct:
            return End(result.output.comment)
        else:
            return Reprimand(result.output.comment)


@dataclass
class Reprimand(BaseNode[QuestionState]):
    comment: str

    async def run(self, ctx: GraphRunContext[QuestionState]) -> Ask:
        print(f'Comment: {self.comment}')
        ctx.state.question = None
        return Ask()


question_graph = Graph(
    nodes=(Ask, Answer, Evaluate, Reprimand), state_type=QuestionState
)
```

_(This example is complete, it can be run "as is")_

ai\_q\_and\_a\_run.py

```python
import sys
from pathlib import Path

from pydantic_graph import End
from pydantic_graph.persistence.file import FileStatePersistence
from pydantic_ai import ModelMessage  # noqa: F401

from ai_q_and_a_graph import Ask, question_graph, Evaluate, QuestionState, Answer


async def main():
  answer: str | None = sys.argv[1] if len(sys.argv) > 1 else None  # (1)
  persistence = FileStatePersistence(Path('question_graph.json'))  # (2)
  persistence.set_graph_types(question_graph)  # (3)

  if snapshot := await persistence.load_next():  # (4)
      state = snapshot.state
      assert answer is not None
      node = Evaluate(answer)
  else:
      state = QuestionState()
      node = Ask()  # (5)

  async with question_graph.iter(node, state=state, persistence=persistence) as run:
      while True:
          node = await run.next()  # (6)
          if isinstance(node, End):  # (7)
              print('END:', node.data)
              history = await persistence.load_all()  # (8)
              print([e.node for e in history])
              break
          elif isinstance(node, Answer):  # (9)
              print(node.question)
              #> What is the capital of France?
              break
          # otherwise just continue
```

Get the user's answer from the command line, if provided. See [question graph example](/docs/ai/examples/question-graph) for a complete example.

Create a state persistence instance the `'question_graph.json'` file may or may not already exist.

Since we're using the [persistence interface](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.BaseStatePersistence) outside a graph, we need to call [`set_graph_types`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.BaseStatePersistence.set_graph_types) to set the graph generic types `StateT` and `RunEndT` for the persistence instance. This is necessary to allow the persistence instance to know how to serialize and deserialize graph nodes.

If we're run the graph before, [`load_next`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.BaseStatePersistence.load_next) will return a snapshot of the next node to run, here we use `state` from that snapshot, and create a new `Evaluate` node with the answer provided on the command line.

If the graph hasn't been run before, we create a new `QuestionState` and start with the `Ask` node.

Call [`GraphRun.next()`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.GraphRun.next) to run the node. This will return either a node or an `End` object.

If the node is an `End` object, the graph run is complete. The `data` field of the `End` object contains the comment returned by the `evaluate_agent` about the correct answer.

To demonstrate the state persistence, we call [`load_all`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.BaseStatePersistence.load_all) to get all the snapshots from the persistence instance. This will return a list of [`Snapshot`](/docs/ai/api/pydantic_graph/persistence/#pydantic_graph.persistence.Snapshot) objects.

If the node is an `Answer` object, we print the question and break out of the loop to end the process and wait for user input.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

For a complete example of this graph, see the [question graph example](/docs/ai/examples/question-graph).

## Dependency Injection

As with Pydantic AI, `pydantic-graph` supports dependency injection via a generic parameter on [`Graph`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph) and [`BaseNode`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode), and the [`GraphRunContext.deps`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.GraphRunContext.deps) field.

As an example of dependency injection, let's modify the `DivisibleBy5` example [above](#graph) to use a [`ProcessPoolExecutor`](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ProcessPoolExecutor) to run the compute load in a separate process (this is a contrived example, `ProcessPoolExecutor` wouldn't actually improve performance in this example):

deps\_example.py

```py
from __future__ import annotations

import asyncio
from concurrent.futures import ProcessPoolExecutor
from dataclasses import dataclass

from pydantic_graph import BaseNode, End, FullStatePersistence, Graph, GraphRunContext


@dataclass
class GraphDeps:
    executor: ProcessPoolExecutor


@dataclass
class DivisibleBy5(BaseNode[None, GraphDeps, int]):
    foo: int

    async def run(
        self,
        ctx: GraphRunContext[None, GraphDeps],
    ) -> Increment | End[int]:
        if self.foo % 5 == 0:
            return End(self.foo)
        else:
            return Increment(self.foo)


@dataclass
class Increment(BaseNode[None, GraphDeps]):
    foo: int

    async def run(self, ctx: GraphRunContext[None, GraphDeps]) -> DivisibleBy5:
        loop = asyncio.get_running_loop()
        compute_result = await loop.run_in_executor(
            ctx.deps.executor,
            self.compute,
        )
        return DivisibleBy5(compute_result)

    def compute(self) -> int:
        return self.foo + 1


fives_graph = Graph(nodes=[DivisibleBy5, Increment])


async def main():
    with ProcessPoolExecutor() as executor:
        deps = GraphDeps(executor)
        result = await fives_graph.run(DivisibleBy5(3), deps=deps, persistence=FullStatePersistence())
    print(result.output)
    #> 5
    # the full history is quite verbose (see below), so we'll just print the summary
    print([item.node for item in result.persistence.history])
    """
    [
        DivisibleBy5(foo=3),
        Increment(foo=3),
        DivisibleBy5(foo=4),
        Increment(foo=4),
        DivisibleBy5(foo=5),
        End(data=5),
    ]
    """
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

## Mermaid Diagrams

Pydantic Graph can generate [mermaid](https://mermaid.js.org/) [`stateDiagram-v2`](https://mermaid.js.org/syntax/stateDiagram.html) diagrams for graphs, as shown above.

These diagrams can be generated with:

-   [`Graph.mermaid_code`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.mermaid_code) to generate the mermaid code for a graph
-   [`Graph.mermaid_image`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.mermaid_image) to generate an image of the graph using [mermaid.ink](https://mermaid.ink/)
-   [`Graph.mermaid_save`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.mermaid_save) to generate an image of the graph using [mermaid.ink](https://mermaid.ink/) and save it to a file

Beyond the diagrams shown above, you can also customize mermaid diagrams with the following options:

-   [`Edge`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.Edge) allows you to apply a label to an edge
-   [`BaseNode.docstring_notes`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode.docstring_notes) and [`BaseNode.get_note`](/docs/ai/api/pydantic_graph/nodes/#pydantic_graph.nodes.BaseNode.get_note) allows you to add notes to nodes
-   The [`highlighted_nodes`](/docs/ai/api/pydantic_graph/graph/#pydantic_graph.graph.Graph.mermaid_code) parameter allows you to highlight specific node(s) in the diagram

Putting that together, we can edit the last [`ai_q_and_a_graph.py`](#example-human-in-the-loop) example to:

-   add labels to some edges
-   add a note to the `Ask` node
-   highlight the `Answer` node
-   save the diagram as a `PNG` image to file

ai\_q\_and\_a\_graph\_extra.py

```python
from typing import Annotated

from pydantic_graph import BaseNode, End, Graph, GraphRunContext, Edge

ask_agent = Agent('openai:gpt-5.2', output_type=str, instrument=True)


@dataclass
class QuestionState:
    question: str | None = None
    ask_agent_messages: list[ModelMessage] = field(default_factory=list)
    evaluate_agent_messages: list[ModelMessage] = field(default_factory=list)


@dataclass
class Ask(BaseNode[QuestionState]):
    """Generate question using GPT-5."""
    docstring_notes = True
    async def run(
        self, ctx: GraphRunContext[QuestionState]
    ) -> Annotated[Answer, Edge(label='Ask the question')]:
        result = await ask_agent.run(
            'Ask a simple question with a single correct answer.',
            message_history=ctx.state.ask_agent_messages,
        )
        ctx.state.ask_agent_messages += result.new_messages()
        ctx.state.question = result.output
        return Answer(result.output)


@dataclass
class Answer(BaseNode[QuestionState]):
    question: str

    async def run(self, ctx: GraphRunContext[QuestionState]) -> Evaluate:
        answer = input(f'{self.question}: ')
        return Evaluate(answer)


class EvaluationResult(BaseModel, use_attribute_docstrings=True):
    correct: bool
    """Whether the answer is correct."""
    comment: str
    """Comment on the answer, reprimand the user if the answer is wrong."""


evaluate_agent = Agent(
    'openai:gpt-5.2',
    output_type=EvaluationResult,
    instructions='Given a question and answer, evaluate if the answer is correct.',
)


@dataclass
class Evaluate(BaseNode[QuestionState, None, str]):
    answer: str

    async def run(
        self,
        ctx: GraphRunContext[QuestionState],
    ) -> Annotated[End[str], Edge(label='success')] | Reprimand:
        assert ctx.state.question is not None
        result = await evaluate_agent.run(
            format_as_xml({'question': ctx.state.question, 'answer': self.answer}),
            message_history=ctx.state.evaluate_agent_messages,
        )
        ctx.state.evaluate_agent_messages += result.new_messages()
        if result.output.correct:
            return End(result.output.comment)
        else:
            return Reprimand(result.output.comment)


@dataclass
class Reprimand(BaseNode[QuestionState]):
    comment: str

    async def run(self, ctx: GraphRunContext[QuestionState]) -> Ask:
        print(f'Comment: {self.comment}')
        ctx.state.question = None
        return Ask()


question_graph = Graph(
    nodes=(Ask, Answer, Evaluate, Reprimand), state_type=QuestionState
)
```

_(This example is not complete and cannot be run directly)_

This would generate an image that looks like this:

```mermaid
---
title: question_graph
---
stateDiagram-v2
  Ask --> Answer: Ask the question
  note right of Ask
    Judge the answer.
    Decide on next step.
  end note
  Answer --> Evaluate
  Evaluate --> Reprimand
  Evaluate --> [*]: success
  Reprimand --> Ask

classDef highlighted fill:#fdff32
class Answer highlighted
```

### Setting Direction of the State Diagram

You can specify the direction of the state diagram using one of the following values:

-   `'TB'`: Top to bottom, the diagram flows vertically from top to bottom.
-   `'LR'`: Left to right, the diagram flows horizontally from left to right.
-   `'RL'`: Right to left, the diagram flows horizontally from right to left.
-   `'BT'`: Bottom to top, the diagram flows vertically from bottom to top.

Here is an example of how to do this using 'Left to Right' (LR) instead of the default 'Top to Bottom' (TB):

vending\_machine\_diagram.py

```py
from vending_machine import InsertCoin, vending_machine_graph

vending_machine_graph.mermaid_code(start_node=InsertCoin, direction='LR')
```

```mermaid
---
title: vending_machine_graph
---
stateDiagram-v2
  direction LR
  [*] --> InsertCoin
  InsertCoin --> CoinsInserted
  CoinsInserted --> SelectProduct
  CoinsInserted --> Purchase
  SelectProduct --> Purchase
  Purchase --> InsertCoin
  Purchase --> SelectProduct
  Purchase --> [*]
```

---

# [Embeddings](https://pydantic.dev/docs/ai/guides/embeddings/)

# Embeddings

Embeddings are vector representations of text that capture semantic meaning. They're essential for building:

-   **Semantic search** -- Find documents based on meaning, not just keyword matching
-   **RAG (Retrieval-Augmented Generation)** -- Retrieve relevant context for your AI agents
-   **Similarity detection** -- Find similar documents, detect duplicates, or cluster content
-   **Classification** -- Use embeddings as features for downstream ML models

Pydantic AI provides a unified interface for generating embeddings across multiple providers.

## Quick Start

The [`Embedder`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder) class is the high-level interface for generating embeddings:

embeddings\_quickstart.py

```python
from pydantic_ai import Embedder

embedder = Embedder('openai:text-embedding-3-small')


async def main():
    # Embed a search query
    result = await embedder.embed_query('What is machine learning?')
    print(f'Embedding dimensions: {len(result.embeddings[0])}')
    #> Embedding dimensions: 1536

    # Embed multiple documents at once
    docs = [
        'Machine learning is a subset of AI.',
        'Deep learning uses neural networks.',
        'Python is a programming language.',
    ]
    result = await embedder.embed_documents(docs)
    print(f'Embedded {len(result.embeddings)} documents')
    #> Embedded 3 documents
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

Queries vs Documents

Some embedding models optimize differently for queries and documents. Use [`embed_query()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.embed_query) for search queries and [`embed_documents()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.embed_documents) for content you're indexing.

## Embedding Result

All embed methods return an [`EmbeddingResult`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingResult) containing the embeddings along with useful metadata.

For convenience, you can access embeddings either by index (`result[0]`) or by the original input text (`result['Hello world']`).

embedding\_result.py

```python
from pydantic_ai import Embedder

embedder = Embedder('openai:text-embedding-3-small')


async def main():
    result = await embedder.embed_query('Hello world')

    # Access embeddings - each is a sequence of floats
    embedding = result.embeddings[0]  # By index via .embeddings
    embedding = result[0]  # Or directly via __getitem__
    embedding = result['Hello world']  # Or by original input text
    print(f'Dimensions: {len(embedding)}')
    #> Dimensions: 1536

    # Check usage
    print(f'Tokens used: {result.usage.input_tokens}')
    #> Tokens used: 2

    # Calculate cost (requires `genai-prices` to have pricing data for the model)
    cost = result.cost()
    print(f'Cost: ${cost.total_price:.6f}')
    #> Cost: $0.000000
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

## Choosing a model

The best embedding model depends on your constraints. Here's a starting-point cheat sheet; consult each provider's docs and the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard) before committing to a model for a large index.

If you want...

For example

A managed API

`openai:text-embedding-3-small` (cheap default), `openai:text-embedding-3-large`, `voyageai:voyage-3.5`, or `cohere:embed-v4.0`

No API key, private, free

`sentence-transformers:google/embeddinggemma-300m`, `sentence-transformers:lightonai/DenseOn`, `sentence-transformers:Qwen/Qwen3-Embedding-0.6B`, or any other [Hugging Face model](https://huggingface.co/models?library=sentence-transformers)

Multilingual

`cohere:embed-multilingual-v3.0`, `sentence-transformers:jinaai/jina-embeddings-v5-text-small-retrieval`, or `sentence-transformers:Snowflake/snowflake-arctic-embed-l-v2.0`

Specialized domain

`voyageai:voyage-code-3`, `voyageai:voyage-law-2`, `voyageai:voyage-finance-2`, `sentence-transformers:nomic-ai/CodeRankEmbed`, or `sentence-transformers:TechWolf/JobBERT-v3`

To run on AWS infra you already have

`bedrock:amazon.titan-embed-text-v2:0` or `bedrock:cohere.embed-v4:0`

To reduce index size

Any model with dimension control (see [Settings](#settings))

Switching models later

Swapping a model changes the output dimension and the similarity distribution, so you'll need to re-embed (and re-index) your documents. Pick a model you're happy to stick with, or one that supports [dimension control](#settings) so you can tune the index size without changing models.

## Providers

### OpenAI

[`OpenAIEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.openai.OpenAIEmbeddingModel) works with OpenAI's embeddings API and any [OpenAI-compatible provider](/docs/ai/models/openai#openai-compatible-models).

#### Install

To use OpenAI embedding models, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `openai` optional group:

-   [pip](#tab-panel-26)
-   [uv](#tab-panel-27)

Terminal

```bash
pip install "pydantic-ai-slim[openai]"
```

Terminal

```bash
uv add "pydantic-ai-slim[openai]"
```

#### Configuration

To use `OpenAIEmbeddingModel` with the OpenAI API, go to [platform.openai.com](https://platform.openai.com/) and follow your nose until you find the place to generate an API key. Once you have the API key, you can set it as an environment variable:

Terminal

```bash
export OPENAI_API_KEY='your-api-key'
```

You can then use the model:

openai\_embeddings.py

```python
from pydantic_ai import Embedder

embedder = Embedder('openai:text-embedding-3-small')


async def main():
    result = await embedder.embed_query('Hello world')
    print(len(result.embeddings[0]))
    #> 1536
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

See [OpenAI's embedding models](https://platform.openai.com/docs/guides/embeddings) for available models.

#### Dimension Control

OpenAI's `text-embedding-3-*` models support dimension reduction via the `dimensions` setting:

openai\_dimensions.py

```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings import EmbeddingSettings

embedder = Embedder(
    'openai:text-embedding-3-small',
    settings=EmbeddingSettings(dimensions=256),
)


async def main():
    result = await embedder.embed_query('Hello world')
    print(len(result.embeddings[0]))
    #> 256
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

#### OpenAI-Compatible Providers

Since [`OpenAIEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.openai.OpenAIEmbeddingModel) uses the same provider system as [`OpenAIChatModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIChatModel), you can use it with any [OpenAI-compatible provider](/docs/ai/models/openai#openai-compatible-models):

openai\_compatible\_embeddings.py

```python
# Using Azure OpenAI
from openai import AsyncAzureOpenAI

from pydantic_ai import Embedder
from pydantic_ai.embeddings.openai import OpenAIEmbeddingModel
from pydantic_ai.providers.openai import OpenAIProvider

azure_client = AsyncAzureOpenAI(
    azure_endpoint='https://your-resource.openai.azure.com',
    api_version='2024-02-01',
    api_key='your-azure-key',
)
model = OpenAIEmbeddingModel(
    'text-embedding-3-small',
    provider=OpenAIProvider(openai_client=azure_client),
)
embedder = Embedder(model)


# Using any OpenAI-compatible API
model = OpenAIEmbeddingModel(
    'your-model-name',
    provider=OpenAIProvider(
        base_url='https://your-provider.com/v1',
        api_key='your-api-key',
    ),
)
embedder = Embedder(model)
```

For providers with dedicated provider classes (like [`OllamaProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.ollama.OllamaProvider) or [`AzureProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.azure.AzureProvider)), you can use the shorthand syntax:

```python
from pydantic_ai import Embedder

embedder = Embedder('azure:text-embedding-3-small')
embedder = Embedder('ollama:nomic-embed-text')
```

See [OpenAI-compatible Models](/docs/ai/models/openai#openai-compatible-models) for the full list of supported providers.

### Google

[`GoogleEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.google.GoogleEmbeddingModel) works with Google's embedding models via the Gemini API (Google AI Studio) or Vertex AI.

#### Install

To use Google embedding models, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `google` optional group:

-   [pip](#tab-panel-28)
-   [uv](#tab-panel-29)

Terminal

```bash
pip install "pydantic-ai-slim[google]"
```

Terminal

```bash
uv add "pydantic-ai-slim[google]"
```

#### Configuration

To use `GoogleEmbeddingModel` with the Gemini API, go to [aistudio.google.com](https://aistudio.google.com/) and generate an API key. Once you have the API key, you can set it as an environment variable:

Terminal

```bash
export GOOGLE_API_KEY='your-api-key'
```

You can then use the model:

google\_embeddings.py

```python
from pydantic_ai import Embedder

embedder = Embedder('google-gla:gemini-embedding-001')


async def main():
    result = await embedder.embed_query('Hello world')
    print(len(result.embeddings[0]))
    #> 3072
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

See the [Google Embeddings documentation](https://ai.google.dev/gemini-api/docs/embeddings) for available models.

##### Vertex AI

To use Google's embedding models via Vertex AI instead of the Gemini API, use the `google-vertex` provider prefix:

google\_vertex\_embeddings.py

```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings.google import GoogleEmbeddingModel
from pydantic_ai.providers.google import GoogleProvider

# Using provider prefix
embedder = Embedder('google-vertex:gemini-embedding-001')

# Or with explicit provider configuration
model = GoogleEmbeddingModel(
    'gemini-embedding-001',
    provider=GoogleProvider(vertexai=True, project='my-project', location='us-central1'),
)
embedder = Embedder(model)
```

See the [Google provider documentation](/docs/ai/models/google#vertex-ai-enterprisecloud) for more details on Vertex AI authentication options, including application default credentials, service accounts, and API keys.

#### Dimension Control

Google's embedding models support dimension reduction via the `dimensions` setting:

google\_dimensions.py

```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings import EmbeddingSettings

embedder = Embedder(
    'google-gla:gemini-embedding-001',
    settings=EmbeddingSettings(dimensions=768),
)


async def main():
    result = await embedder.embed_query('Hello world')
    print(len(result.embeddings[0]))
    #> 768
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

#### Google-Specific Settings

Google models support additional settings via [`GoogleEmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.google.GoogleEmbeddingSettings):

google\_settings.py

```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings.google import GoogleEmbeddingSettings

embedder = Embedder(
    'google-gla:gemini-embedding-001',
    settings=GoogleEmbeddingSettings(
        dimensions=768,
        google_task_type='SEMANTIC_SIMILARITY',  # Optimize for similarity comparison
    ),
)
```

See [Google's task type documentation](https://ai.google.dev/gemini-api/docs/embeddings#task-types) for available task types. By default, `embed_query()` uses `RETRIEVAL_QUERY` and `embed_documents()` uses `RETRIEVAL_DOCUMENT`.

### Cohere

[`CohereEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.cohere.CohereEmbeddingModel) provides access to Cohere's embedding models, which offer multilingual support and various model sizes.

#### Install

To use Cohere embedding models, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `cohere` optional group:

-   [pip](#tab-panel-30)
-   [uv](#tab-panel-31)

Terminal

```bash
pip install "pydantic-ai-slim[cohere]"
```

Terminal

```bash
uv add "pydantic-ai-slim[cohere]"
```

#### Configuration

To use `CohereEmbeddingModel`, go to [dashboard.cohere.com/api-keys](https://dashboard.cohere.com/api-keys) and follow your nose until you find the place to generate an API key. Once you have the API key, you can set it as an environment variable:

Terminal

```bash
export CO_API_KEY='your-api-key'
```

You can then use the model:

cohere\_embeddings.py

```python
from pydantic_ai import Embedder

embedder = Embedder('cohere:embed-v4.0')


async def main():
    result = await embedder.embed_query('Hello world')
    print(len(result.embeddings[0]))
    #> 1024
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

See the [Cohere Embed documentation](https://docs.cohere.com/docs/cohere-embed) for available models.

#### Cohere-Specific Settings

Cohere models support additional settings via [`CohereEmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.cohere.CohereEmbeddingSettings):

cohere\_settings.py

```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings.cohere import CohereEmbeddingSettings

embedder = Embedder(
    'cohere:embed-v4.0',
    settings=CohereEmbeddingSettings(
        dimensions=512,
        cohere_truncate='END',  # Truncate long inputs instead of erroring
        cohere_max_tokens=256,  # Limit tokens per input
    ),
)
```

### VoyageAI

[`VoyageAIEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.voyageai.VoyageAIEmbeddingModel) provides access to VoyageAI's embedding models, which are optimized for retrieval with specialized models for code, finance, and legal domains.

#### Install

To use VoyageAI embedding models, you need to install `pydantic-ai-slim` with the `voyageai` optional group:

-   [pip](#tab-panel-32)
-   [uv](#tab-panel-33)

Terminal

```bash
pip install "pydantic-ai-slim[voyageai]"
```

Terminal

```bash
uv add "pydantic-ai-slim[voyageai]"
```

#### Configuration

To use `VoyageAIEmbeddingModel`, go to [dash.voyageai.com](https://dash.voyageai.com/) to generate an API key. Once you have the API key, you can set it as an environment variable:

Terminal

```bash
export VOYAGE_API_KEY='your-api-key'
```

You can then use the model:

voyageai\_embeddings.py

```python
from pydantic_ai import Embedder

embedder = Embedder('voyageai:voyage-3.5')


async def main():
    result = await embedder.embed_query('Hello world')
    print(len(result.embeddings[0]))
    #> 1024
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

See the [VoyageAI Embeddings documentation](https://docs.voyageai.com/docs/embeddings) for available models.

#### VoyageAI-Specific Settings

VoyageAI models support additional settings via [`VoyageAIEmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.voyageai.VoyageAIEmbeddingSettings):

voyageai\_settings.py

```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings.voyageai import VoyageAIEmbeddingSettings

embedder = Embedder(
    'voyageai:voyage-3.5',
    settings=VoyageAIEmbeddingSettings(
        dimensions=512,  # Reduce output dimensions
        voyageai_input_type='document',  # Override input type for all requests
    ),
)
```

### Bedrock

[`BedrockEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.bedrock.BedrockEmbeddingModel) provides access to embedding models through AWS Bedrock, including Amazon Titan, Cohere, and Amazon Nova models.

#### Install

To use Bedrock embedding models, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `bedrock` optional group:

-   [pip](#tab-panel-34)
-   [uv](#tab-panel-35)

Terminal

```bash
pip install "pydantic-ai-slim[bedrock]"
```

Terminal

```bash
uv add "pydantic-ai-slim[bedrock]"
```

#### Configuration

Authentication with AWS Bedrock uses standard AWS credentials. See the [Bedrock provider documentation](/docs/ai/models/bedrock#environment-variables) for details on configuring credentials via environment variables, AWS credentials file, or IAM roles.

Ensure your AWS account has access to the Bedrock embedding models you want to use. See [AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html) for details.

#### Basic Usage

bedrock\_embeddings.py

```python
from pydantic_ai import Embedder

# Using Amazon Titan
embedder = Embedder('bedrock:amazon.titan-embed-text-v2:0')


async def main():
    result = await embedder.embed_query('Hello world')
    print(len(result.embeddings[0]))
    #> 1024
```

_(This example requires AWS credentials configured)_

#### Supported Models

Bedrock supports three families of embedding models. See the [AWS Bedrock documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html) for the full list of available models.

**Amazon Titan:**

-   `amazon.titan-embed-text-v1` -- 1536 dimensions (fixed), 8K tokens
-   `amazon.titan-embed-text-v2:0` -- 256/384/1024 dimensions (configurable, default: 1024), 8K tokens

**Cohere Embed:**

-   `cohere.embed-english-v3` -- English-only, 1024 dimensions (fixed), 512 tokens
-   `cohere.embed-multilingual-v3` -- Multilingual, 1024 dimensions (fixed), 512 tokens
-   `cohere.embed-v4:0` -- 256/512/1024/1536 dimensions (configurable, default: 1536), 128K tokens

**Amazon Nova:**

-   `amazon.nova-2-multimodal-embeddings-v1:0` -- 256/384/1024/3072 dimensions (configurable, default: 3072), 8K tokens

#### Titan-Specific Settings

Titan v2 supports vector normalization for direct similarity calculations via `bedrock_titan_normalize` (default: `True`). Titan v1 does not support this setting.

bedrock\_titan.py

```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings.bedrock import BedrockEmbeddingSettings

embedder = Embedder(
    'bedrock:amazon.titan-embed-text-v2:0',
    settings=BedrockEmbeddingSettings(
        dimensions=512,
        bedrock_titan_normalize=True,
    ),
)
```

Note

Titan models do not support the `truncate` setting. The `dimensions` setting is only supported by Titan v2.

#### Cohere-Specific Settings

Cohere models on Bedrock support additional settings via [`BedrockEmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.bedrock.BedrockEmbeddingSettings):

-   `bedrock_cohere_input_type` -- By default, `embed_query()` uses `'search_query'` and `embed_documents()` uses `'search_document'`. Also accepts `'classification'` or `'clustering'`.
-   `bedrock_cohere_truncate` -- Fine-grained truncation control: `'NONE'` (default, error on overflow), `'START'`, or `'END'`. Overrides the base `truncate` setting.
-   `bedrock_cohere_max_tokens` -- Limits tokens per input (default: 128000). Only supported by Cohere v4.

bedrock\_cohere.py

```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings.bedrock import BedrockEmbeddingSettings

embedder = Embedder(
    'bedrock:cohere.embed-v4:0',
    settings=BedrockEmbeddingSettings(
        dimensions=512,
        bedrock_cohere_max_tokens=1000,
        bedrock_cohere_truncate='END',
    ),
)
```

Note

The `dimensions` and `bedrock_cohere_max_tokens` settings are only supported by Cohere v4. Cohere v3 models have fixed 1024 dimensions.

#### Nova-Specific Settings

Nova models on Bedrock support additional settings via [`BedrockEmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.bedrock.BedrockEmbeddingSettings):

-   `bedrock_nova_truncate` -- Fine-grained truncation control: `'NONE'` (default, error on overflow), `'START'`, or `'END'`. Overrides the base `truncate` setting.
-   `bedrock_nova_embedding_purpose` -- By default, `embed_query()` uses `'GENERIC_RETRIEVAL'` and `embed_documents()` uses `'GENERIC_INDEX'`. Also accepts `'TEXT_RETRIEVAL'`, `'CLASSIFICATION'`, or `'CLUSTERING'`.

bedrock\_nova.py

```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings.bedrock import BedrockEmbeddingSettings

embedder = Embedder(
    'bedrock:amazon.nova-2-multimodal-embeddings-v1:0',
    settings=BedrockEmbeddingSettings(
        dimensions=1024,
        bedrock_nova_embedding_purpose='TEXT_RETRIEVAL',
        truncate=True,
    ),
)
```

#### Concurrency Settings

Models that don't support batch embedding (Titan and Nova) make individual API requests for each input text. By default, these requests run concurrently with a maximum of 5 parallel requests.

You can adjust this with the `bedrock_max_concurrency` setting:

bedrock\_concurrency.py

```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings.bedrock import BedrockEmbeddingSettings

# Increase concurrency for faster throughput
embedder = Embedder(
    'bedrock:amazon.titan-embed-text-v2:0',
    settings=BedrockEmbeddingSettings(bedrock_max_concurrency=10),
)

# Or reduce concurrency to avoid rate limits
embedder = Embedder(
    'bedrock:amazon.nova-2-multimodal-embeddings-v1:0',
    settings=BedrockEmbeddingSettings(bedrock_max_concurrency=2),
)
```

#### Regional Prefixes (Cross-Region Inference)

Bedrock supports cross-region inference using geographic prefixes like `us.`, `eu.`, or `apac.`:

bedrock\_regional.py

```python
from pydantic_ai import Embedder

embedder = Embedder('bedrock:us.amazon.titan-embed-text-v2:0')
```

#### Using AWS Application Inference Profiles

Set [`bedrock_inference_profile`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.bedrock.BedrockEmbeddingSettings.bedrock_inference_profile) to route requests through an inference profile while keeping the base model name for detecting model capabilities:

bedrock\_inference\_profile.py

```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings.bedrock import BedrockEmbeddingModel
from pydantic_ai.providers.bedrock import BedrockProvider

provider = BedrockProvider(region_name='us-east-1')

model = BedrockEmbeddingModel(
    'amazon.titan-embed-text-v2:0',
    provider=provider,
    settings={
        'bedrock_inference_profile': 'arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/my-embed-profile',
    },
)
embedder = Embedder(model)
```

#### Using a Custom Provider

For advanced configuration like explicit credentials or a custom boto3 client, you can create a [`BedrockProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.bedrock.BedrockProvider) directly. See the [Bedrock provider documentation](/docs/ai/models/bedrock#provider-argument) for more details.

bedrock\_provider.py

```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings.bedrock import BedrockEmbeddingModel
from pydantic_ai.providers.bedrock import BedrockProvider

provider = BedrockProvider(
    region_name='us-west-2',
    aws_access_key_id='your-access-key',
    aws_secret_access_key='your-secret-key',
)

model = BedrockEmbeddingModel('amazon.titan-embed-text-v2:0', provider=provider)
embedder = Embedder(model)
```

Token Counting

Bedrock embedding models do not support the `count_tokens()` method because AWS Bedrock's token counting API only works with text generation models (Claude, Llama, etc.), not embedding models. Calling `count_tokens()` will raise `NotImplementedError`.

### Sentence Transformers (Local)

[`SentenceTransformerEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.sentence_transformers.SentenceTransformerEmbeddingModel) runs embeddings locally using the [sentence-transformers](https://www.sbert.net/) library, giving you access to the thousands of [embedding models on Hugging Face](https://huggingface.co/models?library=sentence-transformers) without any API calls. This is ideal for:

-   **Privacy** -- Data never leaves your infrastructure
-   **Cost** -- No API charges for high-volume workloads
-   **Offline use** -- No internet connection required after model download
-   **Specialized domains or languages** - Pick models trained for code, multilingual, biomedical, legal, etc. from the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard)

#### Install

To use Sentence Transformers embedding models, you need to install `pydantic-ai-slim` with the `sentence-transformers` optional group:

-   [pip](#tab-panel-36)
-   [uv](#tab-panel-37)

Terminal

```bash
pip install "pydantic-ai-slim[sentence-transformers]"
```

Terminal

```bash
uv add "pydantic-ai-slim[sentence-transformers]"
```

#### Usage

sentence\_transformers\_embeddings.py

```python
from pydantic_ai import Embedder

# Model is downloaded from Hugging Face on first use
embedder = Embedder('sentence-transformers:lightonai/DenseOn')


async def main():
    result = await embedder.embed_query('Hello world')
    print(len(result.embeddings[0]))
    #> 768
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

[`lightonai/DenseOn`](https://huggingface.co/lightonai/DenseOn) is a strong recent 149M-parameter general-purpose model that encodes queries and documents asymmetrically: [`embed_query()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.embed_query) and [`embed_documents()`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.Embedder.embed_documents) automatically apply the model's `query:` / `document:` prompts. See the [Sentence Transformers pretrained models](https://www.sbert.net/docs/sentence_transformer/pretrained_models.html) documentation and the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard) for more options; see also [Choosing a model](#choosing-a-model) above.

#### Device Selection

Control which device to use for inference:

sentence\_transformers\_device.py

```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings.sentence_transformers import (
    SentenceTransformersEmbeddingSettings,
)

embedder = Embedder(
    'sentence-transformers:sentence-transformers/all-MiniLM-L6-v2',
    settings=SentenceTransformersEmbeddingSettings(
        sentence_transformers_device='cuda',  # Use GPU
        sentence_transformers_normalize_embeddings=True,  # L2 normalize
    ),
)
```

#### Using an Existing Model Instance

If you need more control over model initialization:

sentence\_transformers\_instance.py

```python
from sentence_transformers import SentenceTransformer

from pydantic_ai import Embedder
from pydantic_ai.embeddings.sentence_transformers import (
    SentenceTransformerEmbeddingModel,
)

# Create and configure the model yourself
st_model = SentenceTransformer('microsoft/harrier-oss-v1-270m', device='cpu')

# Wrap it for use with Pydantic AI
model = SentenceTransformerEmbeddingModel(st_model)
embedder = Embedder(model)
```

## Settings

[`EmbeddingSettings`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingSettings) provides common configuration options that work across providers:

-   `dimensions`: Reduce the output embedding dimensions (supported by OpenAI, Google, Cohere, Bedrock, VoyageAI)
-   `truncate`: When `True`, truncate input text that exceeds the model's context length instead of raising an error (supported by Cohere, Bedrock, VoyageAI)

Settings can be specified at the embedder level (applied to all calls) or per-call:

embedding\_settings.py

```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings import EmbeddingSettings

# Default settings for all calls
embedder = Embedder(
    'openai:text-embedding-3-small',
    settings=EmbeddingSettings(dimensions=512),
)


async def main():
    # Override for a specific call
    result = await embedder.embed_query(
        'Hello world',
        settings=EmbeddingSettings(dimensions=256),
    )
    print(len(result.embeddings[0]))
    #> 256
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

## Token Counting

You can check token counts before embedding to avoid exceeding model limits:

token\_counting.py

```python
from pydantic_ai import Embedder

embedder = Embedder('openai:text-embedding-3-small')


async def main():
    text = 'Hello world, this is a test.'

    # Count tokens in text
    token_count = await embedder.count_tokens(text)
    print(f'Tokens: {token_count}')
    #> Tokens: 7

    # Check model's maximum input tokens (returns None if unknown)
    max_tokens = await embedder.max_input_tokens()
    print(f'Max tokens: {max_tokens}')
    #> Max tokens: 1024
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

## Testing

Use [`TestEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.TestEmbeddingModel) for testing without making API calls:

testing\_embeddings.py

```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings import TestEmbeddingModel


async def test_my_rag_system():
    embedder = Embedder('openai:text-embedding-3-small')
    test_model = TestEmbeddingModel()

    with embedder.override(model=test_model):
        result = await embedder.embed_query('test query')

        # TestEmbeddingModel returns deterministic embeddings
        assert result.embeddings[0] == [1.0] * 8

        # Check what settings were used
        assert test_model.last_settings is not None
```

## Instrumentation

Enable OpenTelemetry instrumentation for debugging and monitoring:

instrumented\_embeddings.py

```python
import logfire

from pydantic_ai import Embedder

logfire.configure()

# Instrument a specific embedder
embedder = Embedder('openai:text-embedding-3-small', instrument=True)

# Or instrument all embedders globally
Embedder.instrument_all()
```

See the [Debugging and Monitoring guide](/docs/ai/integrations/logfire) for more details on using Logfire with Pydantic AI.

## Two-stage retrieval with rerankers

For high-quality retrieval, a common pattern is **two-stage**: first use an embedding model to pull a broad shortlist of candidates cheaply, then use a **cross-encoder reranker** to score each candidate against the query more precisely. The cross-encoder reads the query and document _together_, so it's slower than an embedding lookup but dramatically more accurate, making it ideal for narrowing a top-100 recall list down to the top-5 results you actually hand to the LLM.

Pydantic AI does not ship a reranker provider class, so you bring your own. The most common local option is a `CrossEncoder` from `sentence-transformers`:

rerank.py

```python
import asyncio
from functools import cache

from sentence_transformers import CrossEncoder


@cache
def get_reranker() -> CrossEncoder:
    # Loaded lazily on first call, then reused.
    return CrossEncoder('cross-encoder/ms-marco-MiniLM-L6-v2')


async def rerank(query: str, candidates: list[str], top_k: int = 3) -> list[str]:
    """Rerank retrieval candidates by relevance to `query`."""
    reranker = get_reranker()
    # CrossEncoder.rank is blocking, so run it off the event loop.
    ranked = await asyncio.to_thread(
        reranker.rank, query, candidates, top_k=top_k, return_documents=True
    )
    return [item['text'] for item in ranked]
```

Call `rerank()` on the candidates returned by your vector search (for example, in the `retrieve` tool of the [RAG example](/docs/ai/examples/rag)) before handing the results to the LLM.

Managed reranker alternatives

If you'd rather not run a reranker locally, several providers offer hosted rerankers, including [Cohere Rerank](https://docs.cohere.com/docs/rerank-overview), [VoyageAI Rerank](https://docs.voyageai.com/docs/reranker), and [Jina Rerank](https://jina.ai/reranker). Call their HTTP clients or SDKs from a helper function with the same shape as `rerank()` above.

## Building Custom Embedding Models

To integrate a custom embedding provider, subclass [`EmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.EmbeddingModel):

custom\_embedding\_model.py

```python
from collections.abc import Sequence

from pydantic_ai.embeddings import EmbeddingModel, EmbeddingResult, EmbeddingSettings
from pydantic_ai.embeddings.result import EmbedInputType


class MyCustomEmbeddingModel(EmbeddingModel):
    @property
    def model_name(self) -> str:
        return 'my-custom-model'

    @property
    def system(self) -> str:
        return 'my-provider'

    async def embed(
        self,
        inputs: str | Sequence[str],
        *,
        input_type: EmbedInputType,
        settings: EmbeddingSettings | None = None,
    ) -> EmbeddingResult:
        inputs, settings = self.prepare_embed(inputs, settings)

        # Call your embedding API here
        embeddings = [[0.1, 0.2, 0.3] for _ in inputs]  # Placeholder

        return EmbeddingResult(
            embeddings=embeddings,
            inputs=inputs,
            input_type=input_type,
            model_name=self.model_name,
            provider_name=self.system,
        )
```

Use [`WrapperEmbeddingModel`](/docs/ai/api/pydantic-ai/embeddings/#pydantic_ai.embeddings.WrapperEmbeddingModel) if you want to wrap an existing model to add custom behavior like caching or logging.

---

# [Extensibility](https://pydantic.dev/docs/ai/guides/extensibility/)

# Extensibility

Pydantic AI is designed to be extended. [Capabilities](/docs/ai/core-concepts/capabilities) are the primary extension point -- they bundle tools, lifecycle hooks, instructions, and model settings into reusable units that can be shared across agents, packaged as libraries, and loaded from [spec files](/docs/ai/core-concepts/agent-spec).

Beyond capabilities, Pydantic AI provides several other extension mechanisms for specialized needs.

## Capabilities

Capabilities are the recommended way to extend Pydantic AI. They are useful for:

-   **Teams** building reusable internal agent components (guardrails, audit logging, authentication)
-   **Package authors** shipping extensions that work across models and agents
-   **Community contributors** sharing solutions to common problems

See [Capabilities](/docs/ai/core-concepts/capabilities) for using and building capabilities, and [Hooks](/docs/ai/core-concepts/hooks) for the lightweight decorator-based approach.

Tip

If you want to contribute a capability, open an issue on [**Pydantic AI Harness**](https://github.com/pydantic/pydantic-ai-harness) rather than on pydantic-ai. Most capabilities belong in the harness -- see [What goes where?](/docs/ai/harness/overview#what-goes-where) for the distinction.

## Publishing capability packages

To make a capability installable and usable in [agent specs](/docs/ai/core-concepts/agent-spec):

1.  **Implement [`get_serialization_name()`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_serialization_name)** -- defaults to the class name. Return `None` to opt out of spec support.
    
2.  **Implement [`from_spec()`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.from_spec)** -- defaults to `cls(*args, **kwargs)`. Override when your constructor takes non-serializable types.
    
3.  **Package naming** -- use the `pydantic-ai-` prefix (e.g. `pydantic-ai-guardrails`) so users can find your package.
    
4.  **Registration** -- users pass custom capability types via `custom_capability_types` on [`Agent.from_spec`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.from_spec) or [`Agent.from_file`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.from_file).
    

```python
from pydantic_ai import Agent

from my_package import MyCapability

agent = Agent.from_file('agent.yaml', custom_capability_types=[MyCapability])
```

See [Custom capabilities in specs](/docs/ai/core-concepts/agent-spec#custom-capabilities-in-specs) for implementation details.

## Pydantic AI Harness

[**Pydantic AI Harness**](/docs/ai/harness/overview) is the official capability library for Pydantic AI -- standalone capabilities like memory, guardrails, and context management live there rather than in core. See [What goes where?](/docs/ai/harness/overview#what-goes-where) for the full breakdown, or jump to the [capability matrix](https://github.com/pydantic/pydantic-ai-harness#capability-matrix).

## Third-party ecosystem

### Capabilities

[Capabilities](/docs/ai/core-concepts/capabilities) are the recommended extension mechanism for packages that need to bundle tools with hooks, instructions, or model settings. See [Third-party capabilities](/docs/ai/core-concepts/capabilities#third-party-capabilities) for community packages.

### Toolsets

Many third-party extensions are available as [toolsets](/docs/ai/tools-toolsets/toolsets), which can also be wrapped as [capabilities](/docs/ai/core-concepts/capabilities) to take advantage of hooks, instructions, and model settings. See [Third-party toolsets](/docs/ai/tools-toolsets/toolsets#third-party-toolsets) for the full list.

## Other extension points

### Custom toolsets

For specialized tool execution needs (custom transport, tool filtering, execution wrapping), implement [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset) or subclass [`WrapperToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.WrapperToolset):

-   [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset) -- full control over tool definitions and execution
-   [`WrapperToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.WrapperToolset) -- delegates to a wrapped toolset, override specific methods

See [Building a Custom Toolset](/docs/ai/tools-toolsets/toolsets#building-a-custom-toolset) for details.

Tip

If your toolset also needs to provide instructions, model settings, or hooks, consider building a [custom capability](/docs/ai/core-concepts/capabilities#building-custom-capabilities) instead.

### Custom models

For connecting to model providers not yet supported by Pydantic AI, implement [`Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model):

-   [`Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) -- the base interface for model implementations
-   [`WrapperModel`](/docs/ai/api/models/wrapper/#pydantic_ai.models.wrapper.WrapperModel) -- delegates to a wrapped model, useful for adding instrumentation or transformations

See [Custom Models](/docs/ai/models/overview#custom-models) for details.

### Custom agents

For custom agent behavior, subclass [`AbstractAgent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent) or [`WrapperAgent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.WrapperAgent):

-   [`AbstractAgent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent) -- the base interface for agent implementations, providing `run`, `run_sync`, and `run_stream`
-   [`WrapperAgent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.WrapperAgent) -- delegates to a wrapped agent, useful for adding pre/post-processing or context management

---

# [Multi-Agent Patterns](https://pydantic.dev/docs/ai/guides/multi-agent-applications/)

# Multi-Agent Patterns

There are roughly five levels of complexity when building applications with Pydantic AI:

1.  Single agent workflows -- what most of the `pydantic_ai` documentation covers
2.  [Agent delegation](#agent-delegation) -- agents using another agent via tools
3.  [Programmatic agent hand-off](#programmatic-agent-hand-off) -- one agent runs, then application code calls another agent
4.  [Graph based control flow](/docs/ai/graph/) -- for the most complex cases, a graph-based state machine can be used to control the execution of multiple agents
5.  [Deep Agents](#deep-agents) -- autonomous agents with planning, file operations, task delegation, and sandboxed code execution

Of course, you can combine multiple strategies in a single application.

## Agent delegation

"Agent delegation" refers to the scenario where an agent delegates work to another agent, then takes back control when the delegate agent (the agent called from within a tool) finishes. If you want to hand off control to another agent completely, without coming back to the first agent, you can use an [output function](/docs/ai/core-concepts/output#output-functions).

Since agents are stateless and designed to be global, you do not need to include the agent itself in agent [dependencies](/docs/ai/core-concepts/dependencies).

You'll generally want to pass [`ctx.usage`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.usage) to the [`usage`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run) keyword argument of the delegate agent run so usage within that run counts towards the total usage of the parent agent run.

Multiple models

Agent delegation doesn't need to use the same model for each agent. If you choose to use different models within a run, calculating the monetary cost from the final [`result.usage()`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.usage) of the run will not be possible, but you can still use [`UsageLimits`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.UsageLimits) -- including `request_limit`, `total_tokens_limit`, and `tool_calls_limit` -- to avoid unexpected costs or runaway tool loops.

agent\_delegation\_simple.py

```python
from pydantic_ai import Agent, RunContext, UsageLimits

joke_selection_agent = Agent(  # (1)
  'openai:gpt-5.2',
  instructions=(
      'Use the `joke_factory` to generate some jokes, then choose the best. '
      'You must return just a single joke.'
  ),
)
joke_generation_agent = Agent(  # (2)
  'google-gla:gemini-3-flash-preview', output_type=list[str]
)


@joke_selection_agent.tool
async def joke_factory(ctx: RunContext[None], count: int) -> list[str]:
  r = await joke_generation_agent.run(  # (3)
      f'Please generate {count} jokes.',
      usage=ctx.usage,  # (4)
  )
  return r.output  # (5)


result = joke_selection_agent.run_sync(
  'Tell me a joke.',
  usage_limits=UsageLimits(request_limit=5, total_tokens_limit=500),
)
print(result.output)
#> Did you hear about the toothpaste scandal? They called it Colgate.
print(result.usage())
#> RunUsage(input_tokens=165, output_tokens=24, requests=3, tool_calls=1)
```

The "parent" or controlling agent.

The "delegate" agent, which is called from within a tool of the parent agent.

Call the delegate agent from within a tool of the parent agent.

Pass the usage from the parent agent to the delegate agent so the final [`result.usage()`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult.usage) includes the usage from both agents.

Since the function returns `list[str]`, and the `output_type` of `joke_generation_agent` is also `list[str]`, we can simply return `r.output` from the tool.

_(This example is complete, it can be run "as is")_

The control flow for this example is pretty simple and can be summarised as follows:

```mermaid
graph TD
  START --> joke_selection_agent
  joke_selection_agent --> joke_factory["joke_factory (tool)"]
  joke_factory --> joke_generation_agent
  joke_generation_agent --> joke_factory
  joke_factory --> joke_selection_agent
  joke_selection_agent --> END
```

### Agent delegation and dependencies

Generally the delegate agent needs to either have the same [dependencies](/docs/ai/core-concepts/dependencies) as the calling agent, or dependencies which are a subset of the calling agent's dependencies.

Initializing dependencies

We say "generally" above since there's nothing to stop you initializing dependencies within a tool call and therefore using interdependencies in a delegate agent that are not available on the parent, this should often be avoided since it can be significantly slower than reusing connections etc. from the parent agent.

agent\_delegation\_deps.py

```python
from dataclasses import dataclass

import httpx

from pydantic_ai import Agent, RunContext


@dataclass
class ClientAndKey:  # (1)
  http_client: httpx.AsyncClient
  api_key: str


joke_selection_agent = Agent(
  'openai:gpt-5.2',
  deps_type=ClientAndKey,  # (2)
  instructions=(
      'Use the `joke_factory` tool to generate some jokes on the given subject, '
      'then choose the best. You must return just a single joke.'
  ),
)
joke_generation_agent = Agent(
  'google-gla:gemini-3-flash-preview',
  deps_type=ClientAndKey,  # (4)
  output_type=list[str],
  instructions=(
      'Use the "get_jokes" tool to get some jokes on the given subject, '
      'then extract each joke into a list.'
  ),
)


@joke_selection_agent.tool
async def joke_factory(ctx: RunContext[ClientAndKey], count: int) -> list[str]:
  r = await joke_generation_agent.run(
      f'Please generate {count} jokes.',
      deps=ctx.deps,  # (3)
      usage=ctx.usage,
  )
  return r.output


@joke_generation_agent.tool  # (5)
async def get_jokes(ctx: RunContext[ClientAndKey], count: int) -> str:
  response = await ctx.deps.http_client.get(
      'https://example.com',
      params={'count': count},
      headers={'Authorization': f'Bearer {ctx.deps.api_key}'},
  )
  response.raise_for_status()
  return response.text


async def main():
  async with httpx.AsyncClient() as client:
      deps = ClientAndKey(client, 'foobar')
      result = await joke_selection_agent.run('Tell me a joke.', deps=deps)
      print(result.output)
      #> Did you hear about the toothpaste scandal? They called it Colgate.
      print(result.usage())  # (6)
      #> RunUsage(input_tokens=220, output_tokens=32, requests=4, tool_calls=2)
```

Define a dataclass to hold the client and API key dependencies.

Set the `deps_type` of the calling agent -- `joke_selection_agent` here.

Pass the dependencies to the delegate agent's run method within the tool call.

Also set the `deps_type` of the delegate agent -- `joke_generation_agent` here.

Define a tool on the delegate agent that uses the dependencies to make an HTTP request.

Usage now includes 4 requests -- 2 from the calling agent and 2 from the delegate agent.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

This example shows how even a fairly simple agent delegation can lead to a complex control flow:

```mermaid
graph TD
  START --> joke_selection_agent
  joke_selection_agent --> joke_factory["joke_factory (tool)"]
  joke_factory --> joke_generation_agent
  joke_generation_agent --> get_jokes["get_jokes (tool)"]
  get_jokes --> http_request["HTTP request"]
  http_request --> get_jokes
  get_jokes --> joke_generation_agent
  joke_generation_agent --> joke_factory
  joke_factory --> joke_selection_agent
  joke_selection_agent --> END
```

## Programmatic agent hand-off

"Programmatic agent hand-off" refers to the scenario where multiple agents are called in succession, with application code and/or a human in the loop responsible for deciding which agent to call next.

Here agents don't need to use the same deps.

Here we show two agents used in succession, the first to find a flight and the second to extract the user's seat preference.

programmatic\_handoff.py

```python
from typing import Literal

from pydantic import BaseModel, Field
from rich.prompt import Prompt

from pydantic_ai import Agent, ModelMessage, RunContext, RunUsage, UsageLimits


class FlightDetails(BaseModel):
  flight_number: str


class Failed(BaseModel):
  """Unable to find a satisfactory choice."""


flight_search_agent = Agent[None, FlightDetails | Failed](  # (1)
  'openai:gpt-5.2',
  output_type=FlightDetails | Failed,  # type: ignore
  instructions=(
      'Use the "flight_search" tool to find a flight '
      'from the given origin to the given destination.'
  ),
)


@flight_search_agent.tool  # (2)
async def flight_search(
  ctx: RunContext[None], origin: str, destination: str
) -> FlightDetails | None:
  # in reality, this would call a flight search API or
  # use a browser to scrape a flight search website
  return FlightDetails(flight_number='AK456')


usage_limits = UsageLimits(request_limit=15)  # (3)


async def find_flight(usage: RunUsage) -> FlightDetails | None:  # (4)
  message_history: list[ModelMessage] | None = None
  for _ in range(3):
      prompt = Prompt.ask(
          'Where would you like to fly from and to?',
      )
      result = await flight_search_agent.run(
          prompt,
          message_history=message_history,
          usage=usage,
          usage_limits=usage_limits,
      )
      if isinstance(result.output, FlightDetails):
          return result.output
      else:
          message_history = result.all_messages(
              output_tool_return_content='Please try again.'
          )


class SeatPreference(BaseModel):
  row: int = Field(ge=1, le=30)
  seat: Literal['A', 'B', 'C', 'D', 'E', 'F']


# This agent is responsible for extracting the user's seat selection
seat_preference_agent = Agent[None, SeatPreference | Failed](  # (5)
  'openai:gpt-5.2',
  output_type=SeatPreference | Failed,  # type: ignore
  instructions=(
      "Extract the user's seat preference. "
      'Seats A and F are window seats. '
      'Row 1 is the front row and has extra leg room. '
      'Rows 14, and 20 also have extra leg room. '
  ),
)


async def find_seat(usage: RunUsage) -> SeatPreference:  # (6)
  message_history: list[ModelMessage] | None = None
  while True:
      answer = Prompt.ask('What seat would you like?')

      result = await seat_preference_agent.run(
          answer,
          message_history=message_history,
          usage=usage,
          usage_limits=usage_limits,
      )
      if isinstance(result.output, SeatPreference):
          return result.output
      else:
          print('Could not understand seat preference. Please try again.')
          message_history = result.all_messages()


async def main():  # (7)
  usage: RunUsage = RunUsage()

  opt_flight_details = await find_flight(usage)
  if opt_flight_details is not None:
      print(f'Flight found: {opt_flight_details.flight_number}')
      #> Flight found: AK456
      seat_preference = await find_seat(usage)
      print(f'Seat preference: {seat_preference}')
      #> Seat preference: row=1 seat='A'
```

Define the first agent, which finds a flight. We use an explicit type annotation until [PEP-747](https://peps.python.org/pep-0747/) lands, see [structured output](/docs/ai/core-concepts/output#structured-output). We use a union as the output type so the model can communicate if it's unable to find a satisfactory choice; internally, each member of the union will be registered as a separate tool.

Define a tool on the agent to find a flight. In this simple case we could dispense with the tool and just define the agent to return structured data, then search for a flight, but in more complex scenarios the tool would be necessary.

Define usage limits for the entire app.

Define a function to find a flight, which asks the user for their preferences and then calls the agent to find a flight.

As with `flight_search_agent` above, we use an explicit type annotation to define the agent.

Define a function to find the user's seat preference, which asks the user for their seat preference and then calls the agent to extract the seat preference.

Now that we've put our logic for running each agent into separate functions, our main app becomes very simple.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

The control flow for this example can be summarised as follows:

```mermaid
graph TB
  START --> ask_user_flight["ask user for flight"]

  subgraph find_flight
    flight_search_agent --> ask_user_flight
    ask_user_flight --> flight_search_agent
  end

  flight_search_agent --> ask_user_seat["ask user for seat"]
  flight_search_agent --> END

  subgraph find_seat
    seat_preference_agent --> ask_user_seat
    ask_user_seat --> seat_preference_agent
  end

  seat_preference_agent --> END
```

## Pydantic Graphs

See the [graph](/docs/ai/graph/) documentation on when and how to use graphs.

## Deep Agents

Deep agents are autonomous agents that combine multiple architectural patterns and capabilities to handle complex, multi-step tasks reliably. These patterns can be implemented using Pydantic AI's built-in features and (third-party) toolsets:

-   **Planning and progress tracking** -- agents break down complex tasks into steps and track their progress, giving users visibility into what the agent is working on. See [Task Management toolsets](/docs/ai/tools-toolsets/toolsets#task-management).
-   **File system operations** -- reading, writing, and editing files with proper abstraction layers that work across in-memory storage, real file systems, and sandboxed containers. See [File Operations toolsets](/docs/ai/tools-toolsets/toolsets#file-operations).
-   **Task delegation** -- spawning specialized sub-agents for specific tasks, with isolated context to prevent recursive delegation issues. See [Agent Delegation](#agent-delegation) above.
-   **Sandboxed code execution** -- running AI-generated code in isolated environments (typically Docker containers) to prevent accidents. See [Code Execution toolsets](/docs/ai/tools-toolsets/toolsets#code-execution).
-   **Context management** -- automatic conversation summarization to handle long sessions that would otherwise exceed token limits. See [Processing Message History](/docs/ai/core-concepts/message-history#processing-message-history).
-   **Human-in-the-loop** -- approval workflows for dangerous operations like code execution or file deletion. See [Requiring Tool Approval](/docs/ai/tools-toolsets/toolsets#requiring-tool-approval).
-   **Durable execution** -- preserving agent state across transient API failures and application errors or restarts. See [Durable Execution](/docs/ai/integrations/durable_execution/overview).

In addition, the community maintains packages that bring these concepts together in a more opinionated way:

-   [`pydantic-deep`](https://github.com/vstorm-co/pydantic-deepagents) by [Vstorm](https://vstorm.co/)

## Observing Multi-Agent Systems

Multi-agent systems can be challenging to debug due to their complexity; when multiple agents interact, understanding the flow of execution becomes essential.

### Tracing Agent Delegation

With [Logfire](/docs/ai/integrations/logfire), you can trace the entire flow across multiple agents:

```python
import logfire

logfire.configure()
logfire.instrument_pydantic_ai()

# Your multi-agent code here...
```

Logfire shows you:

-   **Which agent handled which part** of the request
-   **Delegation decisions**--when and why one agent called another
-   **End-to-end latency** broken down by agent
-   **Token usage and costs** per agent
-   **What triggered the agent run**--the HTTP request, scheduled job, or user action that started it all
-   **What happened inside tool calls**--database queries, HTTP requests, file operations, and any other instrumented code that tools execute

This is essential for understanding and optimizing complex agent workflows. When something goes wrong in a multi-agent system, you'll see exactly which agent failed and what it was trying to do, and whether the problem was in the agent's reasoning or in the backend systems it called.

### Full-Stack Visibility

If your PydanticAI application includes a TypeScript frontend, API gateway, or services in other languages, Logfire can trace them too--Logfire provides SDKs for Python, JavaScript/TypeScript, and Rust, plus compatibility with any OpenTelemetry-instrumented application. See traces from your entire stack in a unified view. For details on sending data from other languages using standard OpenTelemetry, see the [alternative clients guide](https://logfire.pydantic.dev/docs/how-to-guides/alternative-clients/).

PydanticAI's instrumentation is built on [OpenTelemetry](https://opentelemetry.io/), so you can also use any OTel-compatible backend. See the [Logfire integration guide](/docs/ai/integrations/logfire) for details.

## Examples

The following examples demonstrate how to use multi-agent patterns in Pydantic AI:

-   [Flight booking](/docs/ai/examples/flight-booking)

---

# [Testing](https://pydantic.dev/docs/ai/guides/testing/)

# Testing

Writing unit tests for Pydantic AI code is just like unit tests for any other Python code.

Because for the most part they're nothing new, we have pretty well established tools and patterns for writing and running these kinds of tests.

Unless you're really sure you know better, you'll probably want to follow roughly this strategy:

-   Use [`pytest`](https://docs.pytest.org/en/stable/) as your test harness
-   If you find yourself typing out long assertions, use [inline-snapshot](https://15r10nk.github.io/inline-snapshot/latest/)
-   Similarly, [dirty-equals](https://dirty-equals.helpmanual.io/latest/) can be useful for comparing large data structures
-   Use [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) or [`FunctionModel`](/docs/ai/api/models/function/#pydantic_ai.models.function.FunctionModel) in place of your actual model to avoid the usage, latency and variability of real LLM calls
-   Use [`Agent.override`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.override) to replace an agent's model, dependencies, or toolsets inside your application logic
-   Set [`ALLOW_MODEL_REQUESTS=False`](/docs/ai/api/models/base/#pydantic_ai.models.ALLOW_MODEL_REQUESTS) globally to block any requests from being made to non-test models accidentally

### Unit testing with `TestModel`

The simplest and fastest way to exercise most of your application code is using [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel), this will (by default) call all tools in the agent, then return either plain text or a structured response depending on the return type of the agent.

`TestModel` is not magic

The "clever" (but not too clever) part of `TestModel` is that it will attempt to generate valid structured data for [function tools](/docs/ai/tools-toolsets/tools) and [output types](/docs/ai/core-concepts/output#structured-output) based on the schema of the registered tools.

There's no ML or AI in `TestModel`, it's just plain old procedural Python code that tries to generate data that satisfies the JSON schema of a tool.

The resulting data won't look pretty or relevant, but it should pass Pydantic's validation in most cases. If you want something more sophisticated, use [`FunctionModel`](/docs/ai/api/models/function/#pydantic_ai.models.function.FunctionModel) and write your own data generation logic.

Testing agents with built-in tools

[`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) cannot emulate provider-executed [built-in tools](/docs/ai/tools-toolsets/builtin-tools). If your production agent is configured with `builtin_tools`, override them in tests with `agent.override(model=TestModel(), builtin_tools=[])` unless the test is specifically checking that built-in tools are passed to the model.

Let's write unit tests for the following application code:

weather\_app.py

```python
import asyncio
from datetime import date

from pydantic_ai import Agent, RunContext

from fake_database import DatabaseConn  # (1)
from weather_service import WeatherService  # (2)

weather_agent = Agent(
  'openai:gpt-5.2',
  deps_type=WeatherService,
  instructions='Providing a weather forecast at the locations the user provides.',
)


@weather_agent.tool
def weather_forecast(
  ctx: RunContext[WeatherService], location: str, forecast_date: date
) -> str:
  if forecast_date < date.today():  # (3)
      return ctx.deps.get_historic_weather(location, forecast_date)
  else:
      return ctx.deps.get_forecast(location, forecast_date)


async def run_weather_forecast(  # (4)
  user_prompts: list[tuple[str, int]], conn: DatabaseConn
):
  """Run weather forecast for a list of user prompts and save."""
  async with WeatherService() as weather_service:

      async def run_forecast(prompt: str, user_id: int):
          result = await weather_agent.run(prompt, deps=weather_service)
          await conn.store_forecast(user_id, result.output)

      # run all prompts in parallel
      await asyncio.gather(
          *(run_forecast(prompt, user_id) for (prompt, user_id) in user_prompts)
      )
```

`DatabaseConn` is a class that holds a database connection

`WeatherService` has methods to get weather forecasts and historic data about the weather

We need to call a different endpoint depending on whether the date is in the past or the future, you'll see why this nuance is important below

This function is the code we want to test, together with the agent it uses

Here we have a function that takes a list of `(user_prompt, user_id)` tuples, gets a weather forecast for each prompt, and stores the result in the database.

**We want to test this code without having to mock certain objects or modify our code so we can pass test objects in.**

Here's how we would write tests using [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel):

test\_weather\_app.py

```python
from datetime import timezone
import pytest

from dirty_equals import IsNow, IsStr

from pydantic_ai import models, capture_run_messages, RequestUsage
from pydantic_ai.models.test import TestModel
from pydantic_ai import (
  ModelResponse,
  TextPart,
  ToolCallPart,
  ToolReturnPart,
  UserPromptPart,
  ModelRequest,
)

from fake_database import DatabaseConn
from weather_app import run_weather_forecast, weather_agent

pytestmark = pytest.mark.anyio  # (1)
models.ALLOW_MODEL_REQUESTS = False  # (2)


async def test_forecast():
  conn = DatabaseConn()
  user_id = 1
  with capture_run_messages() as messages:
      with weather_agent.override(model=TestModel()):  # (3)
          prompt = 'What will the weather be like in London on 2024-11-28?'
          await run_weather_forecast([(prompt, user_id)], conn)  # (4)

  forecast = await conn.get_forecast(user_id)
  assert forecast == '{"weather_forecast":"Sunny with a chance of rain"}'  # (5)

  assert messages == [  # (6)
      ModelRequest(
          parts=[
              UserPromptPart(
                  content='What will the weather be like in London on 2024-11-28?',
                  timestamp=IsNow(tz=timezone.utc),  # (7)
              ),
          ],
          instructions='Providing a weather forecast at the locations the user provides.',
          timestamp=IsNow(tz=timezone.utc),
          run_id=IsStr(),
          conversation_id=IsStr(),
      ),
      ModelResponse(
          parts=[
              ToolCallPart(
                  tool_name='weather_forecast',
                  args={
                      'location': 'a',
                      'forecast_date': '2024-01-01',  # (8)
                  },
                  tool_call_id=IsStr(),
              )
          ],
          usage=RequestUsage(
              input_tokens=60,
              output_tokens=7,
          ),
          model_name='test',
          timestamp=IsNow(tz=timezone.utc),
          run_id=IsStr(),
          conversation_id=IsStr(),
      ),
      ModelRequest(
          parts=[
              ToolReturnPart(
                  tool_name='weather_forecast',
                  content='Sunny with a chance of rain',
                  tool_call_id=IsStr(),
                  timestamp=IsNow(tz=timezone.utc),
              ),
          ],
          instructions='Providing a weather forecast at the locations the user provides.',
          timestamp=IsNow(tz=timezone.utc),
          run_id=IsStr(),
          conversation_id=IsStr(),
      ),
      ModelResponse(
          parts=[
              TextPart(
                  content='{"weather_forecast":"Sunny with a chance of rain"}',
              )
          ],
          usage=RequestUsage(
              input_tokens=66,
              output_tokens=16,
          ),
          model_name='test',
          timestamp=IsNow(tz=timezone.utc),
          run_id=IsStr(),
          conversation_id=IsStr(),
      ),
  ]
```

We're using [anyio](https://anyio.readthedocs.io/en/stable/) to run async tests.

This is a safety measure to make sure we don't accidentally make real requests to the LLM while testing, see [`ALLOW_MODEL_REQUESTS`](/docs/ai/api/models/base/#pydantic_ai.models.ALLOW_MODEL_REQUESTS) for more details.

We're using [`Agent.override`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.override) to replace the agent's model with [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel), the nice thing about `override` is that we can replace the model inside agent without needing access to the agent `run*` methods call site.

Now we call the function we want to test inside the `override` context manager.

But default, `TestModel` will return a JSON string summarising the tools calls made, and what was returned. If you wanted to customise the response to something more closely aligned with the domain, you could add [`custom_output_text='Sunny'`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel.custom_output_text) when defining `TestModel`.

So far we don't actually know which tools were called and with which values, we can use [`capture_run_messages`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.capture_run_messages) to inspect messages from the most recent run and assert the exchange between the agent and the model occurred as expected.

The [`IsNow`](https://dirty-equals.helpmanual.io/latest/types/datetime/#dirty_equals.IsNow) helper allows us to use declarative asserts even with data which will contain timestamps that change over time.

`TestModel` isn't doing anything clever to extract values from the prompt, so these values are hardcoded.

### Unit testing with `FunctionModel`

The above tests are a great start, but careful readers will notice that the `WeatherService.get_forecast` is never called since `TestModel` calls `weather_forecast` with a date in the past.

To fully exercise `weather_forecast`, we need to use [`FunctionModel`](/docs/ai/api/models/function/#pydantic_ai.models.function.FunctionModel) to customise how the tools is called.

Here's an example of using `FunctionModel` to test the `weather_forecast` tool with custom inputs

test\_weather\_app2.py

```python
import re

import pytest

from pydantic_ai import models
from pydantic_ai import (
  ModelMessage,
  ModelResponse,
  TextPart,
  ToolCallPart,
)
from pydantic_ai.models.function import AgentInfo, FunctionModel

from fake_database import DatabaseConn
from weather_app import run_weather_forecast, weather_agent

pytestmark = pytest.mark.anyio
models.ALLOW_MODEL_REQUESTS = False


def call_weather_forecast(  # (1)
  messages: list[ModelMessage], info: AgentInfo
) -> ModelResponse:
  if len(messages) == 1:
      # first call, call the weather forecast tool
      user_prompt = messages[0].parts[-1]
      m = re.search(r'd{4}-d{2}-d{2}', user_prompt.content)
      assert m is not None
      args = {'location': 'London', 'forecast_date': m.group()}  # (2)
      return ModelResponse(parts=[ToolCallPart('weather_forecast', args)])
  else:
      # second call, return the forecast
      msg = messages[-1].parts[0]
      assert msg.part_kind == 'tool-return'
      return ModelResponse(parts=[TextPart(f'The forecast is: {msg.content}')])


async def test_forecast_future():
  conn = DatabaseConn()
  user_id = 1
  with weather_agent.override(model=FunctionModel(call_weather_forecast)):  # (3)
      prompt = 'What will the weather be like in London on 2032-01-01?'
      await run_weather_forecast([(prompt, user_id)], conn)

  forecast = await conn.get_forecast(user_id)
  assert forecast == 'The forecast is: Rainy with a chance of sun'
```

We define a function `call_weather_forecast` that will be called by `FunctionModel` in place of the LLM, this function has access to the list of [`ModelMessage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessage)s that make up the run, and [`AgentInfo`](/docs/ai/api/models/function/#pydantic_ai.models.function.AgentInfo) which contains information about the agent and the function tools and return tools.

Our function is slightly intelligent in that it tries to extract a date from the prompt, but just hard codes the location.

We use [`FunctionModel`](/docs/ai/api/models/function/#pydantic_ai.models.function.FunctionModel) to replace the agent's model with our custom function.

### Overriding model via pytest fixtures

If you're writing lots of tests that all require model to be overridden, you can use [pytest fixtures](https://docs.pytest.org/en/6.2.x/fixture.html) to override the model with [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) or [`FunctionModel`](/docs/ai/api/models/function/#pydantic_ai.models.function.FunctionModel) in a reusable way.

Here's an example of a fixture that overrides the model with `TestModel`:

test\_agent.py

```python
import pytest

from pydantic_ai.models.test import TestModel

from weather_app import weather_agent


@pytest.fixture
def override_weather_agent():
    with weather_agent.override(model=TestModel()):
        yield


async def test_forecast(override_weather_agent: None):
    ...
    # test code here
```

---

# [Web Chat UI](https://pydantic.dev/docs/ai/guides/web/)

# Web Chat UI

Pydantic AI includes a built-in web chat interface that you can use to interact with your agents through a browser.

![Web Chat UI](/docs/ai/img/web-chat-ui.png)

For CLI usage with `clai web`, see the [CLI - Web Chat UI documentation](/docs/ai/integrations/cli#web-chat-ui).

Note

The web UI is meant for local development and debugging. In production, you can use one of the [UI Event Stream integrations](/docs/ai/integrations/ui/overview) to connect your agent to a custom frontend.

## Installation

Install the `web` extra (installs Starlette and Uvicorn):

-   [pip](#tab-panel-24)
-   [uv](#tab-panel-25)

Terminal

```bash
pip install 'pydantic-ai-slim[web]'
```

Terminal

```bash
uv add 'pydantic-ai-slim[web]'
```

## Basic Usage

Create a web app from an agent instance using [`Agent.to_web()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.to_web):

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', instructions='You are a helpful assistant.')

@agent.tool_plain
def get_weather(city: str) -> str:
    return f'The weather in {city} is sunny'

app = agent.to_web()
```

Run the app with any ASGI server:

Terminal

```bash
uvicorn my_module:app --host 127.0.0.1 --port 7932
```

## Configuring Models

You can specify additional models to make available in the UI. Models can be provided as a list of model names/instances or a dictionary mapping display labels to model names/instances.

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel

# Model with custom configuration
anthropic_model = AnthropicModel('claude-sonnet-4-5')

agent = Agent('openai:gpt-5.2')

app = agent.to_web(
    models=['openai:gpt-5.2', anthropic_model],
)

# Or with custom display labels
app = agent.to_web(
    models={'GPT 5.2': 'openai:gpt-5.2', 'Claude': anthropic_model},
)
```

## Builtin Tool Support

You can specify a list of [builtin tools](/docs/ai/tools-toolsets/builtin-tools) that will be shown as options to the user, if the selected model supports them:

```python
from pydantic_ai import Agent
from pydantic_ai.builtin_tools import CodeExecutionTool, WebSearchTool

agent = Agent('openai:gpt-5.2')

app = agent.to_web(
    models=['anthropic:claude-sonnet-4-6'],
    builtin_tools=[CodeExecutionTool(), WebSearchTool()],
)
```

Memory Tool

The `memory` builtin tool is not supported via `to_web()` or `clai web`. If your agent needs memory, configure the [`MemoryTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.MemoryTool) directly on the agent at construction time.

## Extra Instructions

You can pass extra instructions that will be included in each agent run:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

app = agent.to_web(instructions='Always respond in a friendly tone.')
```

## Reserved Routes

The web UI app uses the following routes which should not be overwritten:

-   `/` and `/{id}` - Serves the chat UI
-   `/api/chat` - Chat endpoint (POST, OPTIONS)
-   `/api/configure` - Frontend configuration (GET)
-   `/api/health` - Health check (GET)

The app cannot currently be mounted at a subpath (e.g., `/chat`) because the UI expects these routes at the root. You can add additional routes to the app, but avoid conflicts with these reserved paths.

## Custom HTML Source

By default, the web UI is fetched from a CDN and cached locally. You can provide `html_source` to override this for offline usage or enterprise environments.

For offline usage, download the html file once while you have internet access:

```python
from pydantic_ai.ui import DEFAULT_HTML_URL

print(DEFAULT_HTML_URL)  # Use this URL to download the UI HTML file
#> https://cdn.jsdelivr.net/npm/@pydantic/ai-chat-ui@1.2.0/dist/index.html
```

You can then download the file using the URL printed above:

Terminal

```bash
curl -o ~/pydantic-ai-ui.html <chat_ui_url>
```

Then use `html_source` to point to your local file or custom URL:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

# Use a local file (e.g., for offline usage)
app = agent.to_web(html_source='~/pydantic-ai-ui.html')

# Or use a custom URL (e.g., for enterprise environments)
app = agent.to_web(html_source='https://cdn.example.com/ui/index.html')
```

---

# [Code Mode](https://pydantic.dev/docs/ai/harness/code-mode/)

# Code Mode

Code mode is one of the capabilities in [**Pydantic AI Harness**](/docs/ai/harness/overview), the official capability library for Pydantic AI. The full docs live in the [harness repo](https://github.com/pydantic/pydantic-ai-harness) -- this page is a short intro.

[`CodeMode`](https://github.com/pydantic/pydantic-ai-harness/blob/main/pydantic_ai_harness/code_mode/README.md) wraps your tools into a single `run_code` tool powered by our [Monty](https://github.com/pydantic/monty) sandbox. The model writes Python that calls multiple tools with loops, conditionals, variables, and `asyncio.gather` -- all inside one tool call.

Standard tool calling requires one model round-trip per tool call. An agent that needs to fetch 10 items and process each one makes 11+ model calls -- slow, expensive, and context-heavy. Code mode collapses that into one.

## Usage

Terminal

```bash
uv add "pydantic-ai-harness[code-mode]"
```

```python
from pydantic_ai import Agent
from pydantic_ai_harness import CodeMode

agent = Agent('anthropic:claude-sonnet-4-6', capabilities=[CodeMode()])


@agent.tool_plain
def get_weather(city: str) -> dict:
    """Get current weather for a city."""
    return {'city': city, 'temp_f': 72, 'condition': 'sunny'}


@agent.tool_plain
def convert_temp(fahrenheit: float) -> float:
    """Convert Fahrenheit to Celsius."""
    return round((fahrenheit - 32) * 5 / 9, 1)


result = agent.run_sync("What's the weather in Paris and Tokyo, in Celsius?")
print(result.output)
```

The model writes code like:

```python
paris, tokyo = await asyncio.gather(
    get_weather(city='Paris'),
    get_weather(city='Tokyo'),
)
paris_c = await convert_temp(fahrenheit=paris['temp_f'])
tokyo_c = await convert_temp(fahrenheit=tokyo['temp_f'])
{'paris': paris_c, 'tokyo': tokyo_c}
```

## Full documentation

See the [Code Mode README](https://github.com/pydantic/pydantic-ai-harness/blob/main/pydantic_ai_harness/code_mode/README.md) in the harness repo for selective tool sandboxing, metadata-based selection, return value handling, REPL state, observability, sandbox restrictions, the full API, and agent spec usage.

---

# [Overview](https://pydantic.dev/docs/ai/harness/overview/)

# Overview

[**Pydantic AI Harness**](https://github.com/pydantic/pydantic-ai-harness) is the official [capability](/docs/ai/core-concepts/capabilities) library for Pydantic AI -- the batteries for your agent.

Terminal

```bash
uv add pydantic-ai-harness
```

## What goes where?

Pydantic AI core ships the agent loop, model providers, the capabilities/hooks abstraction, and two kinds of capabilities:

-   **Capabilities that require model or framework support** -- anything backed by provider built-in tools (like [image generation](/docs/ai/core-concepts/capabilities#provider-adaptive-tools)), provider-specific APIs (like compaction via the OpenAI or Anthropic APIs), or deep agent graph integration. These go hand-in-hand with model class code and need to ship together.
-   **Capabilities that are fundamental to the agent experience** -- things nearly every agent benefits from, like [web search](/docs/ai/core-concepts/capabilities#provider-adaptive-tools), [tool search](/docs/ai/tools-toolsets/tools-advanced#tool-search), and [thinking](/docs/ai/core-concepts/capabilities#thinking). These feel like qualities of the agent itself, not accessories.

**Pydantic AI Harness** is where everything else lives: standalone capabilities that make specific categories of agents powerful, or that are still finding their final shape. Context management, memory, guardrails, file system access, code execution, multi-agent orchestration -- these are the building blocks you pick and choose based on what your agent needs to do.

The harness is also where new capabilities _start_. It ships as a separate package so capabilities can iterate faster without the strict backward-compatibility requirements of core. As a capability stabilizes and proves itself broadly essential, it can graduate into core -- [code mode](https://github.com/pydantic/pydantic-ai-harness/tree/main/pydantic_ai_harness/code_mode) is an early candidate.

Many capabilities benefit from a "fall up" pattern: they typically start as a local implementation that works with every model, then gain provider-native support that uses the provider's built-in API when available -- auto-switching between the two. This is how [web search](/docs/ai/core-concepts/capabilities#provider-adaptive-tools), [web fetch](/docs/ai/core-concepts/capabilities#provider-adaptive-tools), and [image generation](/docs/ai/core-concepts/capabilities#provider-adaptive-tools) already work in core, and the same approach is coming for skills, code mode, and context compaction.

## What's in it?

See the [capability matrix](https://github.com/pydantic/pydantic-ai-harness#capability-matrix) on the harness README for the full list of what's available and what's coming, along with community alternatives.

## Contributing capabilities

Capability contributions should go to the [harness repo](https://github.com/pydantic/pydantic-ai-harness), not to pydantic-ai. The capabilities abstraction gives contributions clear boundaries, which makes them easier to review. See [Contributing](https://github.com/pydantic/pydantic-ai-harness#contributing) for details.

## Build your own

You can also publish capabilities as standalone packages. See [Building custom capabilities](/docs/ai/core-concepts/capabilities#building-custom-capabilities) for the API and [Publishing capability packages](/docs/ai/guides/extensibility#publishing-capability-packages) for packaging guidance.

## Links

-   [GitHub](https://github.com/pydantic/pydantic-ai-harness)
-   [PyPI](https://pypi.org/project/pydantic-ai-harness/)
-   [Capability matrix](https://github.com/pydantic/pydantic-ai-harness#capability-matrix)
-   [Version policy](https://github.com/pydantic/pydantic-ai-harness#version-policy)
-   [Contributing](https://github.com/pydantic/pydantic-ai-harness#contributing)

---

# [](https://pydantic.dev/docs/ai/)


---

# [Agent2Agent (A2A)](https://pydantic.dev/docs/ai/integrations/a2a/)

# Agent2Agent (A2A)

The [Agent2Agent (A2A) Protocol](https://google.github.io/A2A/) is an open standard introduced by Google that enables communication and interoperability between AI agents, regardless of the framework or vendor they are built on.

At Pydantic, we built the [FastA2A](#fasta2a) library to make it easier to implement the A2A protocol in Python.

We also built a convenience method that expose Pydantic AI agents as A2A servers - let's have a quick look at how to use it:

agent\_to\_a2a.py

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', instructions='Be fun!')
app = agent.to_a2a()
```

_You can run the example with `uvicorn agent_to_a2a:app --host 0.0.0.0 --port 8000`_

This will expose the agent as an A2A server, and you can start sending requests to it.

See more about [exposing Pydantic AI agents as A2A servers](#pydantic-ai-agent-to-a2a-server).

## FastA2A

**FastA2A** is an agentic framework agnostic implementation of the A2A protocol in Python. The library is designed to be used with any agentic framework, and is **not exclusive to Pydantic AI**.

### Design

**FastA2A** is built on top of [Starlette](https://www.starlette.io), which means it's fully compatible with any ASGI server.

Given the nature of the A2A protocol, it's important to understand the design before using it, as a developer you'll need to provide some components:

-   [`Storage`](/docs/ai/api/fasta2a/#fasta2a.Storage): to save and load tasks, as well as store context for conversations
-   [`Broker`](/docs/ai/api/fasta2a/#fasta2a.Broker): to schedule tasks
-   [`Worker`](/docs/ai/api/fasta2a/#fasta2a.Worker): to execute tasks

Let's have a look at how those components fit together:

```mermaid
flowchart TB
    Server["HTTP Server"] <--> |Sends Requests/<br>Receives Results| TM

    subgraph CC[Core Components]
        direction RL
        TM["TaskManager<br>(coordinates)"] --> |Schedules Tasks| Broker
        TM <--> Storage
        Broker["Broker<br>(queues & schedules)"] <--> Storage["Storage<br>(persistence)"]
        Broker --> |Delegates Execution| Worker
    end

    Worker["Worker<br>(implementation)"]
```

FastA2A allows you to bring your own [`Storage`](/docs/ai/api/fasta2a/#fasta2a.Storage), [`Broker`](/docs/ai/api/fasta2a/#fasta2a.Broker) and [`Worker`](/docs/ai/api/fasta2a/#fasta2a.Worker).

#### Understanding Tasks and Context

In the A2A protocol:

-   **Task**: Represents one complete execution of an agent. When a client sends a message to the agent, a new task is created. The agent runs until completion (or failure), and this entire execution is considered one task. The final output is stored as a task artifact.
    
-   **Context**: Represents a conversation thread that can span multiple tasks. The A2A protocol uses a `context_id` to maintain conversation continuity:
    
    -   When a new message is sent without a `context_id`, the server generates a new one
    -   Subsequent messages can include the same `context_id` to continue the conversation
    -   All tasks sharing the same `context_id` have access to the complete message history

#### Storage Architecture

The [`Storage`](/docs/ai/api/fasta2a/#fasta2a.Storage) component serves two purposes:

1.  **Task Storage**: Stores tasks in A2A protocol format, including their status, artifacts, and message history
2.  **Context Storage**: Stores conversation context in a format optimized for the specific agent implementation

This design allows for agents to store rich internal state (e.g., tool calls, reasoning traces) as well as store task-specific A2A-formatted messages and artifacts.

For example, a Pydantic AI agent might store its complete internal message format (including tool calls and responses) in the context storage, while storing only the A2A-compliant messages in the task history.

### Installation

FastA2A is available on PyPI as [`fasta2a`](https://pypi.org/project/fasta2a/) so installation is as simple as:

-   [pip](#tab-panel-46)
-   [uv](#tab-panel-47)

Terminal

```bash
pip install fasta2a
```

Terminal

```bash
uv add fasta2a
```

The only dependencies are:

-   [starlette](https://www.starlette.io): to expose the A2A server as an [ASGI application](https://asgi.readthedocs.io/en/latest/)
-   [pydantic](https://pydantic.dev): to validate the request/response messages
-   [opentelemetry-api](https://opentelemetry-python.readthedocs.io/en/latest): to provide tracing capabilities

You can install Pydantic AI with the `a2a` extra to include **FastA2A**:

-   [pip](#tab-panel-48)
-   [uv](#tab-panel-49)

Terminal

```bash
pip install 'pydantic-ai-slim[a2a]'
```

Terminal

```bash
uv add 'pydantic-ai-slim[a2a]'
```

### Pydantic AI Agent to A2A Server

To expose a Pydantic AI agent as an A2A server, you can use the `to_a2a` method:

agent\_to\_a2a.py

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', instructions='Be fun!')
app = agent.to_a2a()
```

Since `app` is an ASGI application, it can be used with any ASGI server.

Terminal

```bash
uvicorn agent_to_a2a:app --host 0.0.0.0 --port 8000
```

Since the goal of `to_a2a` is to be a convenience method, it accepts the same arguments as the [`FastA2A`](/docs/ai/api/fasta2a/#fasta2a.FastA2A) constructor.

When using `to_a2a()`, Pydantic AI automatically:

-   Stores the complete conversation history (including tool calls and responses) in the context storage
-   Ensures that subsequent messages with the same `context_id` have access to the full conversation history
-   Persists agent results as A2A artifacts:
    -   String results become `TextPart` artifacts and also appear in the message history
    -   Structured data (Pydantic models, dataclasses, tuples, etc.) become `DataPart` artifacts with the data wrapped as `{"result": <your_data>}`
    -   Artifacts include metadata with type information and JSON schema when available

---

# [Command Line Interface (CLI)](https://pydantic.dev/docs/ai/integrations/cli/)

# Command Line Interface (CLI)

**Pydantic AI** comes with a CLI, `clai` (pronounced "clay"). You can use it to chat with various LLMs and quickly get answers, right from the command line, or spin up a uvicorn server to chat with your Pydantic AI agents from your browser.

## Installation

You can run the `clai` using [`uvx`](https://docs.astral.sh/uv/guides/tools/):

Terminal

```bash
uvx clai
```

Or install `clai` globally [with `uv`](https://docs.astral.sh/uv/guides/tools/#installing-tools):

Terminal

```bash
uv tool install clai
...
clai
```

Or with `pip`:

Terminal

```bash
pip install clai
...
clai
```

## CLI Usage

You'll need to set an environment variable depending on the provider you intend to use.

E.g. if you're using OpenAI, set the `OPENAI_API_KEY` environment variable:

Terminal

```bash
export OPENAI_API_KEY='your-api-key-here'
```

Then running `clai` will start an interactive session where you can chat with the AI model. Special commands available in interactive mode:

-   `/exit`: Exit the session
-   `/markdown`: Show the last response in markdown format
-   `/multiline`: Toggle multiline input mode (use Ctrl+D to submit)
-   `/cp`: Copy the last response to clipboard

### CLI Options

Option

Description

`prompt`

AI prompt for one-shot mode (positional). If omitted, starts interactive mode.

`-m`, `--model`

Model to use in `provider:model` format (e.g., `openai:gpt-5.2`)

`-a`, `--agent`

Custom agent in `module:variable` format

`-t`, `--code-theme`

Syntax highlighting theme (`dark`, `light`, or [pygments theme](https://pygments.org/styles/))

`--no-stream`

Disable streaming from the model

`-l`, `--list-models`

List all available models and exit

`--version`

Show version and exit

### Choose a model

You can specify which model to use with the `--model` flag:

Terminal

```bash
clai --model anthropic:claude-sonnet-4-6
```

(a full list of models available can be printed with `clai --list-models`)

### Custom Agents

You can specify a custom agent using the `--agent` flag with a module path and variable name:

custom\_agent.py

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', instructions='You always respond in Italian.')
```

Then run:

Terminal

```bash
clai --agent custom_agent:agent "What's the weather today?"
```

The format must be `module:variable` where:

-   `module` is the importable Python module path
-   `variable` is the name of the Agent instance in that module

Additionally, you can directly launch CLI mode from an `Agent` instance using `Agent.to_cli_sync()`:

agent\_to\_cli\_sync.py

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', instructions='You always respond in Italian.')
agent.to_cli_sync()
```

You can also use the async interface with `Agent.to_cli()`:

agent\_to\_cli.py

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', instructions='You always respond in Italian.')

async def main():
    await agent.to_cli()
```

_(You'll need to add `asyncio.run(main())` to run `main`)_

### Message History

Both `Agent.to_cli()` and `Agent.to_cli_sync()` support a `message_history` parameter, allowing you to continue an existing conversation or provide conversation context:

agent\_with\_history.py

```python
from pydantic_ai import (
    Agent,
    ModelMessage,
    ModelRequest,
    ModelResponse,
    TextPart,
    UserPromptPart,
)

agent = Agent('openai:gpt-5.2')

# Create some conversation history
message_history: list[ModelMessage] = [
    ModelRequest([UserPromptPart(content='What is 2+2?')]),
    ModelResponse([TextPart(content='2+2 equals 4.')])
]

# Start CLI with existing conversation context
agent.to_cli_sync(message_history=message_history)
```

The CLI will start with the provided conversation history, allowing the agent to refer back to previous exchanges and maintain context throughout the session.

## Web Chat UI

Launch a web-based chat interface by running:

Terminal

```bash
clai web -m openai:gpt-5.2
```

This will start a web server (default: [http://127.0.0.1:7932](http://127.0.0.1:7932)) with a chat interface.

You can also serve an existing agent. For example, if you have an agent defined in `my_agent.py`:

```python
from pydantic_ai import Agent

my_agent = Agent('openai:gpt-5.2', instructions='You are a helpful assistant.')
```

Launch the web UI:

Terminal

```bash
# With a custom agent
clai web --agent my_module:my_agent

# With specific models (first is default when no --agent)
clai web -m openai:gpt-5.2 -m anthropic:claude-sonnet-4-6

# With builtin tools
clai web -m openai:gpt-5.2 -t web_search -t code_execution

# Generic agent with system instructions
clai web -m openai:gpt-5.2 -i 'You are a helpful coding assistant'

# Custom agent with extra instructions for each run
clai web --agent my_module:my_agent -i 'Always respond in Spanish'
```

Memory Tool

The [`memory`](/docs/ai/tools-toolsets/builtin-tools#memory-tool) builtin tool cannot be enabled via `-t memory`. If your agent needs memory, configure the [`MemoryTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.MemoryTool) directly on the agent and provide it via `--agent`.

### Web UI Options

Option

Description

`--agent`, `-a`

Agent to serve in [`module:variable` format](#custom-agents)

`--model`, `-m`

Models to list as options in the UI (repeatable)

`--tool`, `-t`

[Builtin tool](/docs/ai/tools-toolsets/builtin-tools)s to list as options in the UI (repeatable). See [available tools](/docs/ai/guides/web#builtin-tool-support).

`--instructions`, `-i`

System instructions. When `--agent` is specified, these are additional to the agent's existing instructions.

`--host`

Host to bind server (default: 127.0.0.1)

`--port`

Port to bind server (default: 7932)

`--html-source`

URL or file path for the chat UI HTML.

When using `--agent`, the agent's configured model becomes the default. CLI models (`-m`) are additional options. Without `--agent`, the first `-m` model is the default.

The web chat UI can also be launched programmatically using [`Agent.to_web()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.to_web), see the [Web UI documentation](/docs/ai/guides/web).

Run the `web` command with `--help` to see all available options:

Terminal

```bash
clai web --help
```

---

# [DBOS](https://pydantic.dev/docs/ai/integrations/durable_execution/dbos/)

# DBOS

[DBOS](https://www.dbos.dev/) is a lightweight [durable execution](https://docs.dbos.dev/architecture) library natively integrated with Pydantic AI.

## Durable Execution

DBOS workflows make your program **durable** by checkpointing its state in a database. If your program ever fails, when it restarts all your workflows will automatically resume from the last completed step.

-   **Workflows** must be deterministic and generally cannot include I/O.
-   **Steps** may perform I/O (network, disk, API calls). If a step fails, it restarts from the beginning.

Every workflow input and step output is durably stored in the system database. When workflow execution fails, whether from crashes, network issues, or server restarts, DBOS leverages these checkpoints to recover workflows from their last completed step.

DBOS **queues** provide durable, database-backed alternatives to systems like Celery or BullMQ, supporting features such as concurrency limits, rate limits, timeouts, and prioritization. See the [DBOS docs](https://docs.dbos.dev/architecture) for details.

The diagram below shows the overall architecture of an agentic application in DBOS. DBOS runs fully in-process as a library. Functions remain normal Python functions but are checkpointed into a database (Postgres or SQLite).

```text
                    Clients
            (HTTP, RPC, Kafka, etc.)
                        |
                        v
+------------------------------------------------------+
|               Application Servers                    |
|                                                      |
|   +----------------------------------------------+   |
|   |        Pydantic AI + DBOS Libraries          |   |
|   |                                              |   |
|   |  [ Workflows (Agent Run Loop) ]              |   |
|   |  [ Steps (Tool, MCP, Model) ]                |   |
|   |  [ Queues ]   [ Cron Jobs ]   [ Messaging ]  |   |
|   +----------------------------------------------+   |
|                                                      |
+------------------------------------------------------+
                        |
                        v
+------------------------------------------------------+
|                      Database                        |
|   (Stores workflow and step state, schedules tasks)  |
+------------------------------------------------------+
```

See the [DBOS documentation](https://docs.dbos.dev/architecture) for more information.

## Durable Agent

Any agent can be wrapped in a [`DBOSAgent`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.dbos.DBOSAgent) to get durable execution. `DBOSAgent` automatically:,

-   Wraps `Agent.run` and `Agent.run_sync` as DBOS workflows.
-   Wraps [model requests](/docs/ai/models/overview) and [MCP communication](/docs/ai/mcp/client) as DBOS steps.

Custom tool functions and event stream handlers are **not automatically wrapped** by DBOS. If they involve non-deterministic behavior or perform I/O, you should explicitly decorate them with `@DBOS.step`.

The original agent, model, and MCP server can still be used as normal outside the DBOS workflow.

Here is a simple but complete example of wrapping an agent for durable execution. All it requires is to install Pydantic AI with the DBOS [open-source library](https://github.com/dbos-inc/dbos-transact-py):

-   [pip](#tab-panel-170)
-   [uv](#tab-panel-171)

Terminal

```bash
pip install pydantic-ai[dbos]
```

Terminal

```bash
uv add pydantic-ai[dbos]
```

Or if you're using the slim package, you can install it with the `dbos` optional group:

-   [pip](#tab-panel-172)
-   [uv](#tab-panel-173)

Terminal

```bash
pip install pydantic-ai-slim[dbos]
```

Terminal

```bash
uv add pydantic-ai-slim[dbos]
```

dbos\_agent.py

```python
from dbos import DBOS, DBOSConfig

from pydantic_ai import Agent
from pydantic_ai.durable_exec.dbos import DBOSAgent

dbos_config: DBOSConfig = {
  'name': 'pydantic_dbos_agent',
  'system_database_url': 'sqlite:///dbostest.sqlite',  # (3)
}
DBOS(config=dbos_config)

agent = Agent(
  'gpt-5.2',
  instructions="You're an expert in geography.",
  name='geography',  # (4)
)

dbos_agent = DBOSAgent(agent)  # (1)

async def main():
  DBOS.launch()
  result = await dbos_agent.run('What is the capital of Mexico?')  # (2)
  print(result.output)
  #> Mexico City (Ciudad de México, CDMX)
```

Workflows and `DBOSAgent` must be defined before `DBOS.launch()` so that recovery can correctly find all workflows.

[`DBOSAgent.run()`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.dbos.DBOSAgent.run) works like `Agent.run()`, but runs as a DBOS workflow and executes model requests, decorated tool calls, and MCP communication as DBOS steps.

This example uses SQLite. Postgres is recommended for production.

The agent's `name` is used to uniquely identify its workflows.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

Because DBOS workflows need to be defined before calling `DBOS.launch()` and the `DBOSAgent` instance automatically registers `run` and `run_sync` as workflows, it needs to be defined before calling `DBOS.launch()` as well.

For more information on how to use DBOS in Python applications, see their [Python SDK guide](https://docs.dbos.dev/python/programming-guide).

## DBOS Integration Considerations

When using DBOS with Pydantic AI agents, there are a few important considerations to ensure workflows and toolsets behave correctly.

### Agent and Toolset Requirements

Each agent instance must have a unique `name` so DBOS can correctly resume workflows after a failure or restart.

Tools and event stream handlers are not automatically wrapped by DBOS. You can decide how to integrate them:

-   Decorate with `@DBOS.step` if the function involves non-determinism or I/O.
-   Skip the decorator if durability isn't needed, so you avoid the extra DB checkpoint write.
-   If the function needs to enqueue tasks or invoke other DBOS workflows, run it inside the agent's main workflow (not as a step).

Other than that, any agent and toolset will just work!

### Agent Run Context and Dependencies

DBOS checkpoints workflow inputs/outputs and step outputs into a database using [`pickle`](https://docs.python.org/3/library/pickle.html). This means you need to make sure [dependencies](/docs/ai/core-concepts/dependencies) object provided to [`DBOSAgent.run()`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.dbos.DBOSAgent.run) or [`DBOSAgent.run_sync()`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.dbos.DBOSAgent.run_sync), and tool outputs can be serialized using pickle. You may also want to keep the inputs and outputs small (under ~2 MB). PostgreSQL and SQLite support up to 1 GB per field, but large objects may impact performance.

### Streaming

Because DBOS cannot stream output directly to the workflow or step call site, `Agent.run_stream()` and `Agent.run_stream_events()` are not supported when running inside of a DBOS workflow.

Instead, you can implement streaming by setting an [`event_stream_handler`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.EventStreamHandler) on the `Agent` or `DBOSAgent` instance and using [`DBOSAgent.run()`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.dbos.DBOSAgent.run). The event stream handler function will receive the agent [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and an async iterable of events from the model's streaming response and the agent's execution of tools. For examples, see the [streaming docs](/docs/ai/core-concepts/agent#streaming-all-events).

### Parallel Tool Execution

When using `DBOSAgent`, tools are executed in parallel by default to minimize latency. To guarantee deterministic replay and reliable recovery, DBOS waits for all parallel tool calls to complete before emitting events **in order**. It's equivalent to the behavior of [`with agent.parallel_tool_call_execution_mode('parallel_ordered_events')`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.parallel_tool_call_execution_mode).

If you prefer strict ordering, you can configure the agent to run tools sequentially by setting [`parallel_execution_mode='sequential'`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.dbos.DBOSAgent) when initializing the `DBOSAgent`.

## Step Configuration

You can customize DBOS step behavior, such as retries, by passing [`StepConfig`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.dbos.StepConfig) objects to the `DBOSAgent` constructor:

-   `mcp_step_config`: The DBOS step config to use for MCP server communication. No retries if omitted.
-   `model_step_config`: The DBOS step config to use for model request steps. No retries if omitted.

For custom tools, you can annotate them directly with [`@DBOS.step`](https://docs.dbos.dev/python/reference/decorators#step) or [`@DBOS.workflow`](https://docs.dbos.dev/python/reference/decorators#workflow) decorators as needed. These decorators have no effect outside DBOS workflows, so tools remain usable in non-DBOS agents.

## Step Retries

On top of the automatic retries for request failures that DBOS will perform, Pydantic AI and various provider API clients also have their own request retry logic. Enabling these at the same time may cause the request to be retried more often than expected, with improper `Retry-After` handling.

When using DBOS, it's recommended to not use [HTTP Request Retries](/docs/ai/advanced-features/retries) and to turn off your provider API client's own retry logic, for example by setting `max_retries=0` on a [custom `OpenAIProvider` API client](/docs/ai/models/openai#custom-openai-client).

You can customize DBOS's retry policy using [step configuration](#step-configuration).

## Observability with Logfire

DBOS can be configured to generate OpenTelemetry spans for each workflow and step execution, and Pydantic AI emits spans for each agent run, model request, and tool invocation. You can send these spans to [Pydantic Logfire](/docs/ai/integrations/logfire) to get a full, end-to-end view of what's happening in your application.

For more information about DBOS logging and tracing, please see the [DBOS docs](https://docs.dbos.dev/python/tutorials/logging-and-tracing) for details.

---

# [Overview](https://pydantic.dev/docs/ai/integrations/durable_execution/overview/)

# Overview

Pydantic AI allows you to build durable agents that can preserve their progress across transient API failures and application errors or restarts, and handle long-running, asynchronous, and human-in-the-loop workflows with production-grade reliability. Durable agents have full support for [streaming](/docs/ai/core-concepts/agent#streaming-all-events) and [MCP](/docs/ai/mcp/client), with the added benefit of fault tolerance.

Pydantic AI officially supports four durable execution solutions:

-   [Temporal](/docs/ai/integrations/durable_execution/temporal)
-   [DBOS](/docs/ai/integrations/durable_execution/dbos)
-   [Prefect](/docs/ai/integrations/durable_execution/prefect)
-   [Restate](/docs/ai/integrations/durable_execution/restate)

These integrations are co-maintained by the Pydantic and vendor teams and only use Pydantic AI's public interface, so they also serve as a reference for integrating with other durable systems.

---

# [Prefect](https://pydantic.dev/docs/ai/integrations/durable_execution/prefect/)

# Prefect

[Prefect](https://www.prefect.io/) is a workflow orchestration framework for building resilient data pipelines in Python, natively integrated with Pydantic AI.

## Durable Execution

Prefect 3.0 brings [transactional semantics](https://www.prefect.io/blog/transactional-ml-pipelines-with-prefect-3-0) to your Python workflows, allowing you to group tasks into atomic units and define failure modes. If any part of a transaction fails, the entire transaction can be rolled back to a clean state.

-   **Flows** are the top-level entry points for your workflow. They can contain tasks and other flows.
-   **Tasks** are individual units of work that can be retried, cached, and monitored independently.

Prefect 3.0's approach to transactional orchestration makes your workflows automatically **idempotent**: rerunnable without duplication or inconsistency across any environment. Every task is executed within a transaction that governs when and where the task's result record is persisted. If the task runs again under an identical context, it will not re-execute but instead load its previous result.

The diagram below shows the overall architecture of an agentic application with Prefect. Prefect uses client-side task orchestration by default, with optional server connectivity for advanced features like scheduling and monitoring.

```text
            +---------------------+
            |   Prefect Server    |      (Monitoring,
            |      or Cloud       |       scheduling, UI,
            +---------------------+       orchestration)
                     ^
                     |
        Flow state,  |   Schedule flows,
        metadata,    |   track execution
        logs         |
                     |
+------------------------------------------------------+
|               Application Process                    |
|   +----------------------------------------------+   |
|   |              Flow (Agent.run)                |   |
|   +----------------------------------------------+   |
|          |          |                |               |
|          v          v                v               |
|   +-----------+ +------------+ +-------------+       |
|   |   Task    | |    Task    | |    Task     |       |
|   |  (Tool)   | | (MCP Tool) | | (Model API) |       |
|   +-----------+ +------------+ +-------------+       |
|         |           |                |               |
|       Cache &     Cache &          Cache &           |
|       persist     persist          persist           |
|         to           to               to             |
|         v            v                v              |
|   +----------------------------------------------+   |
|   |     Result Storage (Local FS, S3, etc.)     |    |
|   +----------------------------------------------+   |
+------------------------------------------------------+
          |           |                |
          v           v                v
      [External APIs, services, databases, etc.]
```

See the [Prefect documentation](https://docs.prefect.io/) for more information.

## Durable Agent

Any agent can be wrapped in a [`PrefectAgent`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.prefect.PrefectAgent) to get durable execution. `PrefectAgent` automatically:

-   Wraps `Agent.run` and `Agent.run_sync` as Prefect flows.
-   Wraps [model requests](/docs/ai/models/overview) as Prefect tasks.
-   Wraps [tool calls](/docs/ai/tools-toolsets/tools) as Prefect tasks (configurable per-tool).
-   Wraps [MCP communication](/docs/ai/mcp/client) as Prefect tasks.

Event stream handlers are **automatically wrapped** by Prefect when running inside a Prefect flow. Each event from the stream is processed in a separate Prefect task for durability. You can customize the task behavior using the `event_stream_handler_task_config` parameter when creating the `PrefectAgent`. Do **not** manually decorate event stream handlers with `@task`. For examples, see the [streaming docs](/docs/ai/core-concepts/agent#streaming-all-events)

The original agent, model, and MCP server can still be used as normal outside the Prefect flow.

Here is a simple but complete example of wrapping an agent for durable execution. All it requires is to install Pydantic AI with Prefect:

-   [pip](#tab-panel-174)
-   [uv](#tab-panel-175)

Terminal

```bash
pip install pydantic-ai[prefect]
```

Terminal

```bash
uv add pydantic-ai[prefect]
```

Or if you're using the slim package, you can install it with the `prefect` optional group:

-   [pip](#tab-panel-176)
-   [uv](#tab-panel-177)

Terminal

```bash
pip install pydantic-ai-slim[prefect]
```

Terminal

```bash
uv add pydantic-ai-slim[prefect]
```

prefect\_agent.py

```python
from pydantic_ai import Agent
from pydantic_ai.durable_exec.prefect import PrefectAgent

agent = Agent(
  'gpt-5.2',
  instructions="You're an expert in geography.",
  name='geography',  # (1)
)

prefect_agent = PrefectAgent(agent)  # (2)

async def main():
  result = await prefect_agent.run('What is the capital of Mexico?')  # (3)
  print(result.output)
  #> Mexico City (Ciudad de México, CDMX)
```

The agent's `name` is used to uniquely identify its flows and tasks.

Wrapping the agent with `PrefectAgent` enables durable execution for all agent runs.

[`PrefectAgent.run()`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.prefect.PrefectAgent.run) works like `Agent.run()`, but runs as a Prefect flow and executes model requests, decorated tool calls, and MCP communication as Prefect tasks.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

For more information on how to use Prefect in Python applications, see their [Python documentation](https://docs.prefect.io/v3/how-to-guides/workflows/write-and-run).

## Prefect Integration Considerations

When using Prefect with Pydantic AI agents, there are a few important considerations to ensure workflows behave correctly.

### Agent Requirements

Each agent instance must have a unique `name` so Prefect can correctly identify and track its flows and tasks.

### Tool Wrapping

Agent tools are automatically wrapped as Prefect tasks, which means they benefit from:

-   **Retry logic**: Failed tool calls can be retried automatically
-   **Caching**: Tool results are cached based on their inputs
-   **Observability**: Tool execution is tracked in the Prefect UI

You can customize tool task behavior using `tool_task_config` (applies to all tools) or `tool_task_config_by_name` (per-tool configuration):

prefect\_agent\_config.py

```python
from pydantic_ai import Agent
from pydantic_ai.durable_exec.prefect import PrefectAgent, TaskConfig

agent = Agent('gpt-5.2', name='my_agent')

@agent.tool_plain
def fetch_data(url: str) -> str:
    # This tool will be wrapped as a Prefect task
    ...

prefect_agent = PrefectAgent(
    agent,
    tool_task_config=TaskConfig(retries=3),  # Default for all tools
    tool_task_config_by_name={
        'fetch_data': TaskConfig(timeout_seconds=10.0),  # Specific to fetch_data
        'simple_tool': None,  # Disable task wrapping for simple_tool
    },
)
```

Set a tool's config to `None` in `tool_task_config_by_name` to disable task wrapping for that specific tool.

### Streaming

When running inside a Prefect flow, `Agent.run_stream()` works but doesn't provide real-time streaming because Prefect tasks consume their entire execution before returning results. The method will execute fully and return the complete result at once.

For real-time streaming behavior inside Prefect flows, you can set an [`event_stream_handler`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.EventStreamHandler) on the `Agent` or `PrefectAgent` instance and use [`PrefectAgent.run()`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.prefect.PrefectAgent.run).

**Note**: Event stream handlers behave differently when running inside a Prefect flow versus outside:

-   **Outside a flow**: The handler receives events as they stream from the model
-   **Inside a flow**: Each event is wrapped as a Prefect task for durability, which may affect timing but ensures reliability

The event stream handler function will receive the agent [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and an async iterable of events from the model's streaming response and the agent's execution of tools. For examples, see the [streaming docs](/docs/ai/core-concepts/agent#streaming-all-events).

## Task Configuration

You can customize Prefect task behavior, such as retries and timeouts, by passing [`TaskConfig`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.prefect.TaskConfig) objects to the `PrefectAgent` constructor:

-   `mcp_task_config`: Configuration for MCP server communication tasks
-   `model_task_config`: Configuration for model request tasks
-   `tool_task_config`: Default configuration for all tool calls
-   `tool_task_config_by_name`: Per-tool task configuration (overrides `tool_task_config`)
-   `event_stream_handler_task_config`: Configuration for event stream handler tasks (applies when running inside a Prefect flow)

Available `TaskConfig` options:

-   `retries`: Maximum number of retries for the task (default: `0`)
-   `retry_delay_seconds`: Delay between retries in seconds (can be a single value or list for exponential backoff, default: `1.0`)
-   `timeout_seconds`: Maximum time in seconds for the task to complete
-   `cache_policy`: Custom Prefect cache policy for the task
-   `persist_result`: Whether to persist the task result
-   `result_storage`: Prefect result storage for the task (e.g., `'s3-bucket/my-storage'` or a `WritableFileSystem` block)
-   `log_prints`: Whether to log print statements from the task (default: `False`)

Example:

prefect\_agent\_config.py

```python
from pydantic_ai import Agent
from pydantic_ai.durable_exec.prefect import PrefectAgent, TaskConfig

agent = Agent(
    'gpt-5.2',
    instructions="You're an expert in geography.",
    name='geography',
)

prefect_agent = PrefectAgent(
    agent,
    model_task_config=TaskConfig(
        retries=3,
        retry_delay_seconds=[1.0, 2.0, 4.0],  # Exponential backoff
        timeout_seconds=30.0,
    ),
)

async def main():
    result = await prefect_agent.run('What is the capital of France?')
    print(result.output)
    #> Paris
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

### Retry Considerations

Pydantic AI and provider API clients have their own retry logic. When using Prefect, you may want to:

-   Disable [HTTP Request Retries](/docs/ai/advanced-features/retries) in Pydantic AI
-   Turn off your provider API client's retry logic (e.g., `max_retries=0` on a [custom OpenAI client](/docs/ai/models/openai#custom-openai-client))
-   Rely on Prefect's task-level retry configuration for consistency

This prevents requests from being retried multiple times at different layers.

## Caching and Idempotency

Prefect 3.0 provides built-in caching and transactional semantics. Tasks with identical inputs will not re-execute if their results are already cached, making workflows naturally idempotent and resilient to failures.

-   **Task inputs**: Messages, settings, parameters, tool arguments, and serializable dependencies

**Note**: For user dependencies to be included in cache keys, they must be serializable (e.g., Pydantic models or basic Python types). Non-serializable dependencies are automatically excluded from cache computation.

## Observability with Prefect and Logfire

Prefect provides a built-in UI for monitoring flow runs, task executions, and failures. You can:

-   View real-time flow run status
-   Debug failures with full stack traces
-   Set up alerts and notifications

To access the Prefect UI, you can either:

1.  Use [Prefect Cloud](https://www.prefect.io/cloud) (managed service)
2.  Run a local [Prefect server](https://docs.prefect.io/v3/how-to-guides/self-hosted/server-cli) with `prefect server start`

You can also use [Pydantic Logfire](/docs/ai/integrations/logfire) for detailed observability. When using both Prefect and Logfire, you'll get complementary views:

-   **Prefect**: Workflow-level orchestration, task status, and retry history
-   **Logfire**: Fine-grained tracing of agent runs, model requests, and tool invocations

When using Logfire with Prefect, you can enable distributed tracing to see spans for your Prefect runs included with your agent runs, model requests, and tool invocations.

For more information about Prefect monitoring, see the [Prefect documentation](https://docs.prefect.io/).

## Deployments and Scheduling

To deploy and schedule a `PrefectAgent`, wrap it in a Prefect flow and use the flow's [`serve()`](https://docs.prefect.io/v3/how-to-guides/deployments/create-deployments#create-a-deployment-with-serve) or [`deploy()`](https://docs.prefect.io/v3/how-to-guides/deployments/deploy-via-python) methods:

serve\_agent.py

```python
from prefect import flow

from pydantic_ai import Agent
from pydantic_ai.durable_exec.prefect import PrefectAgent


@flow
async def daily_report_flow(user_prompt: str):
  """Generate a daily report using the agent."""
  agent = Agent(  # (1)
      'openai:gpt-5.2',
      name='daily_report_agent',
      instructions='Generate a daily summary report.',
  )

  prefect_agent = PrefectAgent(agent)

  result = await prefect_agent.run(user_prompt)
  return result.output


# Serve the flow with a daily schedule
if __name__ == '__main__':
  daily_report_flow.serve(
      name='daily-report-deployment',
      cron='0 9 * * *',  # Run daily at 9am
      parameters={'user_prompt': "Generate today's report"},
      tags=['production', 'reports'],
  )
```

Each flow run executes in an isolated process, and all inputs and dependencies must be serializable. Because Agent instances cannot be serialized, instantiate the agent inside the flow rather than at the module level.

The `serve()` method accepts scheduling options:

-   **`cron`**: Cron schedule string (e.g., `'0 9 * * *'` for daily at 9am)
-   **`interval`**: Schedule interval in seconds or as a timedelta
-   **`rrule`**: iCalendar RRule schedule string

For production deployments with Docker, Kubernetes, or other infrastructure, use the flow's [`deploy()`](https://docs.prefect.io/v3/how-to-guides/deployments/deploy-via-python) method. See the [Prefect deployment documentation](https://docs.prefect.io/v3/how-to-guides/deployments/create-deploymentsy) for more information.

---

# [Restate](https://pydantic.dev/docs/ai/integrations/durable_execution/restate/)

# Restate

[Restate](https://restate.dev) is a lightweight durable execution runtime with first-class support for AI agents. The Pydantic AI integration is provided via the [Restate Python SDK](https://github.com/restatedev/sdk-python/tree/main/python/restate/ext/pydantic).

Visit the [Restate documentation](https://docs.restate.dev/ai/patterns/durable-agents) for more information.

## Durable Execution

Restate makes your agent **durable** by recording every step of its execution in a journal. If your process crashes mid-execution, Restate replays the journal, skips completed steps, and resumes from exactly where it left off.

Your agent runs in a regular HTTP handler inside a Restate **service**. The Restate Server sits in front of your application and manages orchestration, journaling, and retries. Services run like regular Docker containers or serverless functions.

A durable agent has three building blocks:

1.  The **handler**: your agent logic, exposed as an HTTP endpoint in a Restate service.
2.  **LLM calls**: persisted so responses are not re-fetched on recovery -- saving cost and time.
3.  **Tool executions**: wrapped in durable steps so side effects are not duplicated.

```text
                  Clients
              (HTTP, Kafka, etc.)
                     |
                     v
            +---------------------+
            |   Restate Server    |      (Journals execution,
            +---------------------+       retries on failure,
                     ^                    manages state)
                     |
        Journal      |   Replay on
        steps,       |   recovery,
        retries      |   schedule calls
                     v
+------------------------------------------------------+
|               Application Process                    |
|   +----------------------------------------------+   |
|   |         Restate Service Handler              |   |
|   |           (Agent Run Loop)                   |   |
|   |    [ Durable Steps (Tool, MCP, Model) ]      |   |
|   +----------------------------------------------+   |
|         |           |                |               |
+------------------------------------------------------+
          |           |                |
          v           v                v
      [External APIs, services, databases, etc.]
```

See the [Restate documentation](https://docs.restate.dev/ai/patterns/durable-agents) for more information.

## Durable Agent

Any Pydantic AI agent can be made durable by wrapping it with `RestateAgent` from the Restate SDK and running it inside a Restate service handler.

Install the Restate SDK:

-   [pip](#tab-panel-178)
-   [uv](#tab-panel-179)

Terminal

```bash
pip install pydantic-ai "restate_sdk[serde]"
```

Terminal

```bash
uv add pydantic-ai "restate_sdk[serde]"
```

Here is a complete example of a durable Pydantic AI agent with Restate:

restate\_agent.py

```python
import restate
from pydantic_ai import Agent, RunContext
from restate.ext.pydantic import RestateAgent, restate_context

weather_agent = Agent(  # (1)
  'openai:gpt-5.2',
  system_prompt='You are a helpful agent that provides weather updates.',
)


@weather_agent.tool()
async def get_weather(_run_ctx: RunContext[None], city: str) -> dict:
  """Get the current weather for a given city."""

  # Do durable tool steps using the Restate context
  async def call_weather_api(city: str) -> dict:
      return {'temperature': 23, 'description': 'Sunny and warm.'}

  return await restate_context().run_typed(  # (2)
      f'Get weather {city}', call_weather_api, city=city
  )


restate_agent = RestateAgent(weather_agent)  # (3)

agent_service = restate.Service('WeatherAgent')


@agent_service.handler()
async def run(_ctx: restate.Context, prompt: str) -> str:  # (4)
  result = await restate_agent.run(prompt)
  return result.output


app = restate.app(services=[agent_service])  # (5)

if __name__ == "__main__":  # (6)
  import hypercorn
  import asyncio
  conf = hypercorn.Config()
  conf.bind = ["0.0.0.0:9080"]
  asyncio.run(hypercorn.asyncio.serve(app, conf))
```

Define your agent and tools as you normally would with Pydantic AI.

Use `restate_context()` actions inside tools to make their execution durable. The result is persisted and retried until it succeeds. Side effects won't be duplicated on recovery.

`RestateAgent` wraps the agent so every LLM response is saved in the Restate Server and replayed during recovery.

The Restate service handler gives the agent a durable execution context and exposes it as an HTTP endpoint.

`restate.app()` creates the application that can be served.

Run the application with an ASGI server like Hypercorn.

See the [Restate agent quickstart](https://docs.restate.dev/ai-quickstart) to learn how to run the agent.

---

# [Temporal](https://pydantic.dev/docs/ai/integrations/durable_execution/temporal/)

# Temporal

[Temporal](https://temporal.io) is a popular [durable execution](https://docs.temporal.io/evaluate/understanding-temporal#durable-execution) platform that's natively supported by Pydantic AI.

## Durable Execution

In Temporal's durable execution implementation, a program that crashes or encounters an exception while interacting with a model or API will retry until it can successfully complete.

Temporal relies primarily on a replay mechanism to recover from failures. As the program makes progress, Temporal saves key inputs and decisions, allowing a re-started program to pick up right where it left off.

The key to making this work is to separate the application's repeatable (deterministic) and non-repeatable (non-deterministic) parts:

1.  Deterministic pieces, termed [**workflows**](https://docs.temporal.io/workflow-definition), execute the same way when re-run with the same inputs.
2.  Non-deterministic pieces, termed [**activities**](https://docs.temporal.io/activities), can run arbitrary code, performing I/O and any other operations.

Workflow code can run for extended periods and, if interrupted, resume exactly where it left off. Critically, workflow code generally _cannot_ include any kind of I/O, over the network, disk, etc. Activity code faces no restrictions on I/O or external interactions, but if an activity fails part-way through it is restarted from the beginning.

Note

If you are familiar with celery, it may be helpful to think of Temporal activities as similar to celery tasks, but where you wait for the task to complete and obtain its result before proceeding to the next step in the workflow. However, Temporal workflows and activities offer a great deal more flexibility and functionality than celery tasks.

See the [Temporal documentation](https://docs.temporal.io/evaluate/understanding-temporal#temporal-application-the-building-blocks) for more information

In the case of Pydantic AI agents, integration with Temporal means that [model requests](/docs/ai/models/overview), [tool calls](/docs/ai/tools-toolsets/tools) that may require I/O, and [MCP server communication](/docs/ai/mcp/client) all need to be offloaded to Temporal activities due to their I/O requirements, while the logic that coordinates them (i.e. the agent run) lives in the workflow. Code that handles a scheduled job or web request can then execute the workflow, which will in turn execute the activities as needed.

The diagram below shows the overall architecture of an agentic application in Temporal. The Temporal Server is responsible for tracking program execution and making sure the associated state is preserved reliably (i.e., stored to an internal database, and possibly replicated across cloud regions). Temporal Server manages data in encrypted form, so all data processing occurs on the Worker, which runs the workflow and activities.

```text
            +---------------------+
            |   Temporal Server   |      (Stores workflow state,
            +---------------------+       schedules activities,
                     ^                    persists progress)
                     |
        Save state,  |   Schedule Tasks,
        progress,    |   load state on resume
        timeouts     |
                     |
+------------------------------------------------------+
|                      Worker                          |
|   +----------------------------------------------+   |
|   |              Workflow Code                   |   |
|   |       (Agent Run Loop)                       |   |
|   +----------------------------------------------+   |
|          |          |                |               |
|          v          v                v               |
|   +-----------+ +------------+ +-------------+       |
|   | Activity  | | Activity   | |  Activity   |       |
|   | (Tool)    | | (MCP Tool) | | (Model API) |       |
|   +-----------+ +------------+ +-------------+       |
|         |           |                |               |
+------------------------------------------------------+
          |           |                |
          v           v                v
      [External APIs, services, databases, etc.]
```

See the [Temporal documentation](https://docs.temporal.io/evaluate/understanding-temporal#temporal-application-the-building-blocks) for more information.

## Durable Agent

Any agent can be wrapped in a [`TemporalAgent`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.TemporalAgent) to get a durable agent that can be used inside a deterministic Temporal workflow, by automatically offloading all work that requires I/O (namely model requests, tool calls, and MCP server communication) to non-deterministic activities.

At the time of wrapping, the agent's [model](/docs/ai/models/overview) and [toolsets](/docs/ai/tools-toolsets/toolsets) (including function tools registered on the agent and MCP servers) are frozen, activities are dynamically created for each, and the original model and toolsets are wrapped to call on the worker to execute the corresponding activities instead of directly performing the actions inside the workflow. The original agent can still be used as normal outside the Temporal workflow, but any changes to its model or toolsets after wrapping will not be reflected in the durable agent.

Here is a simple but complete example of wrapping an agent for durable execution, creating a Temporal workflow with durable execution logic, connecting to a Temporal server, and running the workflow from non-durable code. All it requires is a Temporal server to be [running locally](https://github.com/temporalio/temporal#download-and-start-temporal-server-locally):

Terminal

```sh
brew install temporal
temporal server start-dev
```

temporal\_agent.py

```python
import uuid

from temporalio import workflow
from temporalio.client import Client
from temporalio.worker import Worker

from pydantic_ai import Agent
from pydantic_ai.durable_exec.temporal import (
  PydanticAIPlugin,
  PydanticAIWorkflow,
  TemporalAgent,
)

agent = Agent(
  'openai:gpt-5.2',
  instructions="You're an expert in geography.",
  name='geography',  # (10)
)

temporal_agent = TemporalAgent(agent)  # (1)


@workflow.defn
class GeographyWorkflow(PydanticAIWorkflow):  # (2)
  __pydantic_ai_agents__ = [temporal_agent]  # (3)

  @workflow.run
  async def run(self, prompt: str) -> str:
      result = await temporal_agent.run(prompt)  # (4)
      return result.output


async def main():
  client = await Client.connect(  # (5)
      'localhost:7233',  # (6)
      plugins=[PydanticAIPlugin()],  # (7)
  )

  async with Worker(  # (8)
      client,
      task_queue='geography',
      workflows=[GeographyWorkflow],
  ):
      output = await client.execute_workflow(  # (10)
          GeographyWorkflow.run,
          args=['What is the capital of Mexico?'],
          id=f'geography-{uuid.uuid4()}',
          task_queue='geography',
      )
      print(output)
      #> Mexico City (Ciudad de México, CDMX)
```

The original `Agent` cannot be used inside a deterministic Temporal workflow, but the `TemporalAgent` can.

As explained above, the workflow represents a deterministic piece of code that can use non-deterministic activities for operations that require I/O. Subclassing [`PydanticAIWorkflow`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.PydanticAIWorkflow) is optional but provides proper typing for the `__pydantic_ai_agents__` class variable.

List the `TemporalAgent`s used by this workflow. The [`PydanticAIPlugin`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.PydanticAIPlugin) will automatically register their activities with the worker. Alternatively, if modifying the worker initialization is easier than the workflow class, you can use [`AgentPlugin`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.AgentPlugin) to register agents directly on the worker.

[`TemporalAgent.run()`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.TemporalAgent.run) works just like `Agent.run()`, but it will automatically offload model requests, tool calls, and MCP server communication to Temporal activities.

We connect to the Temporal server which keeps track of workflow and activity execution.

This assumes the Temporal server is [running locally](https://github.com/temporalio/temporal#download-and-start-temporal-server-locally).

The [`PydanticAIPlugin`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.PydanticAIPlugin) tells Temporal to use Pydantic for serialization and deserialization, treats [`UserError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UserError) exceptions as non-retryable, and automatically registers activities for agents listed in `__pydantic_ai_agents__`.

We start the worker that will listen on the specified task queue and run workflows and activities. In a real world application, this might be run in a separate service.

The agent's `name` is used to uniquely identify its activities.

We call on the server to execute the workflow on a worker that's listening on the specified task queue.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

In a real world application, the agent, workflow, and worker are typically defined separately from the code that calls for a workflow to be executed. Because Temporal workflows need to be defined at the top level of the file and the `TemporalAgent` instance is needed inside the workflow and when starting the worker (to register the activities), it needs to be defined at the top level of the file as well.

For more information on how to use Temporal in Python applications, see their [Python SDK guide](https://docs.temporal.io/develop/python).

## Temporal Integration Considerations

There are a few considerations specific to agents and toolsets when using Temporal for durable execution. These are important to understand to ensure that your agents and toolsets work correctly with Temporal's workflow and activity model.

### Agent Names and Toolset IDs

To ensure that Temporal knows what code to run when an activity fails or is interrupted and then restarted, even if your code is changed in between, each activity needs to have a name that's stable and unique.

When `TemporalAgent` dynamically creates activities for the wrapped agent's model requests and toolsets (specifically those that implement their own tool listing and calling, i.e. [`FunctionToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset) and [`MCPServer`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer)), their names are derived from the agent's [`name`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.name) and the toolsets' [`id`s](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.id). These fields are normally optional, but are required to be set when using Temporal. They should not be changed once the durable agent has been deployed to production as this would break active workflows.

For dynamic toolsets created with the [`@agent.toolset`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.toolset) decorator, the `id` parameter must be set explicitly. Note that with Temporal, `per_run_step=False` is not respected, as the toolset always needs to be created on-the-fly in the activity.

Other than that, any agent and toolset will just work!

### Agent Run Context and Dependencies

As workflows and activities run in separate processes, any values passed between them need to be serializable. As these payloads are stored in the workflow execution event history, Temporal limits their size to 2MB.

To account for these limitations, tool functions and the [event stream handler](#streaming) running inside activities receive a limited version of the agent's [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext), and it's your responsibility to make sure that the [dependencies](/docs/ai/core-concepts/dependencies) object provided to [`TemporalAgent.run()`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.TemporalAgent.run) can be serialized using Pydantic.

Specifically, only the `deps`, `run_id`, `metadata`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `tool_call_metadata`, `retry`, `max_retries`, `run_step`, `usage`, and `partial_output` fields are available by default, and trying to access `model`, `prompt`, `messages`, or `tracer` will raise an error. If you need one or more of these attributes to be available inside activities, you can create a [`TemporalRunContext`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.TemporalRunContext) subclass with custom `serialize_run_context` and `deserialize_run_context` class methods and pass it to [`TemporalAgent`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.TemporalAgent) as `run_context_type`.

### Streaming

Because Temporal activities cannot stream output directly to the activity call site, `Agent.run_stream()`, `Agent.run_stream_events()`, and [`Agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter) are not supported.

Instead, you can implement streaming by setting an [`event_stream_handler`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.EventStreamHandler) on the `Agent` or `TemporalAgent` instance and using [`TemporalAgent.run()`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.TemporalAgent.run) inside the workflow. The event stream handler function will receive the agent [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and an async iterable of events from the model's streaming response and the agent's execution of tools. For examples, see the [streaming docs](/docs/ai/core-concepts/agent#streaming-all-events).

As the streaming model request activity, workflow, and workflow execution call all take place in separate processes, passing data between them requires some care:

-   To get data from the workflow call site or workflow to the event stream handler, you can use a [dependencies object](#agent-run-context-and-dependencies).
-   To get data from the event stream handler to the workflow, workflow call site, or a frontend, you need to use an external system that the event stream handler can write to and the event consumer can read from, like a message queue. You can use the dependency object to make sure the same connection string or other unique ID is available in all the places that need it.

### Model Selection at Runtime

`Agent.run(model=...)` normally supports both model strings (like `'openai:gpt-5.2'`) and model instances. However, `TemporalAgent` does not support arbitrary model instances because they cannot be serialized for Temporal's replay mechanism.

To use model instances with `TemporalAgent`, you need to pre-register them by passing a dict of model instances to `TemporalAgent(models={...})`. You can then reference them by name or by passing the registered instance directly. If the wrapped agent doesn't have a model set, the first registered model will be used as the default.

Model strings work as expected. For scenarios where you need to customize the provider used by the model string (e.g., inject API keys from deps), you can pass a `provider_factory` to `TemporalAgent`, which is passed the [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and provider name.

Here's an example showing how to pre-register and use multiple models:

multi\_model\_temporal.py

```python
from dataclasses import dataclass
from typing import Any

from temporalio import workflow

from pydantic_ai import Agent, RunContext
from pydantic_ai.durable_exec.temporal import TemporalAgent
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.google import GoogleModel
from pydantic_ai.models.openai import OpenAIResponsesModel
from pydantic_ai.providers import Provider


@dataclass
class Deps:
    openai_api_key: str | None = None
    anthropic_api_key: str | None = None


# Create models from different providers
default_model = OpenAIResponsesModel('gpt-5.2')
fast_model = AnthropicModel('claude-sonnet-4-5')
reasoning_model = GoogleModel('gemini-3-pro-preview')


# Optional: provider factory for dynamic model configuration
def my_provider_factory(run_context: RunContext[Deps], provider_name: str) -> Provider[Any]:
    """Create providers with custom configuration based on run context."""
    if provider_name == 'openai':
        from pydantic_ai.providers.openai import OpenAIProvider

        return OpenAIProvider(api_key=run_context.deps.openai_api_key)
    elif provider_name == 'anthropic':
        from pydantic_ai.providers.anthropic import AnthropicProvider

        return AnthropicProvider(api_key=run_context.deps.anthropic_api_key)
    else:
        raise ValueError(f'Unknown provider: {provider_name}')


agent = Agent(default_model, name='multi_model_agent', deps_type=Deps)

temporal_agent = TemporalAgent(
    agent,
    models={
        'fast': fast_model,
        'reasoning': reasoning_model,
    },
    provider_factory=my_provider_factory,  # Optional
)


@workflow.defn
class MultiModelWorkflow:
    @workflow.run
    async def run(self, prompt: str, use_reasoning: bool, use_fast: bool) -> str:
        if use_reasoning:
            # Select by registered name
            result = await temporal_agent.run(prompt, model='reasoning')
        elif use_fast:
            # Or pass the registered instance directly
            result = await temporal_agent.run(prompt, model=fast_model)
        else:
            # Or pass a model string (uses provider_factory if set)
            result = await temporal_agent.run(prompt, model='openai:gpt-5-mini')
        return result.output
```

## Activity Configuration

Temporal activity configuration, like timeouts and retry policies, can be customized by passing [`temporalio.workflow.ActivityConfig`](https://python.temporal.io/temporalio.workflow.ActivityConfig.html) objects to the `TemporalAgent` constructor:

-   `activity_config`: The base Temporal activity config to use for all activities. If no config is provided, a `start_to_close_timeout` of 60 seconds is used.
    
-   `model_activity_config`: The Temporal activity config to use for model request activities. This is merged with the base activity config.
    
-   `toolset_activity_config`: The Temporal activity config to use for get-tools and call-tool activities for specific toolsets identified by ID. This is merged with the base activity config.
    
-   `tool_activity_config`: The Temporal activity config to use for specific tool call activities identified by toolset ID and tool name. This is merged with the base and toolset-specific activity configs.
    
    If a tool does not use I/O, you can specify `False` to disable using an activity. Note that the tool is required to be defined as an `async` function as non-async tools are run in threads which are non-deterministic and thus not supported outside of activities.
    

## Activity Retries

On top of the automatic retries for request failures that Temporal will perform, Pydantic AI and various provider API clients also have their own request retry logic. Enabling these at the same time may cause the request to be retried more often than expected, with improper `Retry-After` handling.

When using Temporal, it's recommended to not use [HTTP Request Retries](/docs/ai/advanced-features/retries) and to turn off your provider API client's own retry logic, for example by setting `max_retries=0` on a [custom `OpenAIProvider` API client](/docs/ai/models/openai#custom-openai-client).

You can customize Temporal's retry policy using [activity configuration](#activity-configuration).

## Observability with Logfire

Temporal generates telemetry events and metrics for each workflow and activity execution, and Pydantic AI generates events for each agent run, model request and tool call. These can be sent to [Pydantic Logfire](/docs/ai/integrations/logfire) to get a complete picture of what's happening in your application.

To use Logfire with Temporal, you need to pass a [`LogfirePlugin`](/docs/ai/api/pydantic-ai/durable_exec/#pydantic_ai.durable_exec.temporal.LogfirePlugin) object to Temporal's `Client.connect()`:

logfire\_plugin.py

```py
from temporalio.client import Client

from pydantic_ai.durable_exec.temporal import LogfirePlugin, PydanticAIPlugin


async def main():
    client = await Client.connect(
        'localhost:7233',
        plugins=[PydanticAIPlugin(), LogfirePlugin()],
    )
```

By default, the `LogfirePlugin` will instrument Temporal (including metrics) and Pydantic AI and send all data to Logfire. To customize Logfire configuration and instrumentation, you can pass a `logfire_setup` function to the `LogfirePlugin` constructor and return a custom `Logfire` instance (i.e. the result of `logfire.configure()`). To disable sending Temporal metrics to Logfire, you can pass `metrics=False` to the `LogfirePlugin` constructor.

## Known Issues

### Pandas

When `logfire.info` is used inside an activity and the `pandas` package is among your project's dependencies, you may encounter the following error which seems to be the result of an import race condition:

```
AttributeError: partially initialized module 'pandas' has no attribute '_pandas_parser_CAPI' (most likely due to a circular import)
```

To fix this, you can use the [`temporalio.workflow.unsafe.imports_passed_through()`](https://python.temporal.io/temporalio.workflow.unsafe.html#imports_passed_through) context manager to proactively import the package and not have it be reloaded in the workflow sandbox:

temporal\_activity.py

```python
from temporalio import workflow

with workflow.unsafe.imports_passed_through():
    import pandas
```

---

# [Debugging & Monitoring with Pydantic Logfire](https://pydantic.dev/docs/ai/integrations/logfire/)

# Debugging & Monitoring with Pydantic Logfire

Applications that use LLMs have some challenges that are well known and understood: LLMs are **slow**, **unreliable** and **expensive**.

These applications also have some challenges that most developers have encountered much less often: LLMs are **fickle** and **non-deterministic**. Subtle changes in a prompt can completely change a model's performance, and there's no `EXPLAIN` query you can run to understand why.

Warning

From a software engineers point of view, you can think of LLMs as the worst database you've ever heard of, but worse.

If LLMs weren't so bloody useful, we'd never touch them.

To build successful applications with LLMs, we need new tools to understand both model performance, and the behavior of applications that rely on them.

LLM Observability tools that just let you understand how your model is performing are useless: making API calls to an LLM is easy, it's building that into an application that's hard.

## Pydantic Logfire

[Pydantic Logfire](https://pydantic.dev/logfire) is an observability platform developed by the team who created and maintain Pydantic Validation and Pydantic AI. Logfire aims to let you understand your entire application: Gen AI, classic predictive AI, HTTP traffic, database queries and everything else a modern application needs, all using OpenTelemetry.

Pydantic Logfire is a commercial product

Logfire is a commercially supported, hosted platform with an extremely generous and perpetual [free tier](https://pydantic.dev/pricing/). You can sign up and start using Logfire in a couple of minutes. Logfire can also be self-hosted on the enterprise tier.

Pydantic AI has built-in (but optional) support for Logfire. That means if the `logfire` package is installed and configured and agent instrumentation is enabled then detailed information about agent runs is sent to Logfire. Otherwise there's virtually no overhead and nothing is sent.

Here's an example showing details of running the [Weather Agent](/docs/ai/examples/weather-agent) in Logfire:

![Weather Agent Logfire](/docs/ai/img/logfire-weather-agent.png)

A trace is generated for the agent run, and spans are emitted for each model request and tool call.

## Using Logfire

To use Logfire, you'll need a Logfire [account](https://logfire.pydantic.dev). The Logfire Python SDK is included with `pydantic-ai`:

-   [pip](#tab-panel-38)
-   [uv](#tab-panel-39)

Terminal

```bash
pip install pydantic-ai
```

Terminal

```bash
uv add pydantic-ai
```

Or if you're using the slim package, you can install it with the `logfire` optional group:

-   [pip](#tab-panel-40)
-   [uv](#tab-panel-41)

Terminal

```bash
pip install "pydantic-ai-slim[logfire]"
```

Terminal

```bash
uv add "pydantic-ai-slim[logfire]"
```

Then authenticate your local environment with Logfire:

-   [pip](#tab-panel-42)
-   [uv](#tab-panel-43)

Terminal

```bash
 logfire auth
```

Terminal

```bash
uv run logfire auth
```

And configure a project to send data to:

-   [pip](#tab-panel-44)
-   [uv](#tab-panel-45)

Terminal

```bash
 logfire projects new
```

Terminal

```bash
uv run logfire projects new
```

(Or use an existing project with `logfire projects use`)

This will write to a `.logfire` directory in the current working directory, which the Logfire SDK will use for configuration at run time.

With that, you can start using Logfire to instrument Pydantic AI code:

instrument\_pydantic\_ai.py

```python
import logfire

from pydantic_ai import Agent

logfire.configure()  # (1)
logfire.instrument_pydantic_ai()  # (2)

agent = Agent('openai:gpt-5.2', instructions='Be concise, reply with one sentence.')
result = agent.run_sync('Where does "hello world" come from?')  # (3)
print(result.output)
"""
The first known use of "hello, world" was in a 1974 textbook about the C programming language.
"""
```

[`logfire.configure()`](https://logfire.pydantic.dev/docs/api/logfire/#logfire.configure) configures the SDK, by default it will find the write token from the `.logfire` directory, but you can also pass a token directly.

[`logfire.instrument_pydantic_ai()`](https://logfire.pydantic.dev/docs/api/logfire/#logfire.Logfire.instrument_pydantic_ai) enables instrumentation of Pydantic AI.

Since we've enabled instrumentation, a trace will be generated for each run, with spans emitted for models calls and tool function execution

_(This example is complete, it can be run "as is")_

Which will display in Logfire thus:

![Logfire Simple Agent Run](/docs/ai/img/logfire-simple-agent.png)

The [Logfire documentation](https://logfire.pydantic.dev/docs/) has more details on how to use Logfire, including how to instrument other libraries like [HTTPX](https://logfire.pydantic.dev/docs/integrations/http-clients/httpx/) and [FastAPI](https://logfire.pydantic.dev/docs/integrations/web-frameworks/fastapi/).

Since Logfire is built on [OpenTelemetry](https://opentelemetry.io/), you can use the Logfire Python SDK to send data to any OpenTelemetry collector, see [below](#using-opentelemetry).

### Debugging

To demonstrate how Logfire can let you visualise the flow of a Pydantic AI run, here's the view you get from Logfire while running the [chat app examples](/docs/ai/examples/chat-app):

### Monitoring Performance

We can also query data with SQL in Logfire to monitor the performance of an application. Here's a real world example of using Logfire to monitor Pydantic AI runs inside Logfire itself:

![Logfire monitoring Pydantic AI](/docs/ai/img/logfire-monitoring-pydanticai.png)

### Monitoring HTTP Requests

As per Hamel Husain's influential 2024 blog post ["Fuck You, Show Me The Prompt."](https://hamel.dev/blog/posts/prompt/) (bear with the capitalization, the point is valid), it's often useful to be able to view the raw HTTP requests and responses made to model providers.

To observe raw HTTP requests made to model providers, you can use Logfire's [HTTPX instrumentation](https://logfire.pydantic.dev/docs/integrations/http-clients/httpx/) since all provider SDKs (except for [Bedrock](/docs/ai/models/bedrock)) use the [HTTPX](https://www.python-httpx.org/) library internally:

with\_logfire\_instrument\_httpx.py

```python
import logfire

from pydantic_ai import Agent

logfire.configure()
logfire.instrument_pydantic_ai()
logfire.instrument_httpx(capture_all=True)  # (1)

agent = Agent('openai:gpt-5.2')
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

See the [`logfire.instrument_httpx` docs](https://logfire.pydantic.dev/docs/api/logfire/#logfire.Logfire.instrument_httpx) more details, `capture_all=True` means both headers and body are captured for both the request and response.

![Logfire with HTTPX instrumentation](/docs/ai/img/logfire-with-httpx.png)

## Using OpenTelemetry

Pydantic AI's instrumentation uses [OpenTelemetry](https://opentelemetry.io/) (OTel), which Logfire is based on.

This means you can debug and monitor Pydantic AI with any OpenTelemetry backend.

Pydantic AI follows the [OpenTelemetry Semantic Conventions for Generative AI systems](https://opentelemetry.io/docs/specs/semconv/gen-ai/), so while we think you'll have the best experience using the Logfire platform 😉, you should be able to use any OTel service with GenAI support.

### Logfire with an alternative OTel backend

You can use the Logfire SDK completely freely and send the data to any OpenTelemetry backend.

Here's an example of configuring the Logfire library to send data to the excellent [otel-tui](https://github.com/ymtdzzz/otel-tui) -- an open source terminal based OTel backend and viewer (no association with Pydantic Validation).

Run `otel-tui` with docker (see [the otel-tui readme](https://github.com/ymtdzzz/otel-tui) for more instructions):

Terminal

```txt
docker run --rm -it -p 4318:4318 --name otel-tui ymtdzzz/otel-tui:latest
```

then run,

otel\_tui.py

```python
import os

import logfire

from pydantic_ai import Agent

os.environ['OTEL_EXPORTER_OTLP_ENDPOINT'] = 'http://localhost:4318'  # (1)
logfire.configure(send_to_logfire=False)  # (2)
logfire.instrument_pydantic_ai()
logfire.instrument_httpx(capture_all=True)

agent = Agent('openai:gpt-5.2')
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> Paris
```

Set the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable to the URL of your OpenTelemetry backend. If you're using a backend that requires authentication, you may need to set [other environment variables](https://opentelemetry.io/docs/languages/sdk-configuration/otlp-exporter/). Of course, these can also be set outside the process, e.g. with `export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318`.

We [configure](https://logfire.pydantic.dev/docs/api/logfire/#logfire.configure) Logfire to disable sending data to the Logfire OTel backend itself. If you removed `send_to_logfire=False`, data would be sent to both Logfire and your OpenTelemetry backend.

Running the above code will send tracing data to `otel-tui`, which will display like this:

![otel tui simple](/docs/ai/img/otel-tui-simple.png)

Running the [weather agent](/docs/ai/examples/weather-agent) example connected to `otel-tui` shows how it can be used to visualise a more complex trace:

![otel tui weather agent](/docs/ai/img/otel-tui-weather.png)

For more information on using the Logfire SDK to send data to alternative backends, see [the Logfire documentation](https://logfire.pydantic.dev/docs/how-to-guides/alternative-backends/).

### OTel without Logfire

You can also emit OpenTelemetry data from Pydantic AI without using Logfire at all.

To do this, you'll need to install and configure the OpenTelemetry packages you need. To run the following examples, use

Terminal

```txt
uv run \
  --with 'pydantic-ai-slim[openai]' \
  --with opentelemetry-sdk \
  --with opentelemetry-exporter-otlp \
  raw_otel.py
```

raw\_otel.py

```python
import os

from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.trace import set_tracer_provider

from pydantic_ai import Agent

os.environ['OTEL_EXPORTER_OTLP_ENDPOINT'] = 'http://localhost:4318'
exporter = OTLPSpanExporter()
span_processor = BatchSpanProcessor(exporter)
tracer_provider = TracerProvider()
tracer_provider.add_span_processor(span_processor)

set_tracer_provider(tracer_provider)

Agent.instrument_all()
agent = Agent('openai:gpt-5.2')
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> Paris
```

### Alternative Observability backends

Because Pydantic AI uses OpenTelemetry for observability, you can easily configure it to send data to any OpenTelemetry-compatible backend, not just our observability platform [Pydantic Logfire](#pydantic-logfire).

The following providers have dedicated documentation on Pydantic AI:

-   [Langfuse](https://langfuse.com/docs/integrations/pydantic-ai)
-   [W&B Weave](https://weave-docs.wandb.ai/guides/integrations/pydantic_ai/)
-   [Arize](https://arize.com/docs/ax/observe/tracing-integrations-auto/pydantic-ai)
-   [Openlayer](https://www.openlayer.com/docs/integrations/pydantic-ai)
-   [OpenLIT](https://docs.openlit.io/latest/integrations/pydantic)
-   [LangWatch](https://docs.langwatch.ai/integration/python/integrations/pydantic-ai)
-   [Patronus AI](https://docs.patronus.ai/docs/percival/pydantic)
-   [Opik](https://www.comet.com/docs/opik/tracing/integrations/pydantic-ai)
-   [mlflow](https://mlflow.org/docs/latest/genai/tracing/integrations/listing/pydantic_ai)
-   [Agenta](https://docs.agenta.ai/observability/integrations/pydanticai)
-   [Confident AI](https://documentation.confident-ai.com/docs/llm-tracing/integrations/pydanticai)
-   [Braintrust](https://www.braintrust.dev/docs/integrations/sdk-integrations/pydantic-ai)
-   [SigNoz](https://signoz.io/docs/pydantic-ai-observability/)
-   [Laminar](https://docs.laminar.sh/tracing/integrations/pydantic-ai)
-   [Respan](https://respan.ai/docs/integrations/pydantic-ai)

## Advanced usage

### Aggregated usage attribute names

By default, both model/request spans and agent run spans use the standard `gen_ai.usage.input_tokens` and `gen_ai.usage.output_tokens` attributes. Some observability backends (e.g., Datadog, New Relic, LangSmith, Opik) aggregate these attributes across all spans, which can cause double-counting since agent run spans report the sum of their child spans' usage.

To avoid this, you can enable `use_aggregated_usage_attribute_names` so that agent run spans use distinct attribute names (e.g., `gen_ai.aggregated_usage.input_tokens`, `gen_ai.aggregated_usage.output_tokens`, and `gen_ai.aggregated_usage.details.*`):

Custom namespace

The `gen_ai.aggregated_usage.*` namespace is a custom extension not part of the [OpenTelemetry Semantic Conventions for GenAI](https://opentelemetry.io/docs/specs/semconv/gen-ai/). It was introduced to work around double-counting in observability backends. If OpenTelemetry introduces an official convention for aggregated usage in the future, this namespace may be updated or deprecated.

```python
from pydantic_ai import Agent
from pydantic_ai.models.instrumented import InstrumentationSettings

Agent.instrument_all(InstrumentationSettings(use_aggregated_usage_attribute_names=True))
```

### Configuring data format

Pydantic AI follows the [OpenTelemetry Semantic Conventions for Generative AI systems](https://opentelemetry.io/docs/specs/semconv/gen-ai/), specifically version 1.37.0 of the conventions. The instrumentation format can be configured using the `version` parameter of [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings).

**The default is `version=2`**, which provides a good balance between spec compliance and compatibility.

#### Version 1 (Legacy, deprecated)

Based on [OpenTelemetry semantic conventions version 1.36.0](https://github.com/open-telemetry/semantic-conventions/blob/v1.36.0/docs/gen-ai/README.md) or older. Messages are captured as individual events (logs) that are children of the request span. Use `event_mode='logs'` to emit events as OpenTelemetry log-based events:

instrumentation\_settings\_event\_mode.py

```python
import logfire

from pydantic_ai import Agent

logfire.configure()
logfire.instrument_pydantic_ai(version=1, event_mode='logs')
agent = Agent('openai:gpt-5.2')
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

This version won't look as good in the Logfire UI and will be removed from Pydantic AI in a future release, but may be useful for backwards compatibility.

#### Version 2 (Default)

Uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes:

-   `gen_ai.system_instructions` for instructions passed to the agent
-   `gen_ai.input.messages` and `gen_ai.output.messages` on model request spans
-   `pydantic_ai.all_messages` on agent run spans

Some span and attribute names are not fully spec-compliant for compatibility reasons. Use version 3 or 4 for better compliance.

#### Version 3

Builds on version 2 with the following improvements:

-   **Spec-compliant span names:**
    -   `agent run` becomes `invoke_agent {gen_ai.agent.name}` (with the agent name filled in)
    -   `running tool` becomes `execute_tool {gen_ai.tool.name}` (with the tool name filled in)
-   **Spec-compliant attribute names:**
    -   `tool_arguments` becomes `gen_ai.tool.call.arguments`
    -   `tool_response` becomes `gen_ai.tool.call.result`
-   **Thinking tokens support:** Captures thinking/reasoning tokens when available

#### Version 4

Builds on version 3 with improved multimodal content handling to better align with the [GenAI semantic conventions for multimodal inputs](https://opentelemetry.io/docs/specs/semconv/gen-ai/non-normative/examples-llm-calls/#multimodal-inputs-example):

**URL-based media (ImageUrl, AudioUrl, VideoUrl):**

-   Old (v1-3): `{"type": "image-url", "url": "..."}`
-   New (v4): `{"type": "uri", "modality": "image", "uri": "...", "mime_type": "..."}`

**Inline binary content (BinaryContent, FilePart):**

-   Old (v1-3): `{"type": "binary", "media_type": "...", "content": "..."}`
-   New (v4): `{"type": "blob", "modality": "image", "mime_type": "...", "content": "..."}`

Note: The `modality` field is only included for image, audio, and video content types as specified in the OTel spec. DocumentUrl and unsupported media types omit the `modality` field.

#### Version 5

Builds on version 4 with improved handling of deferred tool calls:

-   [`CallDeferred`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.CallDeferred) and [`ApprovalRequired`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ApprovalRequired) exceptions no longer record an exception event or set the span status to ERROR -- the span is left as UNSET, since deferrals are control flow, not errors.

* * *

Note that the OpenTelemetry Semantic Conventions are still experimental and are likely to change.

### Setting OpenTelemetry SDK providers

By default, the global `TracerProvider` and `LoggerProvider` are used. These are set automatically by `logfire.configure()`. They can also be set by the `set_tracer_provider` and `set_logger_provider` functions in the OpenTelemetry Python SDK. You can set custom providers with [`InstrumentationSettings`](/docs/ai/api/models/instrumented/#pydantic_ai.models.instrumented.InstrumentationSettings).

instrumentation\_settings\_providers.py

```python
from opentelemetry.sdk._logs import LoggerProvider
from opentelemetry.sdk.trace import TracerProvider

from pydantic_ai import Agent, InstrumentationSettings

instrumentation_settings = InstrumentationSettings(
    tracer_provider=TracerProvider(),
    logger_provider=LoggerProvider(),
)

agent = Agent('openai:gpt-5.2', instrument=instrumentation_settings)
# or to instrument all agents:
Agent.instrument_all(instrumentation_settings)
```

### Instrumenting a specific `Model`

instrumented\_model\_example.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.instrumented import InstrumentationSettings, InstrumentedModel

settings = InstrumentationSettings()
model = InstrumentedModel('openai:gpt-5.2', settings)
agent = Agent(model)
```

### Excluding binary content

excluding\_binary\_content.py

```python
from pydantic_ai import Agent, InstrumentationSettings

instrumentation_settings = InstrumentationSettings(include_binary_content=False)

agent = Agent('openai:gpt-5.2', instrument=instrumentation_settings)
# or to instrument all agents:
Agent.instrument_all(instrumentation_settings)
```

### Excluding prompts and completions

For privacy and security reasons, you may want to monitor your agent's behavior and performance without exposing sensitive user data or proprietary prompts in your observability platform. Pydantic AI allows you to exclude the actual content from telemetry while preserving the structural information needed for debugging and monitoring.

When `include_content=False` is set, Pydantic AI will exclude sensitive content from telemetry, including user prompts and model completions, tool call arguments and responses, and any other message content.

excluding\_sensitive\_content.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.instrumented import InstrumentationSettings

instrumentation_settings = InstrumentationSettings(include_content=False)

agent = Agent('openai:gpt-5.2', instrument=instrumentation_settings)
# or to instrument all agents:
Agent.instrument_all(instrumentation_settings)
```

This setting is particularly useful in production environments where compliance requirements or data sensitivity concerns make it necessary to limit what content is sent to your observability platform.

### Adding Custom Metadata

Use the agent's `metadata` parameter to attach additional data to the agent's span. When instrumentation is enabled, the computed metadata is recorded on the agent span under the `metadata` attribute. See the [usage and metadata example in the agents guide](/docs/ai/core-concepts/agent#run-metadata) for details and usage.

---

# [AG-UI](https://pydantic.dev/docs/ai/integrations/ui/ag-ui/)

# AG-UI

The [Agent-User Interaction (AG-UI) Protocol](https://docs.ag-ui.com/introduction) is an open standard introduced by the [CopilotKit](https://webflow.copilotkit.ai/blog/introducing-ag-ui-the-protocol-where-agents-meet-users) team that standardises how frontend applications communicate with AI agents, with support for streaming, frontend tools, shared state, and custom events.

Note

The AG-UI integration was originally built by the team at [Rocket Science](https://www.rocketscience.gg/) and contributed in collaboration with the Pydantic AI and CopilotKit teams. Thanks Rocket Science!

## Installation

The only dependencies are:

-   [ag-ui-protocol](https://docs.ag-ui.com/introduction): to provide the AG-UI types and encoder.
-   [starlette](https://www.starlette.io): to handle [ASGI](https://asgi.readthedocs.io/en/latest/) requests from a framework like FastAPI.

You can install Pydantic AI with the `ag-ui` extra to ensure you have all the required AG-UI dependencies:

-   [pip](#tab-panel-180)
-   [uv](#tab-panel-181)

Terminal

```bash
pip install 'pydantic-ai-slim[ag-ui]'
```

Terminal

```bash
uv add 'pydantic-ai-slim[ag-ui]'
```

To run the examples you'll also need:

-   [uvicorn](https://www.uvicorn.org/) or another ASGI compatible server

-   [pip](#tab-panel-182)
-   [uv](#tab-panel-183)

Terminal

```bash
pip install uvicorn
```

Terminal

```bash
uv add uvicorn
```

## Usage

There are three ways to run a Pydantic AI agent based on AG-UI run input with streamed AG-UI events as output, from most to least flexible. If you're using a Starlette-based web framework like FastAPI, you'll typically want to use the second method.

1.  The `AGUIAdapter.run_stream()` method, when called on an [`AGUIAdapter`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.AGUIAdapter) instantiated with an agent and an AG-UI [`RunAgentInput`](https://docs.ag-ui.com/sdk/python/core/types#runagentinput) object, will run the agent and return a stream of AG-UI events. It also takes optional [`Agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter) arguments including `deps`. Use this if you're using a web framework not based on Starlette (e.g. Django or Flask) or want to modify the input or output some way.
2.  The `AGUIAdapter.dispatch_request()` class method takes an agent and a Starlette request (e.g. from FastAPI) coming from an AG-UI frontend, and returns a streaming Starlette response of AG-UI events that you can return directly from your endpoint. It also takes optional [`Agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter) arguments including `deps`, that you can vary for each request (e.g. based on the authenticated user). This is a convenience method that combines [`AGUIAdapter.from_request()`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.AGUIAdapter.from_request), `AGUIAdapter.run_stream()`, and `AGUIAdapter.streaming_response()`.
3.  [`AGUIApp`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.app.AGUIApp) represents an ASGI application that handles every AG-UI request by running the agent. It also takes optional [`Agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter) arguments including `deps`, but these will be the same for each request, with the exception of the AG-UI state that's injected as described under [state management](#state-management). This ASGI app can be [mounted](https://fastapi.tiangolo.com/advanced/sub-applications/) at a given path in an existing FastAPI app.

### Handle run input and output directly

This example uses `AGUIAdapter.run_stream()` and performs its own request parsing and response generation. This can be modified to work with any web framework.

run\_ag\_ui.py

```python
import json
from http import HTTPStatus

from fastapi import FastAPI
from fastapi.requests import Request
from fastapi.responses import Response, StreamingResponse
from pydantic import ValidationError

from pydantic_ai import Agent
from pydantic_ai.ui import SSE_CONTENT_TYPE
from pydantic_ai.ui.ag_ui import AGUIAdapter

agent = Agent('openai:gpt-5.2', instructions='Be fun!')

app = FastAPI()


@app.post('/')
async def run_agent(request: Request) -> Response:
    accept = request.headers.get('accept', SSE_CONTENT_TYPE)
    try:
        run_input = AGUIAdapter.build_run_input(await request.body())  # (1)
    except ValidationError as e:
        return Response(
            content=json.dumps(e.json()),
            media_type='application/json',
            status_code=HTTPStatus.UNPROCESSABLE_ENTITY,
        )

    adapter = AGUIAdapter(agent=agent, run_input=run_input, accept=accept)
    event_stream = adapter.run_stream() # (2)

    sse_event_stream = adapter.encode_stream(event_stream)
    return StreamingResponse(sse_event_stream, media_type=accept) # (3)
```

1.  [`AGUIAdapter.build_run_input()`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.AGUIAdapter.build_run_input) takes the request body as bytes and returns an AG-UI [`RunAgentInput`](https://docs.ag-ui.com/sdk/python/core/types#runagentinput) object. You can also use the [`AGUIAdapter.from_request()`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.AGUIAdapter.from_request) class method to build an adapter directly from a request.
2.  `AGUIAdapter.run_stream()` runs the agent and returns a stream of AG-UI events. It supports the same optional arguments as [`Agent.run_stream_events()`](/docs/ai/core-concepts/agent#running-agents), including `deps`. You can also use `AGUIAdapter.run_stream_native()` to run the agent and return a stream of Pydantic AI events instead, which can then be transformed into AG-UI events using `AGUIAdapter.transform_stream()`.
3.  `AGUIAdapter.encode_stream()` encodes the stream of AG-UI events as strings according to the accept header value. You can also use `AGUIAdapter.streaming_response()` to generate a streaming response directly from the AG-UI event stream returned by `run_stream()`.

Since `app` is an ASGI application, it can be used with any ASGI server:

Terminal

```shell
uvicorn run_ag_ui:app
```

This will expose the agent as an AG-UI server, and your frontend can start sending requests to it.

### Handle a Starlette request

This example uses `AGUIAdapter.dispatch_request()` to directly handle a FastAPI request and return a response. Something analogous to this will work with any Starlette-based web framework.

handle\_ag\_ui\_request.py

```python
from fastapi import FastAPI
from starlette.requests import Request
from starlette.responses import Response

from pydantic_ai import Agent
from pydantic_ai.ui.ag_ui import AGUIAdapter

agent = Agent('openai:gpt-5.2', instructions='Be fun!')

app = FastAPI()

@app.post('/')
async def run_agent(request: Request) -> Response:
    return await AGUIAdapter.dispatch_request(request, agent=agent) # (1)
```

1.  This method essentially does the same as the previous example, but it's more convenient to use when you're already using a Starlette/FastAPI app.

Since `app` is an ASGI application, it can be used with any ASGI server:

Terminal

```shell
uvicorn handle_ag_ui_request:app
```

This will expose the agent as an AG-UI server, and your frontend can start sending requests to it.

### Stand-alone ASGI app

This example uses [`AGUIApp`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.app.AGUIApp) to turn the agent into a stand-alone ASGI application:

ag\_ui\_app.py

```python
from pydantic_ai import Agent
from pydantic_ai.ui.ag_ui.app import AGUIApp

agent = Agent('openai:gpt-5.2', instructions='Be fun!')
app = AGUIApp(agent)
```

Since `app` is an ASGI application, it can be used with any ASGI server:

Terminal

```shell
uvicorn ag_ui_app:app
```

This will expose the agent as an AG-UI server, and your frontend can start sending requests to it.

## Design

The Pydantic AI AG-UI integration supports all features of the spec:

-   [Events](https://docs.ag-ui.com/concepts/events)
-   [Messages](https://docs.ag-ui.com/concepts/messages)
-   [State Management](https://docs.ag-ui.com/concepts/state)
-   [Tools](https://docs.ag-ui.com/concepts/tools)

The integration receives messages in the form of a [`RunAgentInput`](https://docs.ag-ui.com/sdk/python/core/types#runagentinput) object that describes the details of the requested agent run including message history, state, and available tools.

These are converted to Pydantic AI types and passed to the agent's run method. Events from the agent, including tool calls, are converted to AG-UI events and streamed back to the caller as Server-Sent Events (SSE).

A user request may require multiple round trips between client UI and Pydantic AI server, depending on the tools and events needed.

## Features

### State management

The integration provides full support for [AG-UI state management](https://docs.ag-ui.com/concepts/state), which enables real-time synchronization between agents and frontend applications.

In the example below we have document state which is shared between the UI and server using the [`StateDeps`](/docs/ai/api/ui/base/#pydantic_ai.ui.StateDeps) [dependencies type](/docs/ai/core-concepts/dependencies) that can be used to automatically validate state contained in [`RunAgentInput.state`](https://docs.ag-ui.com/sdk/js/core/types#runagentinput) using a Pydantic `BaseModel` specified as a generic parameter.

Custom dependencies type with AG-UI state

If you want to use your own dependencies type to hold AG-UI state as well as other things, it needs to implements the [`StateHandler`](/docs/ai/api/ui/base/#pydantic_ai.ui.StateHandler) protocol, meaning it needs to be a [dataclass](https://docs.python.org/3/library/dataclasses.html) with a non-optional `state` field. This lets Pydantic AI ensure that state is properly isolated between requests by building a new dependencies object each time.

If the `state` field's type is a Pydantic `BaseModel` subclass, the raw state dictionary on the request is automatically validated. If not, you can validate the raw value yourself in your dependencies dataclass's `__post_init__` method.

If AG-UI state is provided but your dependencies do not implement [`StateHandler`](/docs/ai/api/ui/base/#pydantic_ai.ui.StateHandler), Pydantic AI will emit a warning and ignore the state. Use [`StateDeps`](/docs/ai/api/ui/base/#pydantic_ai.ui.StateDeps) or a custom [`StateHandler`](/docs/ai/api/ui/base/#pydantic_ai.ui.StateHandler) implementation to receive and validate the incoming state.

ag\_ui\_state.py

```python
from pydantic import BaseModel

from pydantic_ai import Agent
from pydantic_ai.ui import StateDeps
from pydantic_ai.ui.ag_ui.app import AGUIApp


class DocumentState(BaseModel):
    """State for the document being written."""

    document: str = ''


agent = Agent(
    'openai:gpt-5.2',
    instructions='Be fun!',
    deps_type=StateDeps[DocumentState],
)
app = AGUIApp(agent, deps=StateDeps(DocumentState()))
```

Since `app` is an ASGI application, it can be used with any ASGI server:

Terminal

```bash
uvicorn ag_ui_state:app --host 0.0.0.0 --port 9000
```

### Tools

AG-UI frontend tools are seamlessly provided to the Pydantic AI agent, enabling rich user experiences with frontend user interfaces.

### Events

Pydantic AI tools can send [AG-UI events](https://docs.ag-ui.com/concepts/events) simply by returning a [`ToolReturn`](/docs/ai/tools-toolsets/tools-advanced#advanced-tool-returns) object with a [`BaseEvent`](https://docs.ag-ui.com/sdk/python/core/events#baseevent) (or a list of events) as `metadata`, which allows for custom events and state updates.

ag\_ui\_tool\_events.py

```python
from ag_ui.core import CustomEvent, EventType, StateSnapshotEvent
from pydantic import BaseModel

from pydantic_ai import Agent, RunContext, ToolReturn
from pydantic_ai.ui import StateDeps
from pydantic_ai.ui.ag_ui.app import AGUIApp


class DocumentState(BaseModel):
    """State for the document being written."""

    document: str = ''


agent = Agent(
    'openai:gpt-5.2',
    instructions='Be fun!',
    deps_type=StateDeps[DocumentState],
)
app = AGUIApp(agent, deps=StateDeps(DocumentState()))


@agent.tool
async def update_state(ctx: RunContext[StateDeps[DocumentState]]) -> ToolReturn:
    return ToolReturn(
        return_value='State updated',
        metadata=[
            StateSnapshotEvent(
                type=EventType.STATE_SNAPSHOT,
                snapshot=ctx.deps.state,
            ),
        ],
    )


@agent.tool_plain
async def custom_events() -> ToolReturn:
    return ToolReturn(
        return_value='Count events sent',
        metadata=[
            CustomEvent(
                type=EventType.CUSTOM,
                name='count',
                value=1,
            ),
            CustomEvent(
                type=EventType.CUSTOM,
                name='count',
                value=2,
            ),
        ]
    )
```

Since `app` is an ASGI application, it can be used with any ASGI server:

Terminal

```bash
uvicorn ag_ui_tool_events:app --host 0.0.0.0 --port 9000
```

### Trust model

AG-UI's `RunAgentInput.messages` is fully client-controlled. The [`AGUIAdapter`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.AGUIAdapter) applies defaults to strip untrusted parts before the agent runs -- see [Trust model for client-submitted messages](/docs/ai/integrations/ui/overview#trust-model-for-client-submitted-messages) in the UI adapter overview.

### System prompts and instructions

Pydantic AI supports two ways to provide guidance to the model: [`system_prompt`](/docs/ai/core-concepts/agent#system-prompts) (stored in the message history as [`SystemPromptPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.SystemPromptPart)s) and [`instructions`](/docs/ai/core-concepts/agent#instructions) (injected fresh on every request, never persisted). When you control the server side, `instructions` is the recommended default.

The rest of this section only matters if you use `system_prompt`. If you only use `instructions`, there's nothing to configure -- they're always applied regardless of the AG-UI message history.

For `system_prompt`, you choose who owns it with the `manage_system_prompt` parameter on [`AGUIAdapter`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.AGUIAdapter) (and [`handle_ag_ui_request`](/docs/ai/api/pydantic-ai/ag_ui/#pydantic_ai.ag_ui.handle_ag_ui_request) / [`run_ag_ui`](/docs/ai/api/pydantic-ai/ag_ui/#pydantic_ai.ag_ui.run_ag_ui)):

-   `'server'` (default): the agent's configured `system_prompt` is authoritative. Any `SystemMessage` sent by the frontend is stripped with a warning (a malicious client could otherwise inject arbitrary instructions via crafted API requests), and the agent's own system prompt is reinjected at the head of the first request via the [`ReinjectSystemPrompt`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ReinjectSystemPrompt) capability.
-   `'client'`: the frontend owns the system prompt. Frontend `SystemMessage`s are preserved as-is, and the agent's configured `system_prompt` is not injected -- the caller is fully responsible for sending it on every turn if desired. To opt into fallback-to-configured behavior, add the [`ReinjectSystemPrompt`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ReinjectSystemPrompt) capability to your agent.

ag\_ui\_client\_managed\_system\_prompt.py

```python
from fastapi import FastAPI
from starlette.requests import Request
from starlette.responses import Response

from pydantic_ai import Agent
from pydantic_ai.ui.ag_ui import AGUIAdapter

agent = Agent('openai:gpt-5.2')

app = FastAPI()


@app.post('/')
async def run_agent(request: Request) -> Response:
    return await AGUIAdapter.dispatch_request(
        request, agent=agent, manage_system_prompt='client'
    )
```

## Examples

For more examples of how to use [`AGUIApp`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.app.AGUIApp) see [`pydantic_ai_examples.ag_ui`](https://github.com/pydantic/pydantic-ai/tree/main/examples/pydantic_ai_examples/ag_ui), which includes a server for use with the [AG-UI Dojo](https://docs.ag-ui.com/tutorials/debugging#the-ag-ui-dojo).

---

# [Overview](https://pydantic.dev/docs/ai/integrations/ui/overview/)

# Overview

If you're building a chat app or other interactive frontend for an AI agent, your backend will need to receive agent run input (like a chat message or complete [message history](/docs/ai/core-concepts/message-history)) from the frontend, and will need to stream the [agent's events](/docs/ai/core-concepts/agent#streaming-all-events) (like text, thinking, and tool calls) to the frontend so that the user knows what's happening in real time.

While your frontend could use Pydantic AI's [`ModelRequest`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelRequest) and [`AgentStreamEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.AgentStreamEvent) directly, you'll typically want to use a UI event stream protocol that's natively supported by your frontend framework.

Pydantic AI natively supports two UI event stream protocols:

-   [Agent-User Interaction (AG-UI) Protocol](/docs/ai/integrations/ui/ag-ui)
-   [Vercel AI Data Stream Protocol](/docs/ai/integrations/ui/vercel-ai)

These integrations are implemented as subclasses of the abstract [`UIAdapter`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter) class, so they also serve as a reference for integrating with other UI event stream protocols.

## Usage

The protocol-specific [`UIAdapter`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter) subclass (i.e. [`AGUIAdapter`](/docs/ai/api/ui/ag_ui/#pydantic_ai.ui.ag_ui.AGUIAdapter) or [`VercelAIAdapter`](/docs/ai/api/ui/vercel_ai/#pydantic_ai.ui.vercel_ai.VercelAIAdapter)) is responsible for transforming agent run input received from the frontend into arguments for [`Agent.run_stream_events()`](/docs/ai/core-concepts/agent#running-agents), running the agent, and then transforming Pydantic AI events into protocol-specific events. The event stream transformation is handled by a protocol-specific [`UIEventStream`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIEventStream) subclass, but you typically won't use this directly.

If you're using a Starlette-based web framework like FastAPI, you can use the [`UIAdapter.dispatch_request()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.dispatch_request) class method from an endpoint function to directly handle a request and return a streaming response of protocol-specific events. This is demonstrated in the next section.

If you're using a web framework not based on Starlette (e.g. Django or Flask) or need fine-grained control over the input or output, you can create a `UIAdapter` instance and directly use its methods. This is demonstrated in "Advanced Usage" section below.

### Usage with Starlette/FastAPI

Besides the request, [`UIAdapter.dispatch_request()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.dispatch_request) takes the agent, the same optional arguments as [`Agent.run_stream_events()`](/docs/ai/core-concepts/agent#running-agents), and an optional `on_complete` callback function that receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can optionally yield additional protocol-specific events.

Note

These examples use the `VercelAIAdapter`, but the same patterns apply to all `UIAdapter` subclasses.

dispatch\_request.py

```python
from fastapi import FastAPI
from starlette.requests import Request
from starlette.responses import Response

from pydantic_ai import Agent
from pydantic_ai.ui.vercel_ai import VercelAIAdapter

agent = Agent('openai:gpt-5.2')

app = FastAPI()

@app.post('/chat')
async def chat(request: Request) -> Response:
    return await VercelAIAdapter.dispatch_request(request, agent=agent)
```

### Advanced Usage

If you're using a web framework not based on Starlette (e.g. Django or Flask) or need fine-grained control over the input or output, you can create a `UIAdapter` instance and directly use its methods, which can be chained to accomplish the same thing as the `UIAdapter.dispatch_request()` class method shown above:

1.  The [`UIAdapter.build_run_input()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.build_run_input) class method takes the request body as bytes and returns a protocol-specific run input object, which you can then pass to the [`UIAdapter()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter) constructor along with the agent.
    -   You can also use the [`UIAdapter.from_request()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.from_request) class method to build an adapter directly from a Starlette/FastAPI request.
2.  The [`UIAdapter.run_stream()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.run_stream) method runs the agent and returns a stream of protocol-specific events. It supports the same optional arguments as [`Agent.run_stream_events()`](/docs/ai/core-concepts/agent#running-agents) and an optional `on_complete` callback function that receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can optionally yield additional protocol-specific events.
    -   You can also use [`UIAdapter.run_stream_native()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.run_stream_native) to run the agent and return a stream of Pydantic AI events instead, which can then be transformed into protocol-specific events using [`UIAdapter.transform_stream()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.transform_stream).
3.  The [`UIAdapter.encode_stream()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.encode_stream) method encodes the stream of protocol-specific events as SSE (HTTP Server-Sent Events) strings, which you can then return as a streaming response.
    -   You can also use [`UIAdapter.streaming_response()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.streaming_response) to generate a Starlette/FastAPI streaming response directly from the protocol-specific event stream returned by `run_stream()`.

Note

This example uses FastAPI, but can be modified to work with any web framework.

run\_stream.py

```python
import json
from http import HTTPStatus

from fastapi import FastAPI
from fastapi.requests import Request
from fastapi.responses import Response, StreamingResponse
from pydantic import ValidationError

from pydantic_ai import Agent
from pydantic_ai.ui import SSE_CONTENT_TYPE
from pydantic_ai.ui.vercel_ai import VercelAIAdapter

agent = Agent('openai:gpt-5.2')

app = FastAPI()


@app.post('/chat')
async def chat(request: Request) -> Response:
    accept = request.headers.get('accept', SSE_CONTENT_TYPE)
    try:
        run_input = VercelAIAdapter.build_run_input(await request.body())
    except ValidationError as e:
        return Response(
            content=json.dumps(e.json()),
            media_type='application/json',
            status_code=HTTPStatus.UNPROCESSABLE_ENTITY,
        )

    adapter = VercelAIAdapter(agent=agent, run_input=run_input, accept=accept)
    event_stream = adapter.run_stream()

    sse_event_stream = adapter.encode_stream(event_stream)
    return StreamingResponse(sse_event_stream, media_type=accept)
```

## Trust model for client-submitted messages

UI adapter endpoints aren't authentication boundaries. Both the AG-UI and Vercel AI protocols are designed around the client transmitting the full conversation history on each request, so anything in `message_history` from the protocol -- assistant messages, tool calls, file URLs, tool results -- is under the caller's control. Treat the adapter endpoint as an internal backend service, running it inside your own authenticated route handler. See the [AG-UI security considerations](https://learn.microsoft.com/en-us/agent-framework/integrations/ag-ui/security-considerations) page for more on the deployment model both protocols assume.

The adapters apply a few defaults so that the authoritative state stays on your side:

-   **System prompts** -- client-submitted [`SystemPromptPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.SystemPromptPart)s are stripped by default and replaced with the agent's configured prompt. Control with [`UIAdapter.manage_system_prompt`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.manage_system_prompt); see each adapter's docs for details.
-   **Dangling tool calls** -- if the client-submitted history ends in a [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) with unresolved [`ToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart)s and no matching `deferred_tool_results`, the tool calls are dropped with a warning so the agent doesn't execute tool calls that the model never emitted. For human-in-the-loop resumption, pass explicit `deferred_tool_results` to the run method -- tool calls resolved by those results are kept.
-   **File URL schemes** -- only `http` and `https` are accepted by default for [`FileUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FileUrl) parts in client-submitted messages. Non-HTTP schemes like `s3://` or `gs://` are dropped, since they cause the provider to fetch the object using your server's IAM role or service account. See [`UIAdapter.allowed_file_url_schemes`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.allowed_file_url_schemes).

For stricter conversation integrity (e.g. ensuring prior assistant turns and tool returns match what the server actually produced), persist the history server-side keyed by the thread/session ID and pass it to the adapter via `message_history` -- caller-supplied history is trusted as coming from server-side persistence and isn't subject to this sanitization.

---

# [Vercel AI](https://pydantic.dev/docs/ai/integrations/ui/vercel-ai/)

# Vercel AI

Pydantic AI natively supports the [Vercel AI Data Stream Protocol](https://ai-sdk.dev/docs/ai-sdk-ui/stream-protocol#data-stream-protocol) to receive agent run input from, and stream events to, a frontend using [AI SDK UI](https://ai-sdk.dev/docs/ai-sdk-ui/overview) hooks like [`useChat`](https://ai-sdk.dev/docs/reference/ai-sdk-ui/use-chat). You can optionally use [AI Elements](https://ai-sdk.dev/elements) for pre-built UI components.

Note

By default, the adapter targets AI SDK v5 for backwards compatibility. To use features introduced in AI SDK v6, set `sdk_version=6` on the adapter.

## Usage

The [`VercelAIAdapter`](/docs/ai/api/ui/vercel_ai/#pydantic_ai.ui.vercel_ai.VercelAIAdapter) class is responsible for transforming agent run input received from the frontend into arguments for [`Agent.run_stream_events()`](/docs/ai/core-concepts/agent#running-agents), running the agent, and then transforming Pydantic AI events into Vercel AI events. The event stream transformation is handled by the [`VercelAIEventStream`](/docs/ai/api/ui/vercel_ai/#pydantic_ai.ui.vercel_ai.VercelAIEventStream) class, but you typically won't use this directly.

If you're using a Starlette-based web framework like FastAPI, you can use the [`VercelAIAdapter.dispatch_request()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.dispatch_request) class method from an endpoint function to directly handle a request and return a streaming response of Vercel AI events. This is demonstrated in the next section.

If you're using a web framework not based on Starlette (e.g. Django or Flask) or need fine-grained control over the input or output, you can create a `VercelAIAdapter` instance and directly use its methods. This is demonstrated in "Advanced Usage" section below.

### Usage with Starlette/FastAPI

Besides the request, [`VercelAIAdapter.dispatch_request()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.dispatch_request) takes the agent, the same optional arguments as [`Agent.run_stream_events()`](/docs/ai/core-concepts/agent#running-agents), and an optional `on_complete` callback function that receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can optionally yield additional Vercel AI events.

dispatch\_request.py

```python
from fastapi import FastAPI
from starlette.requests import Request
from starlette.responses import Response

from pydantic_ai import Agent
from pydantic_ai.ui.vercel_ai import VercelAIAdapter

agent = Agent('openai:gpt-5.2')

app = FastAPI()

@app.post('/chat')
async def chat(request: Request) -> Response:
    return await VercelAIAdapter.dispatch_request(request, agent=agent)
```

### Advanced Usage

If you're using a web framework not based on Starlette (e.g. Django or Flask) or need fine-grained control over the input or output, you can create a `VercelAIAdapter` instance and directly use its methods, which can be chained to accomplish the same thing as the `VercelAIAdapter.dispatch_request()` class method shown above:

1.  The [`VercelAIAdapter.build_run_input()`](/docs/ai/api/ui/vercel_ai/#pydantic_ai.ui.vercel_ai.VercelAIAdapter.build_run_input) class method takes the request body as bytes and returns a Vercel AI [`RequestData`](/docs/ai/api/ui/vercel_ai/#pydantic_ai.ui.vercel_ai.request_types.RequestData) run input object, which you can then pass to the [`VercelAIAdapter()`](/docs/ai/api/ui/vercel_ai/#pydantic_ai.ui.vercel_ai.VercelAIAdapter) constructor along with the agent.
    -   You can also use the [`VercelAIAdapter.from_request()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.from_request) class method to build an adapter directly from a Starlette/FastAPI request.
2.  The [`VercelAIAdapter.run_stream()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.run_stream) method runs the agent and returns a stream of Vercel AI events. It supports the same optional arguments as [`Agent.run_stream_events()`](/docs/ai/core-concepts/agent#running-agents) and an optional `on_complete` callback function that receives the completed [`AgentRunResult`](/docs/ai/api/pydantic-ai/run/#pydantic_ai.run.AgentRunResult) and can optionally yield additional Vercel AI events.
    -   You can also use [`VercelAIAdapter.run_stream_native()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.run_stream_native) to run the agent and return a stream of Pydantic AI events instead, which can then be transformed into Vercel AI events using [`VercelAIAdapter.transform_stream()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.transform_stream).
3.  The [`VercelAIAdapter.encode_stream()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.encode_stream) method encodes the stream of Vercel AI events as SSE (HTTP Server-Sent Events) strings, which you can then return as a streaming response.
    -   You can also use [`VercelAIAdapter.streaming_response()`](/docs/ai/api/ui/base/#pydantic_ai.ui.UIAdapter.streaming_response) to generate a Starlette/FastAPI streaming response directly from the Vercel AI event stream returned by `run_stream()`.

Note

This example uses FastAPI, but can be modified to work with any web framework.

run\_stream.py

```python
import json
from http import HTTPStatus

from fastapi import FastAPI
from fastapi.requests import Request
from fastapi.responses import Response, StreamingResponse
from pydantic import ValidationError

from pydantic_ai import Agent
from pydantic_ai.ui import SSE_CONTENT_TYPE
from pydantic_ai.ui.vercel_ai import VercelAIAdapter

agent = Agent('openai:gpt-5.2')

app = FastAPI()


@app.post('/chat')
async def chat(request: Request) -> Response:
    accept = request.headers.get('accept', SSE_CONTENT_TYPE)
    try:
        run_input = VercelAIAdapter.build_run_input(await request.body())
    except ValidationError as e:
        return Response(
            content=json.dumps(e.json()),
            media_type='application/json',
            status_code=HTTPStatus.UNPROCESSABLE_ENTITY,
        )

    adapter = VercelAIAdapter(agent=agent, run_input=run_input, accept=accept)
    event_stream = adapter.run_stream()

    sse_event_stream = adapter.encode_stream(event_stream)
    return StreamingResponse(sse_event_stream, media_type=accept)
```

### Data Chunks

Pydantic AI tools can send [Vercel AI data stream chunks](https://ai-sdk.dev/docs/ai-sdk-ui/stream-protocol#data-stream-protocol) by returning a [`ToolReturn`](/docs/ai/tools-toolsets/tools-advanced#advanced-tool-returns) object with a data-carrying chunk (or a list of chunks) as `metadata`. The supported chunk types are [`DataChunk`](/docs/ai/api/ui/vercel_ai/#pydantic_ai.ui.vercel_ai.response_types.DataChunk), [`SourceUrlChunk`](/docs/ai/api/ui/vercel_ai/#pydantic_ai.ui.vercel_ai.response_types.SourceUrlChunk), [`SourceDocumentChunk`](/docs/ai/api/ui/vercel_ai/#pydantic_ai.ui.vercel_ai.response_types.SourceDocumentChunk), and [`FileChunk`](/docs/ai/api/ui/vercel_ai/#pydantic_ai.ui.vercel_ai.response_types.FileChunk). This is useful for attaching structured data to the frontend alongside the tool result, such as source URLs or custom data payloads.

vercel\_ai\_tool\_chunks.py

```python
from pydantic_ai import Agent, ToolReturn
from pydantic_ai.ui.vercel_ai.response_types import DataChunk, SourceUrlChunk

agent = Agent('openai:gpt-5.2')


@agent.tool_plain
async def search_docs(query: str) -> ToolReturn:
    return ToolReturn(
        return_value=f'Found 2 results for "{query}"',
        metadata=[
            SourceUrlChunk(
                source_id='doc-1',
                url='https://example.com/docs/intro',
                title='Introduction',
            ),
            DataChunk(
                type='data-search-results',
                data={'query': query, 'count': 2},
            ),
        ],
    )
```

Note

Protocol-control chunks such as `StartChunk`, `FinishChunk`, `StartStepChunk`, or `FinishStepChunk` are automatically filtered out -- only the four data-carrying chunk types listed above are forwarded to the stream and preserved in `dump_messages`.

## Trust model

Vercel AI's request `messages` array is fully client-controlled, and the protocol round-trips approval responses and tool results through the message history. The [`VercelAIAdapter`](/docs/ai/api/ui/vercel_ai/#pydantic_ai.ui.vercel_ai.VercelAIAdapter) applies defaults to strip untrusted parts before the agent runs -- see [Trust model for client-submitted messages](/docs/ai/integrations/ui/overview#trust-model-for-client-submitted-messages) in the UI adapter overview.

## Tool Approval

Note

Tool approval requires AI SDK UI v6 or later on the frontend.

Pydantic AI supports human-in-the-loop tool approval workflows with AI SDK UI, allowing users to approve or deny tool executions before they run. See the [deferred tool calls documentation](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) for details on setting up tools that require approval.

To enable tool approval streaming, pass `sdk_version=6` to `dispatch_request`:

```py
@app.post('/chat')
async def chat(request: Request) -> Response:
    return await VercelAIAdapter.dispatch_request(request, agent=agent, sdk_version=6)
```

When `sdk_version=6`, the adapter will:

1.  Emit `tool-approval-request` chunks when tools with `requires_approval=True` are called
2.  Automatically extract approval responses from follow-up requests
3.  Emit `tool-output-denied` chunks for rejected tools

On the frontend, AI SDK UI's [`useChat`](https://ai-sdk.dev/docs/reference/ai-sdk-ui/use-chat) hook handles the approval flow. You can use the [`Confirmation`](https://ai-sdk.dev/elements/components/confirmation) component from AI Elements for a pre-built approval UI, or build your own using the hook's `addToolApprovalResponse` function.

Tool approval responses are trusted from the request by design, matching the protocol's round-trip through `useChat`'s `addToolApprovalResponse` and the reference Next.js backend. If your application needs the approval decision tied to server-side state rather than the request, intercept [`DeferredToolRequests`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolRequests), persist the approval IDs server-side, and pass explicit `deferred_tool_results` when resuming.

## System prompts and instructions

Pydantic AI supports two ways to provide guidance to the model: [`system_prompt`](/docs/ai/core-concepts/agent#system-prompts) (stored in the message history as [`SystemPromptPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.SystemPromptPart)s) and [`instructions`](/docs/ai/core-concepts/agent#instructions) (injected fresh on every request, never persisted). When you control the server side, `instructions` is the recommended default.

The rest of this section only matters if you use `system_prompt`. If you only use `instructions`, there's nothing to configure -- they're always applied regardless of the frontend message history.

For `system_prompt`, you choose who owns it with the `manage_system_prompt` parameter on [`VercelAIAdapter`](/docs/ai/api/ui/vercel_ai/#pydantic_ai.ui.vercel_ai.VercelAIAdapter):

-   `'server'` (default): the agent's configured `system_prompt` is authoritative. Any system message sent by the frontend is stripped with a warning (a malicious client could otherwise inject arbitrary instructions via crafted API requests), and the agent's own system prompt is reinjected at the head of the first request via the [`ReinjectSystemPrompt`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ReinjectSystemPrompt) capability.
-   `'client'`: the frontend owns the system prompt. Frontend system messages are preserved as-is, and the agent's configured `system_prompt` is not injected -- the caller is fully responsible for sending it on every turn if desired. To opt into fallback-to-configured behavior, add the [`ReinjectSystemPrompt`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ReinjectSystemPrompt) capability to your agent.

vercel\_ai\_client\_managed\_system\_prompt.py

```python
from fastapi import FastAPI
from starlette.requests import Request
from starlette.responses import Response

from pydantic_ai import Agent
from pydantic_ai.ui.vercel_ai import VercelAIAdapter

agent = Agent('openai:gpt-5.2')

app = FastAPI()


@app.post('/chat')
async def chat(request: Request) -> Response:
    return await VercelAIAdapter.dispatch_request(
        request, agent=agent, manage_system_prompt='client'
    )
```

---

# [Client](https://pydantic.dev/docs/ai/mcp/client/)

# Client

Pydantic AI can act as an [MCP client](https://modelcontextprotocol.io/quickstart/client), connecting to MCP servers to use their tools.

## Install

You need to either install [`pydantic-ai`](/docs/ai/overview/install), or [`pydantic-ai-slim`](/docs/ai/overview/install#slim-install) with the `mcp` optional group:

-   [pip](#tab-panel-50)
-   [uv](#tab-panel-51)

Terminal

```bash
pip install "pydantic-ai-slim[mcp]"
```

Terminal

```bash
uv add "pydantic-ai-slim[mcp]"
```

## Usage

Pydantic AI comes with three ways to connect to MCP servers:

-   [`MCPServerStreamableHTTP`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerStreamableHTTP) which connects to an MCP server using the [Streamable HTTP](https://modelcontextprotocol.io/introduction#streamable-http) transport
-   [`MCPServerSSE`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerSSE) which connects to an MCP server using the [HTTP SSE](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#http-with-sse) transport
-   [`MCPServerStdio`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerStdio) which runs the server as a subprocess and connects to it using the [stdio](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#stdio) transport

Examples of all three are shown below.

Each MCP server instance is a [toolset](/docs/ai/tools-toolsets/toolsets) and can be registered with an [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) using the `toolsets` argument.

You can use the [`async with agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__aenter__) context manager to open and close connections to all registered servers (and in the case of stdio servers, start and stop the subprocesses) around the context where they'll be used in agent runs. You can also use [`async with server`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.__aenter__) to manage the connection or subprocess of a specific server, for example if you'd like to use it with multiple agents. If you don't explicitly enter one of these context managers to set up the server, this will be done automatically when it's needed (e.g. to list the available tools or call a specific tool), but it's more efficient to do so around the entire context where you expect the servers to be used.

### Streamable HTTP Client

[`MCPServerStreamableHTTP`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerStreamableHTTP) connects over HTTP using the [Streamable HTTP](https://modelcontextprotocol.io/introduction#streamable-http) transport to a server.

Note

[`MCPServerStreamableHTTP`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerStreamableHTTP) requires an MCP server to be running and accepting HTTP connections before running the agent. Running the server is not managed by Pydantic AI.

Before creating the Streamable HTTP client, we need to run a server that supports the Streamable HTTP transport.

streamable\_http\_server.py

```python
from mcp.server.fastmcp import FastMCP

app = FastMCP()

@app.tool()
def add(a: int, b: int) -> int:
    return a + b

if __name__ == '__main__':
    app.run(transport='streamable-http')
```

Then we can create the client:

mcp\_streamable\_http\_client.py

```python
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerStreamableHTTP

server = MCPServerStreamableHTTP('http://localhost:8000/mcp')  # (1)
agent = Agent('openai:gpt-5.2', toolsets=[server])  # (2)

async def main():
  result = await agent.run('What is 7 plus 5?')
  print(result.output)
  #> The answer is 12.
```

Define the MCP server with the URL used to connect.

Create an agent with the MCP server attached.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

**What's happening here?**

-   The model receives the prompt "What is 7 plus 5?"
-   The model decides "Oh, I've got this `add` tool, that will be a good way to answer this question"
-   The model returns a tool call
-   Pydantic AI sends the tool call to the MCP server using the Streamable HTTP transport
-   The model is called again with the return value of running the `add` tool (12)
-   The model returns the final answer

You can visualise this clearly, and even see the tool call, by adding three lines of code to instrument the example with [logfire](https://logfire.pydantic.dev/docs):

mcp\_sse\_client\_logfire.py

```python
import logfire

logfire.configure()
logfire.instrument_pydantic_ai()
```

### SSE Client

[`MCPServerSSE`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerSSE) connects over HTTP using the [HTTP + Server Sent Events transport](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#http-with-sse) to a server.

Note

The SSE transport in MCP is deprecated, you should use Streamable HTTP instead.

Before creating the SSE client, we need to run a server that supports the SSE transport.

sse\_server.py

```python
from mcp.server.fastmcp import FastMCP

app = FastMCP()

@app.tool()
def add(a: int, b: int) -> int:
    return a + b

if __name__ == '__main__':
    app.run(transport='sse')
```

Then we can create the client:

mcp\_sse\_client.py

```python
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerSSE

server = MCPServerSSE('http://localhost:3001/sse')  # (1)
agent = Agent('openai:gpt-5.2', toolsets=[server])  # (2)


async def main():
  result = await agent.run('What is 7 plus 5?')
  print(result.output)
  #> The answer is 12.
```

Define the MCP server with the URL used to connect.

Create an agent with the MCP server attached.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

### MCP "stdio" Server

MCP also offers [stdio transport](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#stdio) where the server is run as a subprocess and communicates with the client over `stdin` and `stdout`. In this case, you'd use the [`MCPServerStdio`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerStdio) class.

In this example we use a simple MCP server that provides weather tools.

mcp\_stdio\_client.py

```python
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerStdio

server = MCPServerStdio('python', args=['mcp_server.py'], timeout=10)
agent = Agent('openai:gpt-5.2', toolsets=[server])


async def main():
    result = await agent.run('What is the weather in Paris?')
    print(result.output)
    #> The weather in Paris is sunny and 26 degrees Celsius.
```

## Loading MCP Servers from Configuration

Instead of creating MCP server instances individually in code, you can load multiple servers from a JSON configuration file using [`load_mcp_servers()`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.load_mcp_servers).

This is particularly useful when you need to manage multiple MCP servers or want to configure servers externally without modifying code.

### Configuration Format

The configuration file should be a JSON file with an `mcpServers` object containing server definitions. Each server is identified by a unique key and contains the configuration for that server type:

mcp\_config.json

```json
{
  "mcpServers": {
    "python-runner": {
        "command": "uv",
        "args": ["run", "mcp-run-python", "stdio"]
    },
    "weather": {
      "command": "python",
      "args": ["mcp_server.py"]
    },
    "weather-api": {
      "url": "http://localhost:3001/sse"
    },
    "calculator": {
      "url": "http://localhost:8000/mcp"
    }
  }
}
```

Note

The MCP server is only inferred to be an SSE server because of the `/sse` suffix. Any other server with the "url" field will be inferred to be a Streamable HTTP server.

We made this decision given that the SSE transport is deprecated.

### Environment Variables

The configuration file supports environment variable expansion using the `${VAR}` and `${VAR:-default}` syntax, [like Claude Code](https://code.claude.com/docs/en/mcp#environment-variable-expansion-in-mcp-json). This is useful for keeping sensitive information like API keys or host names out of your configuration files:

mcp\_config\_with\_env.json

```json
{
  "mcpServers": {
    "python-runner": {
      "command": "${PYTHON_CMD:-python3}",
      "args": ["run", "${MCP_MODULE}", "stdio"],
      "env": {
        "API_KEY": "${MY_API_KEY}"
      }
    },
    "weather-api": {
      "url": "https://${SERVER_HOST:-localhost}:${SERVER_PORT:-8080}/sse"
    }
  }
}
```

When loading this configuration with [`load_mcp_servers()`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.load_mcp_servers):

-   `${VAR}` references will be replaced with the corresponding environment variable values.
-   `${VAR:-default}` references will use the environment variable value if set, otherwise the default value.

Caution

If a referenced environment variable using `${VAR}` syntax is not defined, a `ValueError` will be raised. Use the `${VAR:-default}` syntax to provide a fallback value.

### Usage

mcp\_config\_loader.py

```python
from pydantic_ai import Agent
from pydantic_ai.mcp import load_mcp_servers

# Load all servers from configuration file
servers = load_mcp_servers('mcp_config.json')

# Create agent with all loaded servers
agent = Agent('openai:gpt-5.2', toolsets=servers)

async def main():
    result = await agent.run('What is 7 plus 5?')
    print(result.output)
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

## Tool call customization

The MCP servers provide the ability to set a `process_tool_call` which allows the customization of tool call requests and their responses.

A common use case for this is to inject metadata to the requests which the server call needs:

mcp\_process\_tool\_call.py

```python
from typing import Any

from pydantic_ai import Agent, RunContext
from pydantic_ai.mcp import CallToolFunc, MCPServerStdio, ToolResult
from pydantic_ai.models.test import TestModel


async def process_tool_call(
    ctx: RunContext[int],
    call_tool: CallToolFunc,
    name: str,
    tool_args: dict[str, Any],
) -> ToolResult:
    """A tool call processor that passes along the deps."""
    return await call_tool(name, tool_args, {'deps': ctx.deps})


server = MCPServerStdio('python', args=['mcp_server.py'], process_tool_call=process_tool_call)
agent = Agent(
    model=TestModel(call_tools=['echo_deps']),
    deps_type=int,
    toolsets=[server]
)


async def main():
    result = await agent.run('Echo with deps set to 42', deps=42)
    print(result.output)
    #> {"echo_deps":{"echo":"This is an echo message","deps":42}}
```

How to access the metadata is MCP server SDK specific. For example with the [MCP Python SDK](https://github.com/modelcontextprotocol/python-sdk), it is accessible via the [`ctx: Context`](https://github.com/modelcontextprotocol/python-sdk#context) argument that can be included on tool call handlers:

mcp\_server.py

```python
from typing import Any

from mcp.server.fastmcp import Context, FastMCP
from mcp.server.session import ServerSession

mcp = FastMCP('Pydantic AI MCP Server')
log_level = 'unset'


@mcp.tool()
async def echo_deps(ctx: Context[ServerSession, None]) -> dict[str, Any]:
    """Echo the run context.

    Args:
        ctx: Context object containing request and session information.

    Returns:
        Dictionary with an echo message and the deps.
    """
    await ctx.info('This is an info message')

    deps: Any = getattr(ctx.request_context.meta, 'deps')
    return {'echo': 'This is an echo message', 'deps': deps}

if __name__ == '__main__':
    mcp.run()
```

## Using Tool Prefixes to Avoid Naming Conflicts

When connecting to multiple MCP servers that might provide tools with the same name, you can use the `tool_prefix` parameter to avoid naming conflicts. This parameter adds a prefix to all tool names from a specific server.

This allows you to use multiple servers that might have overlapping tool names without conflicts:

mcp\_tool\_prefix\_http\_client.py

```python
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerSSE

# Create two servers with different prefixes
weather_server = MCPServerSSE(
    'http://localhost:3001/sse',
    tool_prefix='weather'  # Tools will be prefixed with 'weather_'
)

calculator_server = MCPServerSSE(
    'http://localhost:3002/sse',
    tool_prefix='calc'  # Tools will be prefixed with 'calc_'
)

# Both servers might have a tool named 'get_data', but they'll be exposed as:
# - 'weather_get_data'
# - 'calc_get_data'
agent = Agent('openai:gpt-5.2', toolsets=[weather_server, calculator_server])
```

## Server Instructions

MCP servers can provide instructions during initialization that give context about how to best interact with the server's tools. These are accessible via the [`instructions`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.instructions) property after the connection is established, and can be automatically injected into the agent's instructions by setting `include_instructions=True` when creating the server.

mcp\_server\_include\_instructions.py

```python
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerStreamableHTTP

server = MCPServerStreamableHTTP('http://localhost:8000/mcp', include_instructions=True)
agent = Agent('openai:gpt-5.2', toolsets=[server])
```

## Tool metadata

MCP tools can include metadata that provides additional information about the tool's characteristics, which can be useful when [filtering tools](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FilteredToolset). The `meta`, `annotations`, and `output_schema` fields can be found on the `metadata` dict on the [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) object that's passed to filter functions.

## Resources

MCP servers can provide [resources](https://modelcontextprotocol.io/docs/concepts/resources) - files, data, or content that can be accessed by the client. Resources in MCP are application-driven, with host applications determining how to incorporate context manually, based on their needs. This means they will _not_ be exposed to the LLM automatically (unless a tool returns a `ResourceLink` or `EmbeddedResource`).

Pydantic AI provides methods to discover and read resources from MCP servers:

-   [`list_resources()`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.list_resources) - List all available resources on the server
-   [`list_resource_templates()`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.list_resource_templates) - List resource templates with parameter placeholders
-   [`read_resource(uri)`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.read_resource) - Read the contents of a specific resource by URI

Resources are automatically converted: text content is returned as `str`, and binary content is returned as [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent).

Before consuming resources, we need to run a server that exposes some:

mcp\_resource\_server.py

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP('Pydantic AI MCP Server')
log_level = 'unset'


@mcp.resource('resource://user_name.txt', mime_type='text/plain')
async def user_name_resource() -> str:
    return 'Alice'


if __name__ == '__main__':
    mcp.run()
```

Then we can create the client:

mcp\_resources.py

```python
import asyncio

from pydantic_ai.mcp import MCPServerStdio


async def main():
    server = MCPServerStdio('python', args=['-m', 'mcp_resource_server'])

    async with server:
        # List all available resources
        resources = await server.list_resources()
        for resource in resources:
            print(f' - {resource.name}: {resource.uri} ({resource.mime_type})')
            #>  - user_name_resource: resource://user_name.txt (text/plain)

        # Read a text resource
        user_name = await server.read_resource('resource://user_name.txt')
        print(f'Text content: {user_name}')
        #> Text content: Alice


if __name__ == '__main__':
    asyncio.run(main())
```

_(This example is complete, it can be run "as is")_

## Custom TLS / SSL configuration

In some environments you need to tweak how HTTPS connections are established - for example to trust an internal Certificate Authority, present a client certificate for **mTLS**, or (during local development only!) disable certificate verification altogether. All HTTP-based MCP client classes ([`MCPServerStreamableHTTP`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerStreamableHTTP) and [`MCPServerSSE`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerSSE)) expose an `http_client` parameter that lets you pass your own pre-configured [`httpx.AsyncClient`](https://www.python-httpx.org/async/).

mcp\_custom\_tls\_client.py

```python
import ssl

import httpx

from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerSSE

# Trust an internal / self-signed CA
ssl_ctx = ssl.create_default_context(cafile='/etc/ssl/private/my_company_ca.pem')

# OPTIONAL: if the server requires **mutual TLS** load your client certificate
ssl_ctx.load_cert_chain(certfile='/etc/ssl/certs/client.crt', keyfile='/etc/ssl/private/client.key',)

http_client = httpx.AsyncClient(
  verify=ssl_ctx,
  timeout=httpx.Timeout(10.0),
)

server = MCPServerSSE(
  'http://localhost:3001/sse',
  http_client=http_client,  # (1)
)
agent = Agent('openai:gpt-5.2', toolsets=[server])

async def main():
  result = await agent.run('How many days between 2000-01-01 and 2025-03-18?')
  print(result.output)
  #> There are 9,208 days between January 1, 2000, and March 18, 2025.
```

When you supply `http_client`, Pydantic AI reuses this client for every request. Anything supported by **httpx** (`verify`, `cert`, custom proxies, timeouts, etc.) therefore applies to all MCP traffic.

## Client Identification

When connecting to an MCP server, you can optionally specify an [Implementation](https://modelcontextprotocol.io/specification/2025-11-25/schema#implementation) object as client information that will be sent to the server during initialization. This is useful for:

-   Identifying your application in server logs
-   Allowing servers to provide custom behavior based on the client
-   Debugging and monitoring MCP connections
-   Version-specific feature negotiation

All MCP client classes ([`MCPServerStdio`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerStdio), [`MCPServerStreamableHTTP`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerStreamableHTTP), and [`MCPServerSSE`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServerSSE)) support the `client_info` parameter:

mcp\_client\_with\_name.py

```python
from mcp import types as mcp_types

from pydantic_ai.mcp import MCPServerSSE

server = MCPServerSSE(
    'http://localhost:3001/sse',
    client_info=mcp_types.Implementation(
        name='MyApplication',
        version='2.1.0',
    ),
)
```

## MCP Sampling

What is MCP Sampling?

In MCP [sampling](https://modelcontextprotocol.io/docs/concepts/sampling) is a system by which an MCP server can make LLM calls via the MCP client - effectively proxying requests to an LLM via the client over whatever transport is being used.

Sampling is extremely useful when MCP servers need to use Gen AI but you don't want to provision them each with their own LLM credentials or when a public MCP server would like the connecting client to pay for LLM calls.

Confusingly it has nothing to do with the concept of "sampling" in observability, or frankly the concept of "sampling" in any other domain.

Sampling Diagram

Here's a mermaid diagram that may or may not make the data flow clearer:

```mermaid
sequenceDiagram
    participant LLM
    participant MCP_Client as MCP client
    participant MCP_Server as MCP server

    MCP_Client->>LLM: LLM call
    LLM->>MCP_Client: LLM tool call response

    MCP_Client->>MCP_Server: tool call
    MCP_Server->>MCP_Client: sampling "create message"

    MCP_Client->>LLM: LLM call
    LLM->>MCP_Client: LLM text response

    MCP_Client->>MCP_Server: sampling response
    MCP_Server->>MCP_Client: tool call response
```

Pydantic AI supports sampling as both a client and server. See the [server](/docs/ai/mcp/server#mcp-sampling) documentation for details on how to use sampling within a server.

Sampling is automatically supported by Pydantic AI agents when they act as a client.

To be able to use sampling, an MCP server instance needs to have a [`sampling_model`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.sampling_model) set. This can be done either directly on the server using the constructor keyword argument or the property, or by using [`agent.set_mcp_sampling_model()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.set_mcp_sampling_model) to set the agent's model or one specified as an argument as the sampling model on all MCP servers registered with that agent.

Let's say we have an MCP server that wants to use sampling (in this case to generate an SVG as per the tool arguments).

Sampling MCP Server

generate\_svg.py

```python
import re
from pathlib import Path

from mcp import SamplingMessage
from mcp.server.fastmcp import Context, FastMCP
from mcp.types import TextContent

app = FastMCP()


@app.tool()
async def image_generator(ctx: Context, subject: str, style: str) -> str:
    prompt = f'{subject=} {style=}'
    # `ctx.session.create_message` is the sampling call
    result = await ctx.session.create_message(
        [SamplingMessage(role='user', content=TextContent(type='text', text=prompt))],
        max_tokens=1_024,
        system_prompt='Generate an SVG image as per the user input',
    )
    assert isinstance(result.content, TextContent)

    path = Path(f'{subject}_{style}.svg')
    # remove triple backticks if the svg was returned within markdown
    if m := re.search(r'^```\w*$(.+?)```$', result.content.text, re.S | re.M):
        path.write_text(m.group(1), encoding='utf-8')
    else:
        path.write_text(result.content.text, encoding='utf-8')
    return f'See {path}'


if __name__ == '__main__':
    # run the server via stdio
    app.run()
```

Using this server with an `Agent` will automatically allow sampling:

sampling\_mcp\_client.py

```python
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerStdio

server = MCPServerStdio('python', args=['generate_svg.py'])
agent = Agent('openai:gpt-5.2', toolsets=[server])


async def main():
    agent.set_mcp_sampling_model()
    result = await agent.run('Create an image of a robot in a punk style.')
    print(result.output)
    #> Image file written to robot_punk.svg.
```

_(This example is complete, it can be run "as is")_

You can disallow sampling by setting [`allow_sampling=False`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.allow_sampling) when creating the server reference, e.g.:

sampling\_disallowed.py

```python
from pydantic_ai.mcp import MCPServerStdio

server = MCPServerStdio(
    'python',
    args=['generate_svg.py'],
    allow_sampling=False,
)
```

## Elicitation

In MCP, [elicitation](https://modelcontextprotocol.io/docs/concepts/elicitation) allows a server to request for [structured input](https://modelcontextprotocol.io/specification/2025-06-18/client/elicitation#supported-schema-types) from the client for missing or additional context during a session.

Elicitation let models essentially say "Hold on - I need to know X before i can continue" rather than requiring everything upfront or taking a shot in the dark.

### How Elicitation works

Elicitation introduces a new protocol message type called [`ElicitRequest`](https://modelcontextprotocol.io/specification/2025-06-18/schema#elicitrequest), which is sent from the server to the client when it needs additional information. The client can then respond with an [`ElicitResult`](https://modelcontextprotocol.io/specification/2025-06-18/schema#elicitresult) or an `ErrorData` message.

Here's a typical interaction:

-   User makes a request to the MCP server (e.g. "Book a table at that Italian place")
-   The server identifies that it needs more information (e.g. "Which Italian place?", "What date and time?")
-   The server sends an `ElicitRequest` to the client asking for the missing information.
-   The client receives the request, presents it to the user (e.g. via a terminal prompt, GUI dialog, or web interface).
-   User provides the requested information, `decline` or `cancel` the request.
-   The client sends an `ElicitResult` back to the server with the user's response.
-   With the structured data, the server can continue processing the original request.

This allows for a more interactive and user-friendly experience, especially for multi-staged workflows. Instead of requiring all information upfront, the server can ask for it as needed, making the interaction feel more natural.

### Setting up Elicitation

To enable elicitation, provide an [`elicitation_callback`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer.elicitation_callback) function when creating your MCP server instance:

restaurant\_server.py

```python
from mcp.server.fastmcp import Context, FastMCP
from pydantic import BaseModel, Field

mcp = FastMCP(name='Restaurant Booking')


class BookingDetails(BaseModel):
    """Schema for restaurant booking information."""

    restaurant: str = Field(description='Choose a restaurant')
    party_size: int = Field(description='Number of people', ge=1, le=8)
    date: str = Field(description='Reservation date (DD-MM-YYYY)')


@mcp.tool()
async def book_table(ctx: Context) -> str:
    """Book a restaurant table with user input."""
    # Ask user for booking details using Pydantic schema
    result = await ctx.elicit(message='Please provide your booking details:', schema=BookingDetails)

    if result.action == 'accept' and result.data:
        booking = result.data
        return f'✅ Booked table for {booking.party_size} at {booking.restaurant} on {booking.date}'
    elif result.action == 'decline':
        return 'No problem! Maybe another time.'
    else:  # cancel
        return 'Booking cancelled.'


if __name__ == '__main__':
    mcp.run(transport='stdio')
```

This server demonstrates elicitation by requesting structured booking details from the client when the `book_table` tool is called. Here's how to create a client that handles these elicitation requests:

client\_example.py

```python
import asyncio
from typing import Any

from mcp.client.session import ClientSession
from mcp.shared.context import RequestContext
from mcp.types import ElicitRequestParams, ElicitResult

from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerStdio


async def handle_elicitation(
    context: RequestContext[ClientSession, Any, Any],
    params: ElicitRequestParams,
) -> ElicitResult:
    """Handle elicitation requests from MCP server."""
    print(f'\n{params.message}')

    if not params.requestedSchema:
        response = input('Response: ')
        return ElicitResult(action='accept', content={'response': response})

    # Collect data for each field
    properties = params.requestedSchema['properties']
    data = {}

    for field, info in properties.items():
        description = info.get('description', field)

        value = input(f'{description}: ')

        # Convert to proper type based on JSON schema
        if info.get('type') == 'integer':
            data[field] = int(value)
        else:
            data[field] = value

    # Confirm
    confirm = input('\nConfirm booking? (y/n/c): ').lower()

    if confirm == 'y':
        print('Booking details:', data)
        return ElicitResult(action='accept', content=data)
    elif confirm == 'n':
        return ElicitResult(action='decline')
    else:
        return ElicitResult(action='cancel')


# Set up MCP server connection
restaurant_server = MCPServerStdio(
    'python', args=['restaurant_server.py'], elicitation_callback=handle_elicitation
)

# Create agent
agent = Agent('openai:gpt-5.2', toolsets=[restaurant_server])


async def main():
    """Run the agent to book a restaurant table."""
    result = await agent.run('Book me a table')
    print(f'\nResult: {result.output}')


if __name__ == '__main__':
    asyncio.run(main())
```

### Supported Schema Types

MCP elicitation supports string, number, boolean, and enum types with flat object structures only. These limitations ensure reliable cross-client compatibility. See [supported schema types](https://modelcontextprotocol.io/specification/2025-06-18/client/elicitation#supported-schema-types) for details.

### Security

MCP Elicitation requires careful handling - servers must not request sensitive information, and clients must implement user approval controls with clear explanations. See [security considerations](https://modelcontextprotocol.io/specification/2025-06-18/client/elicitation#security-considerations) for details.

---

# [FastMCP Client](https://pydantic.dev/docs/ai/mcp/fastmcp-client/)

# FastMCP Client

[FastMCP](https://gofastmcp.com/) is a higher-level MCP framework that bills itself as "The fast, Pythonic way to build MCP servers and clients." It supports additional capabilities on top of the MCP specification like [Tool Transformation](https://gofastmcp.com/patterns/tool-transformation), [OAuth](https://gofastmcp.com/clients/auth/oauth), and more.

As an alternative to Pydantic AI's standard [`MCPServer` MCP client](/docs/ai/mcp/client) built on the [MCP SDK](https://github.com/modelcontextprotocol/python-sdk), you can use the [`FastMCPToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.fastmcp.FastMCPToolset) [toolset](/docs/ai/tools-toolsets/toolsets) that leverages the [FastMCP Client](https://gofastmcp.com/clients/) to connect to local and remote MCP servers, whether or not they're built using [FastMCP Server](https://gofastmcp.com/servers/).

Note that it does not yet support integration elicitation or sampling, which are supported by the [standard `MCPServer` client](/docs/ai/mcp/client).

## Install

To use the `FastMCPToolset`, you will need to install [`pydantic-ai-slim`](/docs/ai/overview/install#slim-install) with the `fastmcp` optional group:

-   [pip](#tab-panel-52)
-   [uv](#tab-panel-53)

Terminal

```bash
pip install "pydantic-ai-slim[fastmcp]"
```

Terminal

```bash
uv add "pydantic-ai-slim[fastmcp]"
```

## Usage

A `FastMCPToolset` can then be created from:

-   A FastMCP Server: `FastMCPToolset(fastmcp.FastMCP('my_server'))`
-   A FastMCP Client: `FastMCPToolset(fastmcp.Client(...))`
-   A FastMCP Transport: `FastMCPToolset(fastmcp.StdioTransport(command='python', args=['mcp_server.py']))`
-   A Streamable HTTP URL: `FastMCPToolset('http://localhost:8000/mcp')`
-   An HTTP SSE URL: `FastMCPToolset('http://localhost:8000/sse')`
-   A Python Script: `FastMCPToolset('my_server.py')`
-   A Node.js Script: `FastMCPToolset('my_server.js')`
-   A JSON MCP Configuration: `FastMCPToolset({'mcpServers': {'my_server': {'command': 'python', 'args': ['mcp_server.py']}}})`

If you already have a [FastMCP Server](https://gofastmcp.com/servers) in the same codebase as your Pydantic AI agent, you can create a `FastMCPToolset` directly from it and save agent a network round trip:

```python
from fastmcp import FastMCP

from pydantic_ai import Agent
from pydantic_ai.toolsets.fastmcp import FastMCPToolset

fastmcp_server = FastMCP('my_server')
@fastmcp_server.tool()
async def add(a: int, b: int) -> int:
    return a + b

toolset = FastMCPToolset(fastmcp_server)

agent = Agent('openai:gpt-5.2', toolsets=[toolset])

async def main():
    result = await agent.run('What is 7 plus 5?')
    print(result.output)
    #> The answer is 12.
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

Connecting your agent to a Streamable HTTP MCP Server is as simple as:

```python
from pydantic_ai import Agent
from pydantic_ai.toolsets.fastmcp import FastMCPToolset

toolset = FastMCPToolset('http://localhost:8000/mcp')

agent = Agent('openai:gpt-5.2', toolsets=[toolset])
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

You can also create a `FastMCPToolset` from a JSON MCP Configuration:

```python
from pydantic_ai import Agent
from pydantic_ai.toolsets.fastmcp import FastMCPToolset

mcp_config = {
    'mcpServers': {
        'time_mcp_server': {
            'command': 'uvx',
            'args': ['mcp-run-python', 'stdio']
        },
        'weather_server': {
            'command': 'python',
            'args': ['mcp_server.py']
        }
    }
}

toolset = FastMCPToolset(mcp_config)

agent = Agent('openai:gpt-5.2', toolsets=[toolset])
```

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

---

# [Overview](https://pydantic.dev/docs/ai/mcp/overview/)

# Overview

Pydantic AI supports [Model Context Protocol (MCP)](https://modelcontextprotocol.io) in multiple ways:

1.  [Agents](/docs/ai/core-concepts/agent) can connect to MCP servers and use their tools using three different methods:
    1.  Pydantic AI can act as an MCP client and connect directly to local and remote MCP servers. [Learn more](/docs/ai/mcp/client) about [`MCPServer`](/docs/ai/api/pydantic-ai/mcp/#pydantic_ai.mcp.MCPServer).
    2.  Pydantic AI can use the [FastMCP Client](https://gofastmcp.com/clients/client/) to connect to local and remote MCP servers, whether or not they're built using [FastMCP Server](https://gofastmcp.com/servers). [Learn more](/docs/ai/mcp/fastmcp-client) about [`FastMCPToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.fastmcp.FastMCPToolset).
    3.  Some model providers can themselves connect to remote MCP servers using a "built-in tool". [Learn more](/docs/ai/tools-toolsets/builtin-tools#mcp-server-tool) about [`MCPServerTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.MCPServerTool).
2.  Agents can be used within MCP servers. [Learn more](/docs/ai/mcp/server)

## What is MCP?

The Model Context Protocol is a standardized protocol that allow AI applications (including programmatic agents like Pydantic AI, coding agents like [cursor](https://www.cursor.com/), and desktop applications like [Claude Desktop](https://claude.ai/download)) to connect to external tools and services using a common interface.

As with other protocols, the dream of MCP is that a wide range of applications can speak to each other without the need for specific integrations.

There is a great list of MCP servers at [github.com/modelcontextprotocol/servers](https://github.com/modelcontextprotocol/servers).

Some examples of what this means:

-   Pydantic AI could use a web search service implemented as an MCP server to implement a deep research agent
-   Cursor could connect to the [Pydantic Logfire](https://github.com/pydantic/logfire-mcp) MCP server to search logs, traces and metrics to gain context while fixing a bug
-   Pydantic AI, or any other MCP client could connect to our [Run Python](https://github.com/pydantic/mcp-run-python) MCP server to run arbitrary Python code in a sandboxed environment

---

# [Server](https://pydantic.dev/docs/ai/mcp/server/)

# Server

Pydantic AI models can also be used within MCP Servers.

## MCP Server

Here's a simple example of a [Python MCP server](https://github.com/modelcontextprotocol/python-sdk) using Pydantic AI within a tool call:

mcp\_server.py

```python
from mcp.server.fastmcp import FastMCP

from pydantic_ai import Agent

server = FastMCP('Pydantic AI Server')
server_agent = Agent(
    'anthropic:claude-haiku-4-5', instructions='always reply in rhyme'
)


@server.tool()
async def poet(theme: str) -> str:
    """Poem generator"""
    r = await server_agent.run(f'write a poem about {theme}')
    return r.output


if __name__ == '__main__':
    server.run()
```

## Simple client

This server can be queried with any MCP client. Here is an example using the Python SDK directly:

mcp\_client.py

```py
import asyncio
import os

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def client():
    server_params = StdioServerParameters(
        command='python', args=['mcp_server.py'], env=os.environ
    )
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool('poet', {'theme': 'socks'})
            print(result.content[0].text)
            """
            Oh, socks, those garments soft and sweet,
            That nestle softly 'round our feet,
            From cotton, wool, or blended thread,
            They keep our toes from feeling dread.
            """


if __name__ == '__main__':
    asyncio.run(client())
```

## MCP Sampling

What is MCP Sampling?

See the [MCP client docs](/docs/ai/mcp/client#mcp-sampling) for details of what MCP sampling is, and how you can support it when using Pydantic AI as an MCP client.

When Pydantic AI agents are used within MCP servers, they can use sampling via [`MCPSamplingModel`](/docs/ai/api/models/mcp-sampling/#pydantic_ai.models.mcp_sampling.MCPSamplingModel).

We can extend the above example to use sampling so instead of connecting directly to the LLM, the agent calls back through the MCP client to make LLM calls.

mcp\_server\_sampling.py

```py
from mcp.server.fastmcp import Context, FastMCP

from pydantic_ai import Agent
from pydantic_ai.models.mcp_sampling import MCPSamplingModel

server = FastMCP('Pydantic AI Server with sampling')
server_agent = Agent(instructions='always reply in rhyme')


@server.tool()
async def poet(ctx: Context, theme: str) -> str:
    """Poem generator"""
    r = await server_agent.run(f'write a poem about {theme}', model=MCPSamplingModel(session=ctx.session))
    return r.output


if __name__ == '__main__':
    server.run()  # run the server over stdio
```

The [above](#simple-client) client does not support sampling, so if you tried to use it with this server you'd get an error.

The simplest way to support sampling in an MCP client is to [use](/docs/ai/mcp/client#mcp-sampling) a Pydantic AI agent as the client, but if you wanted to support sampling with the vanilla MCP SDK, you could do so like this:

mcp\_client\_sampling.py

```py
import asyncio
from typing import Any

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from mcp.shared.context import RequestContext
from mcp.types import (
    CreateMessageRequestParams,
    CreateMessageResult,
    ErrorData,
    TextContent,
)


async def sampling_callback(
    context: RequestContext[ClientSession, Any], params: CreateMessageRequestParams
) -> CreateMessageResult | ErrorData:
    print('sampling system prompt:', params.systemPrompt)
    #> sampling system prompt: always reply in rhyme
    print('sampling messages:', params.messages)
    """
    sampling messages:
    [
        SamplingMessage(
            role='user',
            content=TextContent(
                type='text',
                text='write a poem about socks',
                annotations=None,
                meta=None,
            ),
            meta=None,
        )
    ]
    """

    # TODO get the response content by calling an LLM...
    response_content = 'Socks for a fox.'

    return CreateMessageResult(
        role='assistant',
        content=TextContent(type='text', text=response_content),
        model='fictional-llm',
    )


async def client():
    server_params = StdioServerParameters(command='python', args=['mcp_server_sampling.py'])
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write, sampling_callback=sampling_callback) as session:
            await session.initialize()
            result = await session.call_tool('poet', {'theme': 'socks'})
            print(result.content[0].text)
            #> Socks for a fox.


if __name__ == '__main__':
    asyncio.run(client())
```

_(This example is complete, it can be run "as is")_

---

# [Anthropic](https://pydantic.dev/docs/ai/models/anthropic/)

# Anthropic

## Install

To use `AnthropicModel` models, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `anthropic` optional group:

-   [pip](#tab-panel-60)
-   [uv](#tab-panel-61)

Terminal

```bash
pip install "pydantic-ai-slim[anthropic]"
```

Terminal

```bash
uv add "pydantic-ai-slim[anthropic]"
```

## Configuration

To use [Anthropic](https://anthropic.com) through their API, go to [console.anthropic.com/settings/keys](https://console.anthropic.com/settings/keys) to generate an API key.

`AnthropicModelName` contains a list of available Anthropic models.

## Environment variable

Once you have the API key, you can set it as an environment variable:

Terminal

```bash
export ANTHROPIC_API_KEY='your-api-key'
```

You can then use `AnthropicModel` by name:

```python
from pydantic_ai import Agent

agent = Agent('anthropic:claude-sonnet-4-6')
...
```

Or initialise the model directly with just the model name:

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel

model = AnthropicModel('claude-sonnet-4-5')
agent = Agent(model)
...
```

Claude Opus 4.7 migration

Anthropic's [Claude Opus 4.7 migration guide](https://platform.claude.com/docs/en/about-claude/models/migration-guide) recommends removing `temperature`, `top_p`, and `top_k` from Opus 4.7 requests. Pydantic AI drops those keys automatically for `claude-opus-4-7`, including `extra_body` overrides.

The same guide also recommends re-evaluating `max_tokens` and any token-count assumptions when migrating from Opus 4.6, since Opus 4.7 uses updated tokenization. If you rely on `count_tokens()` or `count_tokens_before_request`, verify your thresholds against the new model.

## `provider` argument

You can provide a custom `Provider` via the `provider` argument:

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.providers.anthropic import AnthropicProvider

model = AnthropicModel(
    'claude-sonnet-4-5', provider=AnthropicProvider(api_key='your-api-key')
)
agent = Agent(model)
...
```

## Custom HTTP Client

You can customize the `AnthropicProvider` with a custom `httpx.AsyncClient`:

```python
from httpx import AsyncClient

from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.providers.anthropic import AnthropicProvider

custom_http_client = AsyncClient(timeout=30)
model = AnthropicModel(
    'claude-sonnet-4-5',
    provider=AnthropicProvider(api_key='your-api-key', http_client=custom_http_client),
)
agent = Agent(model)
...
```

## Model settings

You can customize model behavior using [`AnthropicModelSettings`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModelSettings):

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel, AnthropicModelSettings

model = AnthropicModel('claude-sonnet-4-5')
settings = AnthropicModelSettings(
    temperature=0.2,
    service_tier='auto',
)
agent = Agent(model, model_settings=settings)
...
```

### Service tier

Anthropic supports controlling the [service tier](https://docs.anthropic.com/en/docs/build-with-claude/latency-and-throughput) to manage latency and throughput. You can use the unified [`service_tier`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings.service_tier) field or the provider-specific [`anthropic_service_tier`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_service_tier) field. `anthropic_service_tier` takes precedence over the unified field when both are set, and accepts Anthropic's native values (`'auto'` or `'standard_only'`).

The unified field maps as follows for Anthropic:

-   `'auto'`: passed through as `'auto'` (Anthropic's native value -- uses priority capacity when available).
-   `'default'`: maps to `'standard_only'` (forces the standard tier, opting out of priority capacity).
-   `'flex'` and `'priority'` are not part of Anthropic's tier model and are silently ignored.

## Cloud Platform Integrations

You can use Anthropic models through cloud platforms by passing a custom client to [`AnthropicProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.anthropic.AnthropicProvider).

### AWS Bedrock

To use Claude models via [AWS Bedrock](https://aws.amazon.com/bedrock/claude/), follow the [Anthropic documentation](https://docs.anthropic.com/en/api/claude-on-amazon-bedrock) on how to set up an `AsyncAnthropicBedrock` client and then pass it to `AnthropicProvider`:

```python
from anthropic import AsyncAnthropicBedrock

from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.providers.anthropic import AnthropicProvider

bedrock_client = AsyncAnthropicBedrock()  # Uses AWS credentials from environment
provider = AnthropicProvider(anthropic_client=bedrock_client)
model = AnthropicModel('us.anthropic.claude-sonnet-4-5-20250929-v1:0', provider=provider)
agent = Agent(model)
...
```

Bedrock vs BedrockConverseModel

This approach uses Anthropic's SDK with AWS Bedrock credentials. For an alternative using AWS SDK (boto3) directly, see [`BedrockConverseModel`](/docs/ai/models/bedrock).

### Google Vertex AI

To use Claude models via [Google Cloud Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-claude), follow the [Anthropic documentation](https://docs.anthropic.com/en/api/claude-on-vertex-ai) on how to set up an `AsyncAnthropicVertex` client and then pass it to `AnthropicProvider`:

```python
from anthropic import AsyncAnthropicVertex

from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.providers.anthropic import AnthropicProvider

vertex_client = AsyncAnthropicVertex(region='us-east5', project_id='your-project-id')
provider = AnthropicProvider(anthropic_client=vertex_client)
model = AnthropicModel('claude-sonnet-4-5', provider=provider)
agent = Agent(model)
...
```

Vertex vs GoogleModel

This approach uses Anthropic's SDK with Vertex AI credentials. For an alternative using Google's Vertex AI SDK directly, see [`GoogleModel`](/docs/ai/models/google).

### Microsoft Foundry

To use Claude models via [Microsoft Foundry](https://ai.azure.com/), follow the [Anthropic documentation](https://platform.claude.com/docs/en/build-with-claude/claude-in-microsoft-foundry) on how to set up an `AsyncAnthropicFoundry` client and then pass it to `AnthropicProvider`:

```python
from anthropic import AsyncAnthropicFoundry

from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.providers.anthropic import AnthropicProvider

foundry_client = AsyncAnthropicFoundry(
    api_key='your-foundry-api-key',  # Or set ANTHROPIC_FOUNDRY_API_KEY
    resource='your-resource-name',
)
provider = AnthropicProvider(anthropic_client=foundry_client)
model = AnthropicModel('claude-sonnet-4-5', provider=provider)
agent = Agent(model)
...
```

See [Anthropic's Microsoft Foundry documentation](https://platform.claude.com/docs/en/build-with-claude/claude-in-microsoft-foundry) for setup instructions including Entra ID authentication.

## Task Budgets (Beta)

Anthropic's [task budgets](https://platform.claude.com/docs/en/build-with-claude/task-budgets) let you give Claude an advisory token budget for a full agentic loop -- including thinking, tool calls, tool results, and output -- so the model can pace itself and finish gracefully as the budget is consumed. Configure them with [`AnthropicModelSettings.anthropic_task_budget`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_task_budget), which takes an [`AnthropicTaskBudget`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicTaskBudget) payload and maps to `output_config.task_budget`.

Pydantic AI automatically enables Anthropic's required `task-budgets-2026-03-13` beta when this setting is present. Support is currently limited to native Anthropic `claude-opus-4-7` requests, not Bedrock, Vertex, or Microsoft Foundry Anthropic model IDs.

anthropic\_task\_budget.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel, AnthropicModelSettings

model = AnthropicModel('claude-opus-4-7')
settings = AnthropicModelSettings(
    anthropic_task_budget={'type': 'tokens', 'total': 20_000},
)
agent = Agent(model, model_settings=settings)
...
```

Task budgets compose with [`anthropic_effort`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_effort): effort tunes per-step reasoning depth, while task budgets cap total work across the loop. Both fields end up under the same `output_config` object.

Note

Task budgets are advisory, not a hard cap; pair them with [`max_tokens`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings.max_tokens) for an enforced ceiling.

### Carrying budgets across compaction

If you use [`AnthropicCompaction`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicCompaction) for server-side compaction, you can skip this section: the server tracks the countdown itself, so leave `remaining` unset and let `total` self-regulate.

The `remaining` field on `task_budget` is for _client-side_ compaction patterns where you summarize earlier turns yourself between requests, so the server has no memory of how much budget was spent before the rewrite. Pydantic AI does not track `remaining` for you -- accumulate token usage across requests yourself (e.g. from [`RunUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RunUsage) on each run) and pass the updated value on the next request so the countdown continues from where you left off rather than resetting to `total`. Setting `remaining` also invalidates any prompt-cache prefix that contains the budget, so if you want to preserve caching, set `total` once and let the server self-regulate against the running countdown.

Caution

`task_budget.remaining` is mutually exclusive with [`AnthropicCompaction`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicCompaction): Anthropic rejects requests that combine the two because server-side compaction tracks the budget itself. Pydantic AI raises a [`UserError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UserError) before sending the request when this combination is configured. Choose one: `remaining` for client-side budget tracking, or [`AnthropicCompaction`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicCompaction) for server-side compaction.

## Prompt Caching

Anthropic supports [prompt caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching) to reduce costs by caching parts of your prompts. Pydantic AI supports automatic caching, per-block message caching, and explicit cache breakpoints:

### Automatic Caching

The simplest way to enable prompt caching is with [`AnthropicModelSettings.anthropic_cache`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache). This uses Anthropic's [automatic caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#automatic-caching), passing a top-level `cache_control` parameter so the server automatically applies a cache breakpoint to the last cacheable block in each request:

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModelSettings

agent = Agent(
    'anthropic:claude-sonnet-4-6',
    instructions='You are a helpful assistant.',
    model_settings=AnthropicModelSettings(
        anthropic_cache=True,
    ),
)

result1 = agent.run_sync('What is the capital of France?')

result2 = agent.run_sync(
    'What is the capital of Germany?', message_history=result1.all_messages()
)
print(f'Cache write: {result1.usage().cache_write_tokens}')
print(f'Cache read: {result2.usage().cache_read_tokens}')
```

This is ideal for multi-turn conversations where the cache breakpoint should move forward as the conversation grows. You can also specify a custom TTL with `anthropic_cache='1h'`.

Bedrock and Vertex

Bedrock and Vertex [do not yet support automatic caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#automatic-caching). On these platforms, `anthropic_cache` falls back to per-block caching on the last user message, providing the same benefit for multi-turn conversations.

### Per-block Message Caching

As an alternative to `anthropic_cache`, [`AnthropicModelSettings.anthropic_cache_messages`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_messages) adds per-block `cache_control` to the last content block of the final message instead of using Anthropic's top-level automatic caching parameter. Use this with Anthropic-compatible gateways and proxies (such as MiniMax, OpenRouter, or LiteLLM) that accept the Anthropic message format but don't support top-level automatic caching:

```python
from anthropic import AsyncAnthropic

from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel, AnthropicModelSettings
from pydantic_ai.providers.anthropic import AnthropicProvider

client = AsyncAnthropic(
    api_key='your-api-key',
    base_url='https://your-anthropic-compatible-gateway.example.com',
)

model = AnthropicModel(
    'claude-sonnet-4-6',
    provider=AnthropicProvider(anthropic_client=client),
)
agent = Agent(
    model,
    model_settings=AnthropicModelSettings(
        anthropic_cache_messages=True,
    ),
)

result = agent.run_sync('What is the capital of France?')
print(result.output)
```

You can also specify a custom TTL with `anthropic_cache_messages='1h'`. `anthropic_cache_messages` cannot be combined with `anthropic_cache`.

### Explicit Cache Breakpoints

In addition to automatic caching, Pydantic AI provides several ways to place cache breakpoints on specific content:

1.  **Cache User Messages with [`CachePoint`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.CachePoint)**: Insert a `CachePoint` marker in your user messages to cache everything before it
2.  **Cache the Final Message Block**: Set [`AnthropicModelSettings.anthropic_cache_messages`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_messages) to `True` (uses 5m TTL by default) or specify `'5m'` / `'1h'` directly
3.  **Cache System Instructions**: Set [`AnthropicModelSettings.anthropic_cache_instructions`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_instructions) to `True` (uses 5m TTL by default) or specify `'5m'` / `'1h'` directly
4.  **Cache Tool Definitions**: Set [`AnthropicModelSettings.anthropic_cache_tool_definitions`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_tool_definitions) to `True` (uses 5m TTL by default) or specify `'5m'` / `'1h'` directly

#### Example: Comprehensive Caching Strategy

Combine automatic caching with explicit breakpoints for maximum savings. Automatic caching handles the conversation, while explicit breakpoints pin system instructions and tool definitions:

```python
from pydantic_ai import Agent, RunContext
from pydantic_ai.models.anthropic import AnthropicModelSettings

agent = Agent(
    'anthropic:claude-sonnet-4-6',
    instructions='Detailed instructions...',
    model_settings=AnthropicModelSettings(
        anthropic_cache=True,                   # Server auto-caches last block
        anthropic_cache_instructions=True,      # Explicitly cache system instructions
        anthropic_cache_tool_definitions='1h',  # Explicitly cache tool definitions with 1h TTL
    ),
)

@agent.tool
def search_docs(ctx: RunContext, query: str) -> str:
    """Search documentation."""
    return f'Results for {query}'


result = agent.run_sync('Search for Python best practices')
print(result.output)
```

### Smart Instruction Caching

When you use `anthropic_cache_instructions` with both static and dynamic [instructions](/docs/ai/core-concepts/agent#instructions), Pydantic AI automatically places the cache boundary at the optimal point. Static instructions (from `Agent(instructions=...)`) are sorted before dynamic instructions (from `@agent.instructions` functions or [toolsets](/docs/ai/tools-toolsets/toolsets)), and the cache point is placed after the last static instruction block.

This means your stable, static instructions are cached efficiently, while dynamic instructions (which may change between requests) remain outside the cache boundary and don't cause cache invalidation.

```python
from datetime import date

from pydantic_ai import Agent, RunContext
from pydantic_ai.models.anthropic import AnthropicModelSettings

agent = Agent(
  'anthropic:claude-sonnet-4-6',
  deps_type=str,
  instructions='You are a helpful customer service agent. Follow company policy.',  # (1)
  model_settings=AnthropicModelSettings(
      anthropic_cache_instructions=True,  # (2)
  ),
)


@agent.instructions
def dynamic_context(ctx: RunContext[str]) -> str:  # (3)
  return f"Customer name: {ctx.deps}. Today's date: {date.today()}."


result = agent.run_sync('What is your return policy?', deps='Alice')
print(result.output)
```

Static instructions are cached across requests.

Enables smart cache placement at the static/dynamic boundary.

Dynamic instructions change per-request and are not cached.

### Fine-Grained Control with CachePoint

Use manual `CachePoint` markers to control cache locations precisely:

```python
from pydantic_ai import Agent, CachePoint

agent = Agent(
    'anthropic:claude-sonnet-4-6',
    instructions='Instructions...',
)

# Manually control cache points for specific content blocks
result = agent.run_sync([
    'Long context from documentation...',
    CachePoint(),  # Cache everything up to this point
    'First question'
])
print(result.output)
```

### Accessing Cache Usage Statistics

Access cache usage statistics via `result.usage()`:

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModelSettings

agent = Agent(
    'anthropic:claude-sonnet-4-6',
    instructions='Instructions...',
    model_settings=AnthropicModelSettings(
        anthropic_cache=True,
    ),
)

result = agent.run_sync('Your question')
usage = result.usage()
print(f'Cache write tokens: {usage.cache_write_tokens}')
print(f'Cache read tokens: {usage.cache_read_tokens}')
```

### Cache Point Limits

Anthropic enforces a maximum of 4 cache points per request. Pydantic AI automatically manages this limit to ensure your requests always comply without errors.

#### How Cache Points Are Allocated

Cache points can come from several sources:

1.  **Automatic caching**: Via `anthropic_cache` (the server applies 1 cache point to the last cacheable block)
2.  **Final message block**: Via `anthropic_cache_messages` setting (adds cache point to last message content block)
3.  **System Prompt**: Via `anthropic_cache_instructions` setting (adds cache point to last system prompt block)
4.  **Tool Definitions**: Via `anthropic_cache_tool_definitions` setting (adds cache point to last tool definition)
5.  **Messages**: Via `CachePoint` markers (adds cache points to message content)

Each setting uses **at most 1 cache point**, but you can combine them -- except `anthropic_cache` and `anthropic_cache_messages`, which are mutually exclusive. If the total exceeds 4, Pydantic AI automatically trims excess cache points from older messages.

#### Example: Combining Automatic and Explicit Caching

Define an agent with automatic caching plus explicit breakpoints:

```python
from pydantic_ai import Agent, CachePoint
from pydantic_ai.models.anthropic import AnthropicModelSettings

agent = Agent(
    'anthropic:claude-sonnet-4-6',
    instructions='Detailed instructions...',
    model_settings=AnthropicModelSettings(
        anthropic_cache=True,                   # 1 cache point (server-applied)
        anthropic_cache_instructions=True,      # 1 cache point
        anthropic_cache_tool_definitions=True,  # 1 cache point
    ),
)

@agent.tool_plain
def my_tool() -> str:
    return 'result'


# 3 of 4 slots used (1 automatic + 1 instructions + 1 tools)
# Room for 1 more explicit CachePoint marker
result = agent.run_sync([
    'Context', CachePoint(),  # 4th cache point - OK
    'Question'
])
print(result.output)
usage = result.usage()
print(f'Cache write tokens: {usage.cache_write_tokens}')
print(f'Cache read tokens: {usage.cache_read_tokens}')
```

#### Automatic Cache Point Limiting

When explicit cache points from all sources (settings + `CachePoint` markers) exceed the available budget, Pydantic AI automatically removes excess cache points from **older message content** (keeping the most recent ones).

Define an agent with 2 explicit cache points from settings:

```python
from pydantic_ai import Agent, CachePoint
from pydantic_ai.models.anthropic import AnthropicModelSettings

agent = Agent(
    'anthropic:claude-sonnet-4-6',
    instructions='Instructions...',
    model_settings=AnthropicModelSettings(
        anthropic_cache_instructions=True,      # 1 cache point
        anthropic_cache_tool_definitions=True,  # 1 cache point
    ),
)

@agent.tool_plain
def search() -> str:
    return 'data'

# Already using 2 cache points (instructions + tools)
# Can add 2 more CachePoint markers (4 total limit)
result = agent.run_sync([
    'Context 1', CachePoint(),  # Oldest - will be removed
    'Context 2', CachePoint(),  # Will be kept (3rd point)
    'Context 3', CachePoint(),  # Will be kept (4th point)
    'Question'
])
# Final cache points: instructions + tools + Context 2 + Context 3 = 4
print(result.output)
usage = result.usage()
print(f'Cache write tokens: {usage.cache_write_tokens}')
print(f'Cache read tokens: {usage.cache_read_tokens}')
```

**Key Points**:

-   System and tool cache points are **always preserved**
-   `anthropic_cache` counts as 1 cache point, just like `anthropic_cache_instructions` and `anthropic_cache_tool_definitions`
-   Excess `CachePoint` markers in messages are removed from oldest to newest when the limit is exceeded
-   This ensures critical caching (instructions/tools) is maintained while still benefiting from message-level caching

## Fast mode

Fast mode provides higher output tokens per second and is currently supported on **Claude Opus 4.6** (`anthropic:claude-opus-4-6`) only. It is a research preview. Set [`anthropic_speed`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_speed) to `'fast'` to enable it; Pydantic AI automatically adds the required `fast-mode-2026-02-01` beta. On unsupported models, `anthropic_speed='fast'` is ignored with a `UserWarning`. For pricing, rate limits, and the latest list of supported models, see the [Anthropic fast mode docs](https://platform.claude.com/docs/en/build-with-claude/fast-mode).

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModelSettings

agent = Agent(
    'anthropic:claude-opus-4-6',
    model_settings=AnthropicModelSettings(anthropic_speed='fast'),
)
...
```

Prompt cache interaction

Switching between `'fast'` and `'standard'` invalidates the prompt cache. Requests at different speeds do not share cached prefixes, so pick one speed per cache-sensitive conversation.

Bedrock, Vertex, and Foundry

Fast mode is only available on the direct Anthropic API. Bedrock, Vertex, and Foundry clients do not support the `speed` parameter, so `anthropic_speed='fast'` is ignored with a `UserWarning` on those clients.

## Message Compaction

Anthropic supports [automatic context compaction](https://docs.anthropic.com/en/docs/build-with-claude/compaction) to manage long conversations. When input tokens exceed a configured threshold, the API automatically generates a summary that replaces older messages while preserving context.

The easiest way to enable compaction is with the [`AnthropicCompaction`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicCompaction) capability:

anthropic\_compaction.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicCompaction

agent = Agent(
    'anthropic:claude-sonnet-4-6',
    capabilities=[AnthropicCompaction(token_threshold=100_000)],
)
```

The capability accepts:

-   **`token_threshold`** (default: 150,000, minimum: 50,000): Compaction triggers when input tokens exceed this value.
-   **`instructions`**: Custom instructions for how the summary should be generated.
-   **`pause_after_compaction`**: When `True`, the response stops after the compaction block with `stop_reason='compaction'`, allowing explicit handling before continuing.

Alternatively, you can configure compaction directly via model settings using [`anthropic_context_management`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_context_management):

anthropic\_compaction\_settings.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModelSettings

agent = Agent('anthropic:claude-sonnet-4-6')
result = agent.run_sync(
    'Hello!',
    model_settings=AnthropicModelSettings(
        anthropic_context_management={
            'edits': [{'type': 'compact_20260112', 'trigger': {'type': 'input_tokens', 'value': 100_000}}]
        }
    ),
)
```

Note

Compaction blocks returned by Anthropic contain readable text summaries. They are automatically round-tripped in subsequent requests when included in the message history.

## Code Execution Tool Version

By default, Pydantic AI chooses a compatible Anthropic code execution tool version for the selected model. You can override this with [`AnthropicModelSettings.anthropic_code_execution_tool_version`](/docs/ai/api/models/anthropic/#pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_code_execution_tool_version) when you need a specific supported Anthropic tool version:

anthropic\_code\_execution\_tool\_version.py

```python
from pydantic_ai import Agent, CodeExecutionTool
from pydantic_ai.models.anthropic import AnthropicModelSettings

agent = Agent(
    'anthropic:claude-sonnet-4-6',
    builtin_tools=[CodeExecutionTool()],
    model_settings=AnthropicModelSettings(anthropic_code_execution_tool_version='20260120'),
)
```

Pydantic AI raises a [`UserError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UserError) if you explicitly select a tool version that the model does not support.

---

# [Bedrock](https://pydantic.dev/docs/ai/models/bedrock/)

# Bedrock

## Install

To use `BedrockConverseModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `bedrock` optional group:

-   [pip](#tab-panel-66)
-   [uv](#tab-panel-67)

Terminal

```bash
pip install "pydantic-ai-slim[bedrock]"
```

Terminal

```bash
uv add "pydantic-ai-slim[bedrock]"
```

## Configuration

To use [AWS Bedrock](https://aws.amazon.com/bedrock/), you'll need an AWS account with Bedrock enabled and appropriate credentials. You can use either AWS credentials directly or a pre-configured boto3 client.

`BedrockModelName` contains a list of available Bedrock models, including models from Anthropic, Amazon, Cohere, Meta, and Mistral.

## Environment variables

You can set your AWS credentials as environment variables ([among other options](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html#using-environment-variables)):

Terminal

```bash
export AWS_BEARER_TOKEN_BEDROCK='your-api-key'
# or:
export AWS_ACCESS_KEY_ID='your-access-key'
export AWS_SECRET_ACCESS_KEY='your-secret-key'
export AWS_DEFAULT_REGION='us-east-1'  # or your preferred region
```

You can then use `BedrockConverseModel` by name:

```python
from pydantic_ai import Agent

agent = Agent('bedrock:anthropic.claude-sonnet-4-5-20250929-v1:0')
...
```

Or initialize the model directly with just the model name:

```python
from pydantic_ai import Agent
from pydantic_ai.models.bedrock import BedrockConverseModel

model = BedrockConverseModel('anthropic.claude-sonnet-4-5-20250929-v1:0')
agent = Agent(model)
...
```

## Customizing Bedrock Runtime API

You can customize the Bedrock Runtime API calls by adding additional parameters, such as [guardrail configurations](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html) and [performance settings](https://docs.aws.amazon.com/bedrock/latest/userguide/latency-optimized-inference.html). For a complete list of configurable parameters, refer to the documentation for [`BedrockModelSettings`](/docs/ai/api/models/bedrock/#pydantic_ai.models.bedrock.BedrockModelSettings).

customize\_bedrock\_model\_settings.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.bedrock import BedrockConverseModel, BedrockModelSettings

# Define Bedrock model settings with guardrail and performance configurations
bedrock_model_settings = BedrockModelSettings(
    bedrock_guardrail_config={
        'guardrailIdentifier': 'v1',
        'guardrailVersion': 'v1',
        'trace': 'enabled'
    },
    bedrock_performance_configuration={
        'latency': 'optimized'
    }
)


model = BedrockConverseModel(model_name='us.amazon.nova-pro-v1:0')

agent = Agent(model=model, model_settings=bedrock_model_settings)
```

## Service tier

Bedrock supports controlling the [service tier](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles.html) to manage throughput and cost. You can use the unified [`service_tier`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings.service_tier) field or the provider-specific [`bedrock_service_tier`](/docs/ai/api/models/bedrock/#pydantic_ai.models.bedrock.BedrockModelSettings.bedrock_service_tier) field. `bedrock_service_tier` takes precedence over the unified field when both are set.

The unified field maps as follows for Bedrock:

-   `'auto'`: the `serviceTier` field is omitted from the request, so AWS applies its server-side default (Standard tier).
-   `'default'`: explicitly sent as `{'type': 'default'}` -- opts out of any future server-side auto-promotion to premium tiers.
-   `'flex'`: sent as `{'type': 'flex'}`.
-   `'priority'`: sent as `{'type': 'priority'}`.

To request Bedrock's `'reserved'` tier (which requires a pre-purchased capacity reservation), set [`bedrock_service_tier`](/docs/ai/api/models/bedrock/#pydantic_ai.models.bedrock.BedrockModelSettings.bedrock_service_tier) directly -- it isn't reachable through the unified field.

## Prompt Caching

Bedrock supports [prompt caching](https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html) on Anthropic models so you can reuse expensive context across requests. Pydantic AI provides four ways to use prompt caching:

1.  **Cache User Messages with [`CachePoint`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.CachePoint)**: Insert a `CachePoint` marker to cache everything before it in the current user message. Pass `CachePoint(ttl='1h')` to opt into the extended cache duration.
2.  **Cache System Instructions**: Set [`BedrockModelSettings.bedrock_cache_instructions`](/docs/ai/api/models/bedrock/#pydantic_ai.models.bedrock.BedrockModelSettings.bedrock_cache_instructions) to `True` (uses 5m TTL by default) or specify `'5m'` / `'1h'` directly. When you have both static and dynamic [instructions](/docs/ai/core-concepts/agent#instructions), the cache point is placed after the last static instruction, so dynamic instructions can change without invalidating the static cache.
3.  **Cache Tool Definitions**: Set [`BedrockModelSettings.bedrock_cache_tool_definitions`](/docs/ai/api/models/bedrock/#pydantic_ai.models.bedrock.BedrockModelSettings.bedrock_cache_tool_definitions) to `True` (uses 5m TTL by default) or specify `'5m'` / `'1h'` directly.
4.  **Cache All Messages**: Set [`BedrockModelSettings.bedrock_cache_messages`](/docs/ai/api/models/bedrock/#pydantic_ai.models.bedrock.BedrockModelSettings.bedrock_cache_messages) to `True` (uses 5m TTL by default) or specify `'5m'` / `'1h'` directly to automatically cache the last user message.

Minimum Token Threshold

AWS only serves cached content once a segment crosses the provider-specific minimum token thresholds (see the [Bedrock prompt caching docs](https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html)). Short prompts or tool definitions below those limits will bypass the cache, so don't expect savings for tiny payloads.

### Example 1: Automatic Message Caching

Use `bedrock_cache_messages` to automatically cache the last user message:

```python
from pydantic_ai import Agent
from pydantic_ai.models.bedrock import BedrockModelSettings

agent = Agent(
    'bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0',
    system_prompt='You are a helpful assistant.',
    model_settings=BedrockModelSettings(
        bedrock_cache_messages=True,  # Automatically caches the last message
    ),
)

# The last message is automatically cached - no need for manual CachePoint
result1 = agent.run_sync('What is the capital of France?')

# Subsequent calls with similar conversation benefit from cache
result2 = agent.run_sync('What is the capital of Germany?')
print(f'Cache write: {result1.usage().cache_write_tokens}')
print(f'Cache read: {result2.usage().cache_read_tokens}')
```

### Example 2: Comprehensive Caching Strategy

Combine multiple cache settings for maximum savings:

```python
from pydantic_ai import Agent, RunContext
from pydantic_ai.models.bedrock import BedrockConverseModel, BedrockModelSettings

model = BedrockConverseModel('us.anthropic.claude-sonnet-4-5-20250929-v1:0')
agent = Agent(
    model,
    system_prompt='Detailed instructions...',
    model_settings=BedrockModelSettings(
        bedrock_cache_instructions=True,       # Cache system instructions
        bedrock_cache_tool_definitions='1h',   # Cache tool definitions with 1h TTL
        bedrock_cache_messages=True,           # Also cache the last message
    ),
)


@agent.tool
def search_docs(ctx: RunContext, query: str) -> str:
    """Search documentation."""
    return f'Results for {query}'


result = agent.run_sync('Search for Python best practices')
print(result.output)
```

### Example 3: Fine-Grained Control with CachePoint

Use manual `CachePoint` markers to control cache locations precisely:

```python
from pydantic_ai import Agent, CachePoint

agent = Agent(
    'bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0',
    system_prompt='Instructions...',
)

# Manually control cache points for specific content blocks
result = agent.run_sync([
    'Long context from documentation...',
    CachePoint(),  # Cache everything up to this point
    'First question'
])
print(result.output)
```

### Accessing Cache Usage Statistics

Access cache usage statistics via [`RequestUsage`](/docs/ai/api/pydantic-ai/usage/#pydantic_ai.usage.RequestUsage):

```python
from pydantic_ai import Agent, CachePoint

agent = Agent('bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0')


async def main():
    result = await agent.run(
        [
            'Reference material...',
            CachePoint(),
            'What changed since last time?',
        ]
    )
    usage = result.usage()
    print(f'Cache writes: {usage.cache_write_tokens}')
    print(f'Cache reads: {usage.cache_read_tokens}')
```

### Cache Point Limits

Bedrock enforces a maximum of 4 cache points per request. Pydantic AI automatically manages this limit to ensure your requests always comply without errors.

#### How Cache Points Are Allocated

Cache points can be placed in three locations:

1.  **System Prompt**: Via `bedrock_cache_instructions` setting (adds cache point to last system prompt block)
2.  **Tool Definitions**: Via `bedrock_cache_tool_definitions` setting (adds cache point to last tool definition)
3.  **Messages**: Via `CachePoint` markers or `bedrock_cache_messages` setting (adds cache points to message content)

Each setting uses **at most 1 cache point**, but you can combine them.

#### Automatic Cache Point Limiting

When cache points from all sources (settings + `CachePoint` markers) exceed 4, Pydantic AI automatically removes excess cache points from **older message content** (keeping the most recent ones).

```python
from pydantic_ai import Agent, CachePoint
from pydantic_ai.models.bedrock import BedrockModelSettings

agent = Agent(
    'bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0',
    system_prompt='Instructions...',
    model_settings=BedrockModelSettings(
        bedrock_cache_instructions=True,      # 1 cache point
        bedrock_cache_tool_definitions=True,  # 1 cache point
    ),
)

@agent.tool_plain
def search() -> str:
    return 'data'


# Already using 2 cache points (instructions + tools)
# Can add 2 more CachePoint markers (4 total limit)
result = agent.run_sync([
    'Context 1', CachePoint(),  # Oldest - will be removed
    'Context 2', CachePoint(),  # Will be kept (3rd point)
    'Context 3', CachePoint(),  # Will be kept (4th point)
    'Question'
])
# Final cache points: instructions + tools + Context 2 + Context 3 = 4
print(result.output)
```

**Key Points**:

-   System and tool cache points are **always preserved**
-   The cache point created by `bedrock_cache_messages` is **always preserved** (as it's the newest message cache point)
-   Additional `CachePoint` markers in messages are removed from oldest to newest when the limit is exceeded
-   This ensures critical caching (instructions/tools) is maintained while still benefiting from message-level caching

## `provider` argument

You can provide a custom `BedrockProvider` via the `provider` argument. This is useful when you want to specify credentials directly or use a custom boto3 client:

```python
from pydantic_ai import Agent
from pydantic_ai.models.bedrock import BedrockConverseModel
from pydantic_ai.providers.bedrock import BedrockProvider

# Using AWS credentials directly
model = BedrockConverseModel(
    'anthropic.claude-sonnet-4-5-20250929-v1:0',
    provider=BedrockProvider(
        region_name='us-east-1',
        aws_access_key_id='your-access-key',
        aws_secret_access_key='your-secret-key',
    ),
)
agent = Agent(model)
...
```

You can also pass a pre-configured boto3 client:

```python
import boto3

from pydantic_ai import Agent
from pydantic_ai.models.bedrock import BedrockConverseModel
from pydantic_ai.providers.bedrock import BedrockProvider

# Using a pre-configured boto3 client
bedrock_client = boto3.client('bedrock-runtime', region_name='us-east-1')
model = BedrockConverseModel(
    'anthropic.claude-sonnet-4-5-20250929-v1:0',
    provider=BedrockProvider(bedrock_client=bedrock_client),
)
agent = Agent(model)
...
```

## Using AWS Application Inference Profiles

AWS Bedrock supports [custom application inference profiles](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-create.html) for cost tracking and resource management. Set [`bedrock_inference_profile`](/docs/ai/api/models/bedrock/#pydantic_ai.models.bedrock.BedrockModelSettings.bedrock_inference_profile) to route requests through an inference profile while keeping the base model name for detecting model capabilities:

```python
from pydantic_ai import Agent
from pydantic_ai.models.bedrock import BedrockConverseModel
from pydantic_ai.providers.bedrock import BedrockProvider

provider = BedrockProvider(region_name='us-east-2')

model = BedrockConverseModel(
    'us.anthropic.claude-opus-4-5-20251101-v1:0',
    provider=provider,
    settings={
        'bedrock_inference_profile': 'arn:aws:bedrock:us-east-2:123456789012:application-inference-profile/my-profile',
    },
)

agent = Agent(model)
```

## Configuring Retries

Bedrock uses boto3's built-in retry mechanisms. You can configure retry behavior by passing a custom boto3 client with retry settings:

```python
import boto3
from botocore.config import Config

from pydantic_ai import Agent
from pydantic_ai.models.bedrock import BedrockConverseModel
from pydantic_ai.providers.bedrock import BedrockProvider

# Configure retry settings
config = Config(
    retries={
        'max_attempts': 5,
        'mode': 'adaptive'  # Recommended for rate limiting
    }
)

bedrock_client = boto3.client(
    'bedrock-runtime',
    region_name='us-east-1',
    config=config
)

model = BedrockConverseModel(
    'us.amazon.nova-micro-v1:0',
    provider=BedrockProvider(bedrock_client=bedrock_client),
)
agent = Agent(model)
```

### Retry Modes

-   `'legacy'` (default): 5 attempts, basic retry behavior
-   `'standard'`: 3 attempts, more comprehensive error coverage
-   `'adaptive'`: 3 attempts with client-side rate limiting (recommended for handling `ThrottlingException`)

For more details on boto3 retry configuration, see the [AWS boto3 documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/retries.html).

Note

Unlike other providers that use httpx for HTTP requests, Bedrock uses boto3's native retry mechanisms. The retry strategies described in [HTTP Request Retries](/docs/ai/advanced-features/retries) do not apply to Bedrock.

---

# [Cerebras](https://pydantic.dev/docs/ai/models/cerebras/)

# Cerebras

## Install

To use `CerebrasModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `cerebras` optional group:

-   [pip](#tab-panel-68)
-   [uv](#tab-panel-69)

Terminal

```bash
pip install "pydantic-ai-slim[cerebras]"
```

Terminal

```bash
uv add "pydantic-ai-slim[cerebras]"
```

## Configuration

To use [Cerebras](https://cerebras.ai/) through their API, go to [cloud.cerebras.ai](https://cloud.cerebras.ai/?utm_source=3pi_pydantic-ai&utm_campaign=partner_doc) and generate an API key.

For a list of available models, see the [Cerebras models documentation](https://inference-docs.cerebras.ai/models).

## Environment variable

Once you have the API key, you can set it as an environment variable:

Terminal

```bash
export CEREBRAS_API_KEY='your-api-key'
```

You can then use `CerebrasModel` by name:

```python
from pydantic_ai import Agent

agent = Agent('cerebras:llama-3.3-70b')
...
```

Or initialise the model directly with just the model name:

```python
from pydantic_ai import Agent
from pydantic_ai.models.cerebras import CerebrasModel

model = CerebrasModel('llama-3.3-70b')
agent = Agent(model)
...
```

## `provider` argument

You can provide a custom `Provider` via the `provider` argument:

```python
from pydantic_ai import Agent
from pydantic_ai.models.cerebras import CerebrasModel
from pydantic_ai.providers.cerebras import CerebrasProvider

model = CerebrasModel(
    'llama-3.3-70b', provider=CerebrasProvider(api_key='your-api-key')
)
agent = Agent(model)
...
```

You can also customize the `CerebrasProvider` with a custom `httpx.AsyncClient`:

```python
from httpx import AsyncClient

from pydantic_ai import Agent
from pydantic_ai.models.cerebras import CerebrasModel
from pydantic_ai.providers.cerebras import CerebrasProvider

custom_http_client = AsyncClient(timeout=30)
model = CerebrasModel(
    'llama-3.3-70b',
    provider=CerebrasProvider(api_key='your-api-key', http_client=custom_http_client),
)
agent = Agent(model)
...
```

---

# [Cohere](https://pydantic.dev/docs/ai/models/cohere/)

# Cohere

## Install

To use `CohereModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `cohere` optional group:

-   [pip](#tab-panel-70)
-   [uv](#tab-panel-71)

Terminal

```bash
pip install "pydantic-ai-slim[cohere]"
```

Terminal

```bash
uv add "pydantic-ai-slim[cohere]"
```

## Configuration

To use [Cohere](https://cohere.com/) through their API, go to [dashboard.cohere.com/api-keys](https://dashboard.cohere.com/api-keys) and follow your nose until you find the place to generate an API key.

`CohereModelName` contains a list of the most popular Cohere models.

## Environment variable

Once you have the API key, you can set it as an environment variable:

Terminal

```bash
export CO_API_KEY='your-api-key'
```

You can then use `CohereModel` by name:

```python
from pydantic_ai import Agent

agent = Agent('cohere:command-r7b-12-2024')
...
```

Or initialise the model directly with just the model name:

```python
from pydantic_ai import Agent
from pydantic_ai.models.cohere import CohereModel

model = CohereModel('command-r7b-12-2024')
agent = Agent(model)
...
```

## `provider` argument

You can provide a custom `Provider` via the `provider` argument:

```python
from pydantic_ai import Agent
from pydantic_ai.models.cohere import CohereModel
from pydantic_ai.providers.cohere import CohereProvider

model = CohereModel('command-r7b-12-2024', provider=CohereProvider(api_key='your-api-key'))
agent = Agent(model)
...
```

You can also customize the `CohereProvider` with a custom `http_client`:

```python
from httpx import AsyncClient

from pydantic_ai import Agent
from pydantic_ai.models.cohere import CohereModel
from pydantic_ai.providers.cohere import CohereProvider

custom_http_client = AsyncClient(timeout=30)
model = CohereModel(
    'command-r7b-12-2024',
    provider=CohereProvider(api_key='your-api-key', http_client=custom_http_client),
)
agent = Agent(model)
...
```

---

# [Google](https://pydantic.dev/docs/ai/models/google/)

# Google

The `GoogleModel` is a model that uses the [`google-genai`](https://pypi.org/project/google-genai/) package under the hood to access Google's Gemini models via both the Generative Language API and Vertex AI.

## Streaming cancellation

Cancellation limitations

The `google-genai` SDK exposes streaming responses only as an async iterator, with no separate handle for closing the underlying HTTP transport. Because of a [Python language rule on async generators](https://peps.python.org/pep-0525/), [`cancel()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.cancel) cannot interrupt an in-flight chunk read while another coroutine is iterating the stream. Pydantic AI marks the response with `state='interrupted'`, but upstream generation may continue until the surrounding `async with agent.run_stream(...)` block exits.

For reliable cancellation, either pass `debounce_by=None` to [`stream_text()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_text), [`stream_output()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_output), or [`stream_responses()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_responses) and call `cancel()` from the same task that's iterating:

cancel\_google.py

```python
from pydantic_ai import Agent

agent = Agent('google-gla:gemini-3-pro-preview')


def should_stop(chunk: str) -> bool:
    return len(chunk) > 100


async def main():
    async with agent.run_stream('Write a long essay about Python') as result:
        async for chunk in result.stream_text(debounce_by=None):
            if should_stop(chunk):
                await result.cancel()
                break
```

Or, if you need to keep debouncing, wrap the stream with [`contextlib.aclosing`](https://docs.python.org/3/library/contextlib.html#contextlib.aclosing) so the iterator is closed before `cancel()` runs:

cancel\_google\_aclosing.py

```python
from contextlib import aclosing

from pydantic_ai import Agent

agent = Agent('google-gla:gemini-3-pro-preview')


def should_stop(chunk: str) -> bool:
    return len(chunk) > 100


async def main():
    async with agent.run_stream('Write a long essay about Python') as result:
        async with aclosing(result.stream_text()) as stream:
            async for chunk in stream:
                if should_stop(chunk):
                    break
        await result.cancel()
```

Calling `cancel()` from a different task while iteration is in progress is not currently reliable on this provider.

## Install

To use `GoogleModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `google` optional group:

-   [pip](#tab-panel-62)
-   [uv](#tab-panel-63)

Terminal

```bash
pip install "pydantic-ai-slim[google]"
```

Terminal

```bash
uv add "pydantic-ai-slim[google]"
```

## Configuration

`GoogleModel` lets you use Google's Gemini models through their [Generative Language API](https://ai.google.dev/api/all-methods) (`generativelanguage.googleapis.com`) or [Vertex AI API](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models) (`*-aiplatform.googleapis.com`).

### API Key (Generative Language API)

To use Gemini via the Generative Language API, go to [aistudio.google.com](https://aistudio.google.com/apikey) and create an API key.

Once you have the API key, set it as an environment variable:

Terminal

```bash
export GOOGLE_API_KEY=your-api-key
```

You can then use `GoogleModel` by name (where GLA stands for Generative Language API):

```python
from pydantic_ai import Agent

agent = Agent('google-gla:gemini-3-pro-preview')
...
```

Or you can explicitly create the provider:

```python
from pydantic_ai import Agent
from pydantic_ai.models.google import GoogleModel
from pydantic_ai.providers.google import GoogleProvider

provider = GoogleProvider(api_key='your-api-key')
model = GoogleModel('gemini-3-pro-preview', provider=provider)
agent = Agent(model)
...
```

### Vertex AI (Enterprise/Cloud)

If you are an enterprise user, you can also use `GoogleModel` to access Gemini via Vertex AI.

This interface has a number of advantages over the Generative Language API:

1.  The VertexAI API comes with more enterprise readiness guarantees.
2.  You can [purchase provisioned throughput](https://cloud.google.com/vertex-ai/generative-ai/docs/provisioned-throughput#purchase-provisioned-throughput) with Vertex AI to guarantee capacity.
3.  If you're running Pydantic AI inside GCP, you don't need to set up authentication, it should "just work".
4.  You can decide which region to use, which might be important from a regulatory perspective, and might improve latency.

You can authenticate using [application default credentials](https://cloud.google.com/docs/authentication/application-default-credentials), a service account, or an [API key](https://cloud.google.com/vertex-ai/generative-ai/docs/start/api-keys?usertype=expressmode).

Whichever way you authenticate, you'll need to have Vertex AI enabled in your GCP account.

#### Application Default Credentials

If you have the [`gcloud` CLI](https://cloud.google.com/sdk/gcloud) installed and configured, you can use `GoogleProvider` in Vertex AI mode by name:

```python
from pydantic_ai import Agent

agent = Agent('google-vertex:gemini-3-pro-preview')
...
```

Or you can explicitly create the provider and model:

```python
from pydantic_ai import Agent
from pydantic_ai.models.google import GoogleModel
from pydantic_ai.providers.google import GoogleProvider

provider = GoogleProvider(vertexai=True)
model = GoogleModel('gemini-3-pro-preview', provider=provider)
agent = Agent(model)
...
```

#### Service Account

To use a service account JSON file, explicitly create the provider and model:

google\_model\_service\_account.py

```python
from google.oauth2 import service_account

from pydantic_ai import Agent
from pydantic_ai.models.google import GoogleModel
from pydantic_ai.providers.google import GoogleProvider

credentials = service_account.Credentials.from_service_account_file(
    'path/to/service-account.json',
    scopes=['https://www.googleapis.com/auth/cloud-platform'],
)
provider = GoogleProvider(credentials=credentials, project='your-project-id')
model = GoogleModel('gemini-3-flash-preview', provider=provider)
agent = Agent(model)
...
```

#### API Key

To use Vertex AI with an API key, [create a key](https://cloud.google.com/vertex-ai/generative-ai/docs/start/api-keys?usertype=expressmode) and set it as an environment variable:

Terminal

```bash
export GOOGLE_API_KEY=your-api-key
```

You can then use `GoogleModel` in Vertex AI mode by name:

```python
from pydantic_ai import Agent

agent = Agent('google-vertex:gemini-3-pro-preview')
...
```

Or you can explicitly create the provider and model:

```python
from pydantic_ai import Agent
from pydantic_ai.models.google import GoogleModel
from pydantic_ai.providers.google import GoogleProvider

provider = GoogleProvider(vertexai=True, api_key='your-api-key')
model = GoogleModel('gemini-3-pro-preview', provider=provider)
agent = Agent(model)
...
```

#### Customizing Location or Project

You can specify the location and/or project when using Vertex AI:

google\_model\_location.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.google import GoogleModel
from pydantic_ai.providers.google import GoogleProvider

provider = GoogleProvider(vertexai=True, location='asia-east1', project='your-gcp-project-id')
model = GoogleModel('gemini-3-pro-preview', provider=provider)
agent = Agent(model)
...
```

#### Service tier (`service_tier`, `google_vertex_service_tier`)

The unified [`service_tier`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings.service_tier) field works on both Google subsystems, with [`google_vertex_service_tier`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModelSettings.google_vertex_service_tier) available for finer Vertex routing control. The provider-specific field wins when both are set.

**Gemini API (GLA)** -- sent as the request's `service_tier` field:

`service_tier`

Sent to Gemini API

`'auto'`

_(omitted -- server default)_

`'default'`

`'standard'`

`'flex'`

`'flex'`

`'priority'`

`'priority'`

**Vertex AI** -- sent as HTTP routing headers; `'flex'` and `'priority'` always pick the **PT-with-spillover** variant, so customers with [Provisioned Throughput](https://cloud.google.com/vertex-ai/generative-ai/docs/provisioned-throughput/use-provisioned-throughput) (PT) keep using their reserved capacity first:

`service_tier`

Vertex routing headers

Effective behavior

`'auto'` / `'default'`

_(none)_

PT first, then standard on-demand spillover

`'flex'`

`X-Vertex-AI-LLM-Shared-Request-Type: flex`

PT first, then [Flex PayGo](https://cloud.google.com/vertex-ai/generative-ai/docs/flex-paygo) spillover

`'priority'`

`X-Vertex-AI-LLM-Shared-Request-Type: priority`

PT first, then [Priority PayGo](https://cloud.google.com/vertex-ai/generative-ai/docs/priority-paygo) spillover

To bypass PT entirely (or use it exclusively, or any of the other Vertex-specific routing combinations) set [`google_vertex_service_tier`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModelSettings.google_vertex_service_tier) directly -- the unified field is intentionally limited to the safe PT-with-spillover variants.

**Vertex AI -- full set of routing values**

The full [`google_vertex_service_tier`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModelSettings.google_vertex_service_tier) values map to these HTTP headers:

-   `'pt_only'`: PT only (`X-Vertex-AI-LLM-Request-Type: dedicated`).
-   `'pt_then_flex'`: PT when quota allows, then [Flex PayGo](https://cloud.google.com/vertex-ai/generative-ai/docs/flex-paygo) spillover (`X-Vertex-AI-LLM-Shared-Request-Type: flex`).
-   `'pt_then_priority'`: PT when quota allows, then [Priority PayGo](https://cloud.google.com/vertex-ai/generative-ai/docs/priority-paygo) spillover (`X-Vertex-AI-LLM-Shared-Request-Type: priority`).
-   `'on_demand'`: Standard on-demand only (`X-Vertex-AI-LLM-Request-Type: shared`).
-   `'flex_only'`: [Flex PayGo](https://cloud.google.com/vertex-ai/generative-ai/docs/flex-paygo) only (`X-Vertex-AI-LLM-Request-Type: shared` and `X-Vertex-AI-LLM-Shared-Request-Type: flex`).
-   `'priority_only'`: [Priority PayGo](https://cloud.google.com/vertex-ai/generative-ai/docs/priority-paygo) only (`X-Vertex-AI-LLM-Request-Type: shared` and `X-Vertex-AI-LLM-Shared-Request-Type: priority`).

**Example**

```python
from pydantic_ai import Agent
from pydantic_ai.models.google import GoogleModel, GoogleModelSettings
from pydantic_ai.providers.google import GoogleProvider

provider = GoogleProvider(location='global')
model = GoogleModel('gemini-3-flash-preview', provider=provider)
agent = Agent(model)

result = agent.run_sync(
    'Hello!',
    model_settings=GoogleModelSettings(google_vertex_service_tier='pt_then_flex'),
)
```

Swap `'pt_then_flex'` for any [`GoogleVertexServiceTier`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleVertexServiceTier) value -- e.g. `'pt_then_priority'` for [Priority PayGo](https://cloud.google.com/vertex-ai/generative-ai/docs/priority-paygo) spillover, or `'flex_only'` / `'priority_only'` to bypass PT entirely.

The [`google_service_tier`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModelSettings.google_service_tier) field is deprecated in favor of these more specific fields.

After the request, inspect [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) `provider_details.get('traffic_type')` (e.g. `ON_DEMAND_FLEX`, `ON_DEMAND_PRIORITY`) to see which tier served it, when the API returns it.

#### Model Garden

You can access models from the [Model Garden](https://cloud.google.com/model-garden?hl=en) that support the `generateContent` API and are available under your GCP project, including but not limited to Gemini, using one of the following `model_name` patterns:

-   `{model_id}` for Gemini models
-   `{publisher}/{model_id}`
-   `publishers/{publisher}/models/{model_id}`
-   `projects/{project}/locations/{location}/publishers/{publisher}/models/{model_id}`

```python
from pydantic_ai import Agent
from pydantic_ai.models.google import GoogleModel
from pydantic_ai.providers.google import GoogleProvider

provider = GoogleProvider(
    project='your-gcp-project-id',
    location='us-central1',  # the region where the model is available
)
model = GoogleModel('meta/llama-3.3-70b-instruct-maas', provider=provider)
agent = Agent(model)
...
```

## Custom HTTP Client

You can customize the `GoogleProvider` with a custom `httpx.AsyncClient`:

```python
from httpx import AsyncClient

from pydantic_ai import Agent
from pydantic_ai.models.google import GoogleModel
from pydantic_ai.providers.google import GoogleProvider

custom_http_client = AsyncClient(timeout=30)
model = GoogleModel(
    'gemini-3-pro-preview',
    provider=GoogleProvider(api_key='your-api-key', http_client=custom_http_client),
)
agent = Agent(model)
...
```

## Document, Image, Audio, and Video Input

`GoogleModel` supports multi-modal input, including documents, images, audio, and video.

YouTube video URLs can be passed directly to Google models:

youtube\_input.py

```py
from pydantic_ai import Agent, VideoUrl
from pydantic_ai.models.google import GoogleModel

agent = Agent(GoogleModel('gemini-3-flash-preview'))
result = agent.run_sync(
    [
        'What is this video about?',
        VideoUrl(url='https://www.youtube.com/watch?v=dQw4w9WgXcQ'),
    ]
)
print(result.output)
```

Files can be uploaded via the [Files API](https://ai.google.dev/gemini-api/docs/files) and passed as URLs:

file\_upload.py

```py
from pydantic_ai import Agent, DocumentUrl
from pydantic_ai.models.google import GoogleModel
from pydantic_ai.providers.google import GoogleProvider

provider = GoogleProvider()
file = provider.client.files.upload(file='pydantic-ai-logo.png')
assert file.uri is not None

agent = Agent(GoogleModel('gemini-3-flash-preview', provider=provider))
result = agent.run_sync(
    [
        'What company is this logo from?',
        DocumentUrl(url=file.uri, media_type=file.mime_type),
    ]
)
print(result.output)
```

See the [input documentation](/docs/ai/advanced-features/input) for more details and examples.

## Model settings

You can customize model behavior using [`GoogleModelSettings`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModelSettings):

```python
from google.genai.types import HarmBlockThreshold, HarmCategory

from pydantic_ai import Agent
from pydantic_ai.models.google import GoogleModel, GoogleModelSettings

settings = GoogleModelSettings(
    temperature=0.2,
    max_tokens=1024,
    google_thinking_config={'thinking_level': 'low'},
    google_safety_settings=[
        {
            'category': HarmCategory.HARM_CATEGORY_HATE_SPEECH,
            'threshold': HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
        }
    ]
)
model = GoogleModel('gemini-3-pro-preview')
agent = Agent(model, model_settings=settings)
...
```

### Configure thinking

Gemini 3 models use `thinking_level` to control thinking behavior:

```python
from pydantic_ai import Agent
from pydantic_ai.models.google import GoogleModel, GoogleModelSettings

# Set thinking level for Gemini 3 models
model_settings = GoogleModelSettings(google_thinking_config={'thinking_level': 'low'})  # 'low' or 'high'
model = GoogleModel('gemini-3-flash-preview')
agent = Agent(model, model_settings=model_settings)
...
```

For older models (pre-Gemini 3), you can use `thinking_budget` instead:

```python
from pydantic_ai import Agent
from pydantic_ai.models.google import GoogleModel, GoogleModelSettings

# Disable thinking on older models by setting budget to 0
model_settings = GoogleModelSettings(google_thinking_config={'thinking_budget': 0})
model = GoogleModel('gemini-2.5-flash')  # Older model
agent = Agent(model, model_settings=model_settings)
...
```

Check out the [Gemini API docs](https://ai.google.dev/gemini-api/docs/thinking) for more on thinking.

### Safety settings

You can customize the safety settings by setting the `google_safety_settings` field.

```python
from google.genai.types import HarmBlockThreshold, HarmCategory

from pydantic_ai import Agent
from pydantic_ai.models.google import GoogleModel, GoogleModelSettings

model_settings = GoogleModelSettings(
    google_safety_settings=[
        {
            'category': HarmCategory.HARM_CATEGORY_HATE_SPEECH,
            'threshold': HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
        }
    ]
)
model = GoogleModel('gemini-3-flash-preview')
agent = Agent(model, model_settings=model_settings)
...
```

See the [Gemini API docs](https://ai.google.dev/gemini-api/docs/safety-settings) for more on safety settings.

### Logprobs

You can return logprobs from the model in your response by setting `google_logprobs` and `google_top_logprobs` in the [`GoogleModelSettings`](/docs/ai/api/models/google/#pydantic_ai.models.google.GoogleModelSettings).

This feature is only supported for non-streaming requests and Vertex AI.

```python
from pydantic_ai import Agent
from pydantic_ai.models.google import GoogleModel, GoogleModelSettings
from pydantic_ai.providers.google import GoogleProvider

model_settings = GoogleModelSettings(
    google_logprobs=True, google_top_logprobs=2,
)

model = GoogleModel(
    model_name='gemini-2.5-flash',
    provider=GoogleProvider(location='europe-west1', vertexai=True),
)
agent = Agent(model, model_settings=model_settings)

result = agent.run_sync('Your prompt here')
# Access logprobs from provider_details
logprobs = result.response.provider_details.get('logprobs')
avg_logprobs = result.response.provider_details.get('avg_logprobs')
```

See the [Google Dev Blog](https://developers.googleblog.com/unlock-gemini-reasoning-with-logprobs-on-vertex-ai/) for more information.

---

# [Groq](https://pydantic.dev/docs/ai/models/groq/)

# Groq

## Install

To use `GroqModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `groq` optional group:

-   [pip](#tab-panel-72)
-   [uv](#tab-panel-73)

Terminal

```bash
pip install "pydantic-ai-slim[groq]"
```

Terminal

```bash
uv add "pydantic-ai-slim[groq]"
```

## Configuration

To use [Groq](https://groq.com/) through their API, go to [console.groq.com/keys](https://console.groq.com/keys) and follow your nose until you find the place to generate an API key.

`GroqModelName` contains a list of available Groq models.

## Environment variable

Once you have the API key, you can set it as an environment variable:

Terminal

```bash
export GROQ_API_KEY='your-api-key'
```

You can then use `GroqModel` by name:

```python
from pydantic_ai import Agent

agent = Agent('groq:llama-3.3-70b-versatile')
...
```

Or initialise the model directly with just the model name:

```python
from pydantic_ai import Agent
from pydantic_ai.models.groq import GroqModel

model = GroqModel('llama-3.3-70b-versatile')
agent = Agent(model)
...
```

## `provider` argument

You can provide a custom `Provider` via the `provider` argument:

```python
from pydantic_ai import Agent
from pydantic_ai.models.groq import GroqModel
from pydantic_ai.providers.groq import GroqProvider

model = GroqModel(
    'llama-3.3-70b-versatile', provider=GroqProvider(api_key='your-api-key')
)
agent = Agent(model)
...
```

You can also customize the `GroqProvider` with a custom `httpx.AsyncClient`:

```python
from httpx import AsyncClient

from pydantic_ai import Agent
from pydantic_ai.models.groq import GroqModel
from pydantic_ai.providers.groq import GroqProvider

custom_http_client = AsyncClient(timeout=30)
model = GroqModel(
    'llama-3.3-70b-versatile',
    provider=GroqProvider(api_key='your-api-key', http_client=custom_http_client),
)
agent = Agent(model)
...
```

---

# [Hugging Face](https://pydantic.dev/docs/ai/models/huggingface/)

# Hugging Face

[Hugging Face](https://huggingface.co/) is an AI platform with all major open source models, datasets, MCPs, and demos. You can use [Inference Providers](https://huggingface.co/docs/inference-providers) to run open source models like DeepSeek R1 on scalable serverless infrastructure.

Local embeddings via Sentence Transformers

This page covers chat completions via Hugging Face Inference Providers. To run Hugging Face **embedding** models locally (no API key, no network calls), see the [Sentence Transformers embedding model](/docs/ai/guides/embeddings#sentence-transformers-local), which works with any model in the [sentence-transformers library](https://www.sbert.net/docs/sentence_transformer/pretrained_models.html).

## Install

To use `HuggingFaceModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `huggingface` optional group:

-   [pip](#tab-panel-76)
-   [uv](#tab-panel-77)

Terminal

```bash
pip install "pydantic-ai-slim[huggingface]"
```

Terminal

```bash
uv add "pydantic-ai-slim[huggingface]"
```

## Streaming cancellation

Cancellation limitations

The `huggingface_hub.AsyncInferenceClient` exposes streaming responses only as an async iterator, with no separate handle for closing the underlying HTTP transport. Because of a [Python language rule on async generators](https://peps.python.org/pep-0525/), [`cancel()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.cancel) cannot interrupt an in-flight chunk read while another coroutine is iterating the stream. Pydantic AI marks the response with `state='interrupted'`, but upstream generation may continue until the surrounding `async with agent.run_stream(...)` block exits.

For reliable cancellation, either pass `debounce_by=None` to [`stream_text()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_text), [`stream_output()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_output), or [`stream_responses()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_responses) and call `cancel()` from the same task that's iterating:

cancel\_huggingface.py

```python
from pydantic_ai import Agent

agent = Agent('huggingface:Qwen/Qwen3-235B-A22B')


def should_stop(chunk: str) -> bool:
    return len(chunk) > 100


async def main():
    async with agent.run_stream('Write a long essay about Python') as result:
        async for chunk in result.stream_text(debounce_by=None):
            if should_stop(chunk):
                await result.cancel()
                break
```

Or, if you need to keep debouncing, wrap the stream with [`contextlib.aclosing`](https://docs.python.org/3/library/contextlib.html#contextlib.aclosing) so the iterator is closed before `cancel()` runs:

cancel\_huggingface\_aclosing.py

```python
from contextlib import aclosing

from pydantic_ai import Agent

agent = Agent('huggingface:Qwen/Qwen3-235B-A22B')


def should_stop(chunk: str) -> bool:
    return len(chunk) > 100


async def main():
    async with agent.run_stream('Write a long essay about Python') as result:
        async with aclosing(result.stream_text()) as stream:
            async for chunk in stream:
                if should_stop(chunk):
                    break
        await result.cancel()
```

Calling `cancel()` from a different task while iteration is in progress is not currently reliable on this provider.

## Configuration

To use [Hugging Face](https://huggingface.co/) inference, you'll need to set up an account which will give you [free tier](https://huggingface.co/docs/inference-providers/pricing) allowance on [Inference Providers](https://huggingface.co/docs/inference-providers). To setup inference, follow these steps:

1.  Go to [Hugging Face](https://huggingface.co/join) and sign up for an account.
2.  Create a new access token in [Hugging Face](https://huggingface.co/settings/tokens).
3.  Set the `HF_TOKEN` environment variable to the token you just created.

Once you have a Hugging Face access token, you can set it as an environment variable:

Terminal

```bash
export HF_TOKEN='hf_token'
```

## Usage

You can then use [`HuggingFaceModel`](/docs/ai/api/models/huggingface/#pydantic_ai.models.huggingface.HuggingFaceModel) by name:

```python
from pydantic_ai import Agent

agent = Agent('huggingface:Qwen/Qwen3-235B-A22B')
...
```

Or initialise the model directly with just the model name:

```python
from pydantic_ai import Agent
from pydantic_ai.models.huggingface import HuggingFaceModel

model = HuggingFaceModel('Qwen/Qwen3-235B-A22B')
agent = Agent(model)
...
```

By default, the [`HuggingFaceModel`](/docs/ai/api/models/huggingface/#pydantic_ai.models.huggingface.HuggingFaceModel) uses the [`HuggingFaceProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.huggingface.HuggingFaceProvider) that will select automatically the first of the inference providers (Cerebras, Together AI, Cohere..etc) available for the model, sorted by your preferred order in [https://hf.co/settings/inference-providers](https://hf.co/settings/inference-providers).

## Configure the provider

If you want to pass parameters in code to the provider, you can programmatically instantiate the [`HuggingFaceProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.huggingface.HuggingFaceProvider) and pass it to the model:

```python
from pydantic_ai import Agent
from pydantic_ai.models.huggingface import HuggingFaceModel
from pydantic_ai.providers.huggingface import HuggingFaceProvider

model = HuggingFaceModel('Qwen/Qwen3-235B-A22B', provider=HuggingFaceProvider(api_key='hf_token', provider_name='nebius'))
agent = Agent(model)
...
```

## Custom Hugging Face client

[`HuggingFaceProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.huggingface.HuggingFaceProvider) also accepts a custom [`AsyncInferenceClient`](https://huggingface.co/docs/huggingface_hub/v0.29.3/en/package_reference/inference_client#huggingface_hub.AsyncInferenceClient) client via the `hf_client` parameter, so you can customise the `headers`, `bill_to` (billing to an HF organization you're a member of), `base_url` etc. as defined in the [Hugging Face Hub python library docs](https://huggingface.co/docs/huggingface_hub/package_reference/inference_client).

```python
from huggingface_hub import AsyncInferenceClient

from pydantic_ai import Agent
from pydantic_ai.models.huggingface import HuggingFaceModel
from pydantic_ai.providers.huggingface import HuggingFaceProvider

client = AsyncInferenceClient(
    bill_to='openai',
    api_key='hf_token',
    provider='fireworks-ai',
)

model = HuggingFaceModel(
    'Qwen/Qwen3-235B-A22B',
    provider=HuggingFaceProvider(hf_client=client),
)
agent = Agent(model)
...
```

---

# [Mistral](https://pydantic.dev/docs/ai/models/mistral/)

# Mistral

## Install

To use `MistralModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `mistral` optional group:

-   [pip](#tab-panel-74)
-   [uv](#tab-panel-75)

Terminal

```bash
pip install "pydantic-ai-slim[mistral]"
```

Terminal

```bash
uv add "pydantic-ai-slim[mistral]"
```

## Configuration

To use [Mistral](https://mistral.ai) through their API, go to [console.mistral.ai/api-keys/](https://console.mistral.ai/api-keys/) and follow your nose until you find the place to generate an API key.

`LatestMistralModelNames` contains a list of the most popular Mistral models.

## Environment variable

Once you have the API key, you can set it as an environment variable:

Terminal

```bash
export MISTRAL_API_KEY='your-api-key'
```

You can then use `MistralModel` by name:

```python
from pydantic_ai import Agent

agent = Agent('mistral:mistral-large-latest')
...
```

Or initialise the model directly with just the model name:

```python
from pydantic_ai import Agent
from pydantic_ai.models.mistral import MistralModel

model = MistralModel('mistral-small-latest')
agent = Agent(model)
...
```

## `provider` argument

You can provide a custom `Provider` via the `provider` argument:

```python
from pydantic_ai import Agent
from pydantic_ai.models.mistral import MistralModel
from pydantic_ai.providers.mistral import MistralProvider

model = MistralModel(
    'mistral-large-latest', provider=MistralProvider(api_key='your-api-key', base_url='https://<mistral-provider-endpoint>')
)
agent = Agent(model)
...
```

You can also customize the provider with a custom `httpx.AsyncClient`:

```python
from httpx import AsyncClient

from pydantic_ai import Agent
from pydantic_ai.models.mistral import MistralModel
from pydantic_ai.providers.mistral import MistralProvider

custom_http_client = AsyncClient(timeout=30)
model = MistralModel(
    'mistral-large-latest',
    provider=MistralProvider(api_key='your-api-key', http_client=custom_http_client),
)
agent = Agent(model)
...
```

---

# [Ollama](https://pydantic.dev/docs/ai/models/ollama/)

# Ollama

## Install

To use [`OllamaModel`](/docs/ai/api/models/ollama/#pydantic_ai.models.ollama.OllamaModel), you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `openai` optional group:

-   [pip](#tab-panel-78)
-   [uv](#tab-panel-79)

Terminal

```bash
pip install "pydantic-ai-slim[openai]"
```

Terminal

```bash
uv add "pydantic-ai-slim[openai]"
```

## Configuration

Pydantic AI supports both self-hosted [Ollama](https://ollama.com/) servers (running locally or remotely) and [Ollama Cloud](https://ollama.com/cloud).

For servers running locally, use the `http://localhost:11434/v1` base URL. For Ollama Cloud, use `https://ollama.com/v1` and ensure an API key is set.

For backward compatibility, [`OllamaModel`](/docs/ai/api/models/ollama/#pydantic_ai.models.ollama.OllamaModel) uses Ollama's OpenAI-compatible Chat Completions API (`/v1/chat/completions`).

## Environment variable

Set the `OLLAMA_BASE_URL` and (optionally) `OLLAMA_API_KEY` environment variables:

Terminal

```bash
export OLLAMA_BASE_URL='http://localhost:11434/v1'
export OLLAMA_API_KEY='your-api-key'  # required for Ollama Cloud
```

You can then use `OllamaModel` by name:

```python
from pydantic_ai import Agent

agent = Agent('ollama:qwen3')
...
```

Or initialise the model directly with just the model name:

```python
from pydantic_ai import Agent
from pydantic_ai.models.ollama import OllamaModel

model = OllamaModel('qwen3')
agent = Agent(model)
...
```

## `provider` argument

You can provide a custom `Provider` via the `provider` argument:

```python
from pydantic_ai import Agent
from pydantic_ai.models.ollama import OllamaModel
from pydantic_ai.providers.ollama import OllamaProvider

model = OllamaModel(
    'qwen3', provider=OllamaProvider(base_url='http://localhost:11434/v1')
)
agent = Agent(model)
...
```

For Ollama Cloud, use `base_url='https://ollama.com/v1'` and set the `OLLAMA_API_KEY` environment variable (or pass `api_key=` directly).

## Structured output

Self-hosted Ollama (v0.5.0+, released December 2024) enforces `response_format` with `json_schema` via `llama.cpp`'s grammar-constrained decoder, so [`NativeOutput`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.NativeOutput) produces schema-valid output at generation time:

```python
from pydantic import BaseModel

from pydantic_ai import Agent
from pydantic_ai.models.ollama import OllamaModel
from pydantic_ai.output import NativeOutput
from pydantic_ai.providers.ollama import OllamaProvider


class CityLocation(BaseModel):
    city: str
    country: str


model = OllamaModel(
    'qwen3',
    provider=OllamaProvider(base_url='http://localhost:11434/v1'),
)
agent = Agent(model, output_type=NativeOutput(CityLocation))
...
```

Ollama Cloud does not enforce `json_schema` yet

Ollama Cloud's inference backend accepts `response_format` with `json_schema` without error but does not apply grammar-constrained decoding, so schemas are silently not enforced. See [ollama/ollama#12362](https://github.com/ollama/ollama/issues/12362) for the upstream tracking issue.

When [`OllamaModel`](/docs/ai/api/models/ollama/#pydantic_ai.models.ollama.OllamaModel) detects a Cloud path -- either a `base_url` on `ollama.com` or a model name ending in `-cloud` -- it automatically disables `supports_json_schema_output` on the profile.

If you use [`NativeOutput`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.NativeOutput) with an Ollama Cloud model, you'll get a clear [`UserError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UserError) instead of a silent retry loop. Use the default [`ToolOutput`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.ToolOutput) or [`PromptedOutput`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.PromptedOutput) instead -- both work on Cloud.

---

# [OpenAI](https://pydantic.dev/docs/ai/models/openai/)

# OpenAI

## Install

To use OpenAI models or OpenAI-compatible APIs, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `openai` optional group:

-   [pip](#tab-panel-58)
-   [uv](#tab-panel-59)

Terminal

```bash
pip install "pydantic-ai-slim[openai]"
```

Terminal

```bash
uv add "pydantic-ai-slim[openai]"
```

## Configuration

To use `OpenAIChatModel` with the OpenAI API, go to [platform.openai.com](https://platform.openai.com/) and follow your nose until you find the place to generate an API key.

## Environment variable

Once you have the API key, you can set it as an environment variable:

Terminal

```bash
export OPENAI_API_KEY='your-api-key'
```

You can then use `OpenAIChatModel` by name:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')
...
```

Or initialise the model directly with just the model name:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel

model = OpenAIChatModel('gpt-5.2')
agent = Agent(model)
...
```

By default, the `OpenAIChatModel` uses the `OpenAIProvider` with the `base_url` set to `https://api.openai.com/v1`.

## Configure the provider

If you want to pass parameters in code to the provider, you can programmatically instantiate the [OpenAIProvider](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.openai.OpenAIProvider) and pass it to the model:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

model = OpenAIChatModel('gpt-5.2', provider=OpenAIProvider(api_key='your-api-key'))
agent = Agent(model)
...
```

## Custom OpenAI Client

`OpenAIProvider` also accepts a custom `AsyncOpenAI` client via the `openai_client` parameter, so you can customise the `organization`, `project`, `base_url` etc. as defined in the [OpenAI API docs](https://platform.openai.com/docs/api-reference).

custom\_openai\_client.py

```python
from openai import AsyncOpenAI

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

client = AsyncOpenAI(max_retries=3)
model = OpenAIChatModel('gpt-5.2', provider=OpenAIProvider(openai_client=client))
agent = Agent(model)
...
```

You could also use the [`AsyncAzureOpenAI`](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/switching-endpoints) client to use the Azure OpenAI API. Note that the `AsyncAzureOpenAI` is a subclass of `AsyncOpenAI`.

```python
from openai import AsyncAzureOpenAI

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

client = AsyncAzureOpenAI(
    azure_endpoint='...',
    api_version='2024-07-01-preview',
    api_key='your-api-key',
)

model = OpenAIChatModel(
    'gpt-5.2',
    provider=OpenAIProvider(openai_client=client),
)
agent = Agent(model)
...
```

## Model settings

You can customize model behavior using [`OpenAIChatModelSettings`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIChatModelSettings):

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel, OpenAIChatModelSettings

model = OpenAIChatModel('gpt-5.2')
settings = OpenAIChatModelSettings(
    temperature=0.2,
    service_tier='flex',
)
agent = Agent(model, model_settings=settings)
...
```

### Service tier

OpenAI supports controlling the [service tier](https://platform.openai.com/docs/api-reference/chat/create#chat-create-service_tier) to trade off latency and cost. You can use the unified [`service_tier`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings.service_tier) field or the provider-specific [`openai_service_tier`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIChatModelSettings.openai_service_tier) field. Both accept `'auto'`, `'default'`, `'flex'`, and `'priority'`, passed through unchanged. `openai_service_tier` takes precedence over the unified field when both are set.

## OpenAI Responses API

Pydantic AI also supports OpenAI's [Responses API](https://platform.openai.com/docs/api-reference/responses) through [`OpenAIResponsesModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModel):

```python
from pydantic_ai import Agent

agent = Agent('openai-responses:gpt-5.2')
...
```

Or initialise the model directly with just the model name:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel

model = OpenAIResponsesModel('gpt-5.2')
agent = Agent(model)
...
```

You can learn more about the differences between the Responses API and Chat Completions API in the [OpenAI API docs](https://platform.openai.com/docs/guides/migrate-to-responses).

### Built-in tools

The Responses API has built-in tools that you can use instead of building your own:

-   [Web search](https://platform.openai.com/docs/guides/tools-web-search): allow models to search the web for the latest information before generating a response.
-   [Code interpreter](https://platform.openai.com/docs/guides/tools-code-interpreter): allow models to write and run Python code in a sandboxed environment before generating a response.
-   [Image generation](https://platform.openai.com/docs/guides/tools-image-generation): allow models to generate images based on a text prompt.
-   [File search](https://platform.openai.com/docs/guides/tools-file-search): allow models to search your files for relevant information before generating a response.
-   [Computer use](https://platform.openai.com/docs/guides/tools-computer-use): allow models to use a computer to perform tasks on your behalf.

Web search, Code interpreter, Image generation, and File search are natively supported through the [Built-in tools](/docs/ai/tools-toolsets/builtin-tools) feature.

Computer use can be enabled by passing an [`openai.types.responses.ComputerToolParam`](https://github.com/openai/openai-python/blob/main/src/openai/types/responses/computer_tool_param.py) in the `openai_builtin_tools` setting on [`OpenAIResponsesModelSettings`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings). It doesn't currently generate [`BuiltinToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolCallPart) or [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart) parts in the message history, or streamed events; please submit an issue if you need native support for this built-in tool.

computer\_use\_tool.py

```python
from openai.types.responses import ComputerToolParam

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings

model_settings = OpenAIResponsesModelSettings(
    openai_builtin_tools=[
        ComputerToolParam(
            type='computer_use',
        )
    ],
)
model = OpenAIResponsesModel('gpt-5.2')
agent = Agent(model=model, model_settings=model_settings)

result = agent.run_sync('Open a new browser tab')
print(result.output)
```

#### Referencing earlier responses

The Responses API supports referencing earlier model responses in a new request using a `previous_response_id` parameter, to ensure the full [conversation state](https://platform.openai.com/docs/guides/conversation-state?api-mode=responses#passing-context-from-the-previous-response) including [reasoning items](https://platform.openai.com/docs/guides/reasoning#keeping-reasoning-items-in-context) is kept in context without having to resend it. This is available through the [`openai_previous_response_id`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings.openai_previous_response_id) field in [`OpenAIResponsesModelSettings`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings).

When the field is set to `'auto'`, Pydantic AI automatically selects the most recent `provider_response_id` from the message history and omits messages that came before it, letting the OpenAI API reconstruct them from server-side state. The same chaining is applied inside a run across tool-call continuations and retries, so OpenAI never sees duplicate copies of the same messages.

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings

model = OpenAIResponsesModel('gpt-5.2')
agent = Agent(model=model)

result1 = agent.run_sync('Tell me a joke.')
print(result1.output)
#> Did you hear about the toothpaste scandal? They called it Colgate.

model_settings = OpenAIResponsesModelSettings(openai_previous_response_id='auto')
result2 = agent.run_sync(
    'Explain?',
    message_history=result1.new_messages(),
    model_settings=model_settings
)
print(result2.output)
#> This is an excellent joke invented by Samuel Colvin, it needs no explanation.
```

As an alternative to passing `message_history`, you can pass a concrete `provider_response_id` from an earlier run as the seed. Pydantic AI uses the seed for the first request in the new run, then automatically chains to the response returned for that request on any subsequent in-run calls -- so the chain still extends correctly if the run includes tool-call continuations or retries.

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings

model = OpenAIResponsesModel('gpt-5.2')
agent = Agent(model=model)

result = agent.run_sync('The secret is 1234')
model_settings = OpenAIResponsesModelSettings(
    openai_previous_response_id=result.all_messages()[-1].provider_response_id
)
result = agent.run_sync('What is the secret code?', model_settings=model_settings)
print(result.output)
#> 1234
```

Note

Referencing a stored response requires the response to have actually been stored. OpenAI stores responses by default; if you've disabled storage via `openai_store=False` or your organization has Zero Data Retention enabled, chaining is unavailable and the full message history must be sent on every request.

#### Using durable conversations

OpenAI's [Conversations API](https://platform.openai.com/docs/guides/conversation-state?api-mode=responses#using-the-conversations-api) works with the Responses API to persist conversation state in a durable conversation object. If you already have an OpenAI conversation ID, pass it with [`openai_conversation_id`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings.openai_conversation_id):

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings

model = OpenAIResponsesModel('gpt-5.2')
agent = Agent(model=model)

model_settings = OpenAIResponsesModelSettings(openai_conversation_id='conv_...')
result = agent.run_sync('What did we discuss last time?', model_settings=model_settings)
print(result.output)
```

When a response belongs to a conversation, Pydantic AI stores the returned ID in `ModelResponse.provider_details['conversation_id']`. Setting `openai_conversation_id='auto'` uses the most recent same-provider conversation ID from the message history and sends only the new input items after that response.

When message-level [`conversation_id`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse.conversation_id) values are available, `auto` only reuses an OpenAI conversation from the current Pydantic AI conversation; pass a concrete OpenAI conversation ID to reuse one explicitly:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings

model = OpenAIResponsesModel('gpt-5.2')
agent = Agent(model=model)

model_settings = OpenAIResponsesModelSettings(openai_conversation_id='conv_...')
result = agent.run_sync('What did we discuss last time?', model_settings=model_settings)

follow_up_settings = OpenAIResponsesModelSettings(openai_conversation_id='auto')
result2 = agent.run_sync(
    'Summarize the next step.',
    message_history=result.new_messages(),
    model_settings=follow_up_settings,
)
print(result2.output)
```

Pydantic AI does not create OpenAI conversations for you. Use the OpenAI client to create the conversation, then pass its ID to `openai_conversation_id`. The `conversation` and `previous_response_id` parameters are mutually exclusive in the OpenAI API, so `openai_conversation_id` cannot be combined with [`openai_previous_response_id`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings.openai_previous_response_id).

#### Message Compaction

The Responses API supports [compacting message history](https://developers.openai.com/api/docs/guides/compaction) to reduce token usage in long conversations. Compaction produces an encrypted summary that replaces older messages while preserving context.

The easiest way to enable compaction is with the [`OpenAICompaction`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAICompaction) capability:

openai\_compaction.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAICompaction

agent = Agent(
    'openai-responses:gpt-5.2',
    capabilities=[OpenAICompaction()],
)
```

By default, `OpenAICompaction` runs in **stateful mode**: it configures OpenAI's server-side auto-compaction via the `context_management` field on the regular `/responses` request, and OpenAI triggers compaction whenever the input token count crosses a threshold it manages for you. This mode is compatible with [`openai_previous_response_id='auto'`](#referencing-earlier-responses) and [`openai_conversation_id`](#using-durable-conversations).

To override the threshold, pass [`token_threshold`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAICompaction):

openai\_compaction\_token\_threshold.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAICompaction

agent = Agent(
    'openai-responses:gpt-5.2',
    capabilities=[OpenAICompaction(token_threshold=100_000)],
)
```

As an alternative, `OpenAICompaction` supports a **stateless mode** (`stateless=True`) that calls the stateless `/responses/compact` endpoint via a `before_model_request` hook. Use this in [ZDR](https://openai.com/enterprise-privacy/) environments where OpenAI must not retain conversation data, when using `openai_store=False`, or when you need explicit out-of-band control over when compaction runs. Stateless mode requires you to specify either a [`message_count_threshold`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAICompaction) or a custom `trigger` callable:

openai\_compaction\_stateless.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAICompaction

agent = Agent(
    'openai-responses:gpt-5.2',
    capabilities=[OpenAICompaction(message_count_threshold=20)],
)
```

The mode is inferred from which parameters you pass: supplying `message_count_threshold` or `trigger` implies stateless mode, otherwise stateful mode is used. You can also pass `stateless=True` or `stateless=False` explicitly. Mixing parameters from different modes raises [`UserError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UserError).

Tip

Stateful compaction pairs especially well with [`openai_previous_response_id='auto'`](#referencing-earlier-responses) or [`openai_conversation_id`](#using-durable-conversations). Both rely on OpenAI's server-side conversation state, so OpenAI can use a previously compacted context as the starting point for the next turn without you having to resend it.

For lower-level use cases, you can call [`compact_messages`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModel.compact_messages) directly on the model.

## OpenAI-compatible Models

Many providers and models are compatible with the OpenAI API, and can be used with `OpenAIChatModel` in Pydantic AI. Before getting started, check the [installation and configuration](#install) instructions above.

To use another OpenAI-compatible API, you can set the `OPENAI_BASE_URL` and `OPENAI_API_KEY` environment variables, or make use of the `base_url` and `api_key` arguments from [`OpenAIProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.openai.OpenAIProvider):

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

model = OpenAIChatModel(
    'model_name',
    provider=OpenAIProvider(
        base_url='https://<openai-compatible-api-endpoint>', api_key='your-api-key'
    ),
)
agent = Agent(model)
...
```

Various providers also have their own provider classes so that you don't need to specify the base URL yourself and you can use the standard `<PROVIDER>_API_KEY` environment variable to set the API key. When a provider has its own provider class, you can use the `Agent("<provider>:<model>")` shorthand, e.g. `Agent("deepseek:deepseek-chat")` or `Agent("moonshotai:kimi-k2-0711-preview")`, instead of building the `OpenAIChatModel` explicitly. Similarly, you can pass the provider name as a string to the `provider` argument on `OpenAIChatModel` instead of instantiating the provider class explicitly.

### Model Profile

Sometimes, the provider or model you're using will have slightly different requirements than OpenAI's API or models, like having different restrictions on JSON schemas for tool definitions, or not supporting tool definitions to be marked as strict.

When using an alternative provider class provided by Pydantic AI, an appropriate model profile is typically selected automatically based on the model name. If the model you're using is not working correctly out of the box, you can tweak various aspects of how model requests are constructed by providing your own [`ModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.ModelProfile) (for behaviors shared among all model classes) or [`OpenAIModelProfile`](/docs/ai/api/pydantic-ai/profiles/#pydantic_ai.profiles.openai.OpenAIModelProfile) (for behaviors specific to `OpenAIChatModel`):

```py
from pydantic_ai import Agent, InlineDefsJsonSchemaTransformer
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.profiles.openai import OpenAIModelProfile
from pydantic_ai.providers.openai import OpenAIProvider

model = OpenAIChatModel(
    'model_name',
    provider=OpenAIProvider(
        base_url='https://<openai-compatible-api-endpoint>.com', api_key='your-api-key'
    ),
    profile=OpenAIModelProfile(
        json_schema_transformer=InlineDefsJsonSchemaTransformer,  # Supported by any model class on a plain ModelProfile
        openai_supports_strict_tool_definition=False  # Supported by OpenAIModel only, requires OpenAIModelProfile
    )
)
agent = Agent(model)
```

### DeepSeek

To use the [DeepSeek](https://deepseek.com) provider, first create an API key by following the [Quick Start guide](https://api-docs.deepseek.com/).

You can then set the `DEEPSEEK_API_KEY` environment variable and use [`DeepSeekProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.deepseek.DeepSeekProvider) by name:

```python
from pydantic_ai import Agent

agent = Agent('deepseek:deepseek-chat')
...
```

Or initialise the model and provider directly:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.deepseek import DeepSeekProvider

model = OpenAIChatModel(
    'deepseek-chat',
    provider=DeepSeekProvider(api_key='your-deepseek-api-key'),
)
agent = Agent(model)
...
```

You can also customize any provider with a custom `http_client`:

```python
from httpx import AsyncClient

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.deepseek import DeepSeekProvider

custom_http_client = AsyncClient(timeout=30)
model = OpenAIChatModel(
    'deepseek-chat',
    provider=DeepSeekProvider(
        api_key='your-deepseek-api-key', http_client=custom_http_client
    ),
)
agent = Agent(model)
...
```

### Alibaba Cloud Model Studio (DashScope)

To use Qwen models via [Alibaba Cloud Model Studio (DashScope)](https://www.alibabacloud.com/en/product/modelstudio), you can set the `ALIBABA_API_KEY` (or `DASHSCOPE_API_KEY`) environment variable and use [`AlibabaProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.alibaba.AlibabaProvider) by name:

```python
from pydantic_ai import Agent

agent = Agent('alibaba:qwen-max')
...
```

Or initialise the model and provider directly:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.alibaba import AlibabaProvider

model = OpenAIChatModel(
    'qwen-max',
    provider=AlibabaProvider(api_key='your-api-key'),
)
agent = Agent(model)
...
```

The `AlibabaProvider` uses the international DashScope compatible endpoint `https://dashscope-intl.aliyuncs.com/compatible-mode/v1` by default. You can override this by passing a custom `base_url`:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.alibaba import AlibabaProvider

model = OpenAIChatModel(
    'qwen-max',
    provider=AlibabaProvider(
        api_key='your-api-key',
        base_url='https://dashscope.aliyuncs.com/compatible-mode/v1',  # China region
    ),
)
agent = Agent(model)
...
```

### Ollama

See [Ollama](/docs/ai/models/ollama) for dedicated Ollama documentation, including structured output and Ollama Cloud limitations.

### Azure AI Foundry

To use [Azure AI Foundry](https://ai.azure.com/) as your provider, set `AZURE_OPENAI_ENDPOINT` to a URL whose path ends in `/v1` (for example `https://<resource>.openai.azure.com/openai/v1/` or `https://<resource>.services.ai.azure.com/openai/v1/`), set `AZURE_OPENAI_API_KEY`, and use [`AzureProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.azure.AzureProvider) by name:

```python
from pydantic_ai import Agent

agent = Agent('azure:gpt-5.2')
...
```

Or initialise the model and provider directly:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.azure import AzureProvider

model = OpenAIChatModel(
    'gpt-5.2',
    provider=AzureProvider(
        azure_endpoint='https://your-resource.openai.azure.com/openai/v1/',
        api_key='your-api-key',
    ),
)
agent = Agent(model)
...
```

This targets the [Azure OpenAI v1 API](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/api-version-lifecycle), which Microsoft recommends for all new projects. It also pairs naturally with the Responses API -- see [Using Azure with the Responses API](#using-azure-with-the-responses-api) below.

[`AzureProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.azure.AzureProvider) also recognises [Azure AI Foundry serverless model deployments](https://learn.microsoft.com/en-us/azure/ai-foundry/foundry-models/concepts/endpoints) at `https://<model>.<region>.models.ai.azure.com` and connects to them the same way.

#### Connecting to an existing `api-version`\-based deployment

If your resource still uses the dated `api-version` API, pass `api_version` (or set the `OPENAI_API_VERSION` environment variable) and point `azure_endpoint` at the resource root instead:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.azure import AzureProvider

model = OpenAIChatModel(
    'gpt-5.2',
    provider=AzureProvider(
        azure_endpoint='https://your-resource.openai.azure.com/',
        api_version='2024-12-01-preview',
        api_key='your-api-key',
    ),
)
agent = Agent(model)
...
```

#### Using Azure with the Responses API

Azure AI Foundry also supports the OpenAI Responses API through [`OpenAIResponsesModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModel). This is particularly recommended when working with document inputs ([`DocumentUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.DocumentUrl) and [`BinaryContent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryContent)), as Azure's Chat Completions API does not support these input types.

Document processing with Azure using Responses API

```python
from pydantic_ai import Agent, BinaryContent
from pydantic_ai.models.openai import OpenAIResponsesModel
from pydantic_ai.providers.azure import AzureProvider

pdf_bytes = b'%PDF-1.4 ...'  # Your PDF content

model = OpenAIResponsesModel(
    'gpt-5.2',
    provider=AzureProvider(
        azure_endpoint='https://your-resource.openai.azure.com/openai/v1/',
        api_key='your-api-key',
    ),
)
agent = Agent(model)
result = agent.run_sync([
    'Summarize this document',
    BinaryContent(data=pdf_bytes, media_type='application/pdf'),
])
```

### Vercel AI Gateway

To use [Vercel's AI Gateway](https://vercel.com/docs/ai-gateway), first follow the [documentation](https://vercel.com/docs/ai-gateway) instructions on obtaining an API key or OIDC token.

You can set the `VERCEL_AI_GATEWAY_API_KEY` and `VERCEL_OIDC_TOKEN` environment variables and use [`VercelProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.vercel.VercelProvider) by name:

```python
from pydantic_ai import Agent

agent = Agent('vercel:anthropic/claude-sonnet-4-5')
...
```

Or initialise the model and provider directly:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.vercel import VercelProvider

model = OpenAIChatModel(
    'anthropic/claude-sonnet-4-5',
    provider=VercelProvider(api_key='your-vercel-ai-gateway-api-key'),
)
agent = Agent(model)
...
```

### MoonshotAI

Create an API key in the [Moonshot Console](https://platform.moonshot.ai/console).

You can set the `MOONSHOTAI_API_KEY` environment variable and use [`MoonshotAIProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.moonshotai.MoonshotAIProvider) by name:

```python
from pydantic_ai import Agent

agent = Agent('moonshotai:kimi-k2-0711-preview')
...
```

Or initialise the model and provider directly:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.moonshotai import MoonshotAIProvider

model = OpenAIChatModel(
    'kimi-k2-0711-preview',
    provider=MoonshotAIProvider(api_key='your-moonshot-api-key'),
)
agent = Agent(model)
...
```

### GitHub Models

To use [GitHub Models](https://docs.github.com/en/github-models), you'll need a GitHub personal access token with the `models: read` permission.

You can set the `GITHUB_API_KEY` environment variable and use [`GitHubProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.github.GitHubProvider) by name:

```python
from pydantic_ai import Agent

agent = Agent('github:xai/grok-3-mini')
...
```

Or initialise the model and provider directly:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.github import GitHubProvider

model = OpenAIChatModel(
    'xai/grok-3-mini',  # GitHub Models uses prefixed model names
    provider=GitHubProvider(api_key='your-github-token'),
)
agent = Agent(model)
...
```

GitHub Models supports various model families with different prefixes. You can see the full list on the [GitHub Marketplace](https://github.com/marketplace?type=models) or the public [catalog endpoint](https://models.github.ai/catalog/models).

### Perplexity

Follow the Perplexity [getting started](https://docs.perplexity.ai/guides/getting-started) guide to create an API key, then initialise the model and provider directly:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

model = OpenAIChatModel(
    'sonar-pro',
    provider=OpenAIProvider(
        base_url='https://api.perplexity.ai',
        api_key='your-perplexity-api-key',
    ),
)
agent = Agent(model)
...
```

### Fireworks AI

Go to [Fireworks.AI](https://fireworks.ai/) and create an API key in your account settings.

You can set the `FIREWORKS_API_KEY` environment variable and use [`FireworksProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.fireworks.FireworksProvider) by name:

```python
from pydantic_ai import Agent

agent = Agent('fireworks:accounts/fireworks/models/qwq-32b')
...
```

Or initialise the model and provider directly:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.fireworks import FireworksProvider

model = OpenAIChatModel(
    'accounts/fireworks/models/qwq-32b',  # model library available at https://fireworks.ai/models
    provider=FireworksProvider(api_key='your-fireworks-api-key'),
)
agent = Agent(model)
...
```

### Together AI

Go to [Together.ai](https://www.together.ai/) and create an API key in your account settings.

You can set the `TOGETHER_API_KEY` environment variable and use [`TogetherProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.together.TogetherProvider) by name:

```python
from pydantic_ai import Agent

agent = Agent('together:meta-llama/Llama-3.3-70B-Instruct-Turbo-Free')
...
```

Or initialise the model and provider directly:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.together import TogetherProvider

model = OpenAIChatModel(
    'meta-llama/Llama-3.3-70B-Instruct-Turbo-Free',  # model library available at https://www.together.ai/models
    provider=TogetherProvider(api_key='your-together-api-key'),
)
agent = Agent(model)
...
```

### Heroku AI

To use [Heroku AI](https://www.heroku.com/ai), first create an API key.

You can set the `HEROKU_INFERENCE_KEY` and (optionally) `HEROKU_INFERENCE_URL` environment variables and use [`HerokuProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.heroku.HerokuProvider) by name:

```python
from pydantic_ai import Agent

agent = Agent('heroku:claude-sonnet-4-5')
...
```

Or initialise the model and provider directly:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.heroku import HerokuProvider

model = OpenAIChatModel(
    'claude-sonnet-4-5',
    provider=HerokuProvider(api_key='your-heroku-inference-key'),
)
agent = Agent(model)
...
```

### LiteLLM

To use [LiteLLM](https://www.litellm.ai/), set the configs as outlined in the [doc](https://docs.litellm.ai/docs/set_keys). In `LiteLLMProvider`, you can pass `api_base` and `api_key`. The value of these configs will depend on your setup. For example, if you are using OpenAI models, then you need to pass `https://api.openai.com/v1` as the `api_base` and your OpenAI API key as the `api_key`. If you are using a LiteLLM proxy server running on your local machine, then you need to pass `http://localhost:<port>` as the `api_base` and your LiteLLM API key (or a placeholder) as the `api_key`.

To use custom LLMs, use `custom/` prefix in the model name.

Once you have the configs, use the [`LiteLLMProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.litellm.LiteLLMProvider) as follows:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.litellm import LiteLLMProvider

model = OpenAIChatModel(
    'openai/gpt-5.2',
    provider=LiteLLMProvider(
        api_base='<api-base-url>',
        api_key='<api-key>'
    )
)
agent = Agent(model)

result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
...
```

### Nebius AI Studio

Go to [Nebius AI Studio](https://studio.nebius.com/) and create an API key.

You can set the `NEBIUS_API_KEY` environment variable and use [`NebiusProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.nebius.NebiusProvider) by name:

```python
from pydantic_ai import Agent

agent = Agent('nebius:Qwen/Qwen3-32B-fast')
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

Or initialise the model and provider directly:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.nebius import NebiusProvider

model = OpenAIChatModel(
    'Qwen/Qwen3-32B-fast',
    provider=NebiusProvider(api_key='your-nebius-api-key'),
)
agent = Agent(model)
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

### OVHcloud AI Endpoints

To use OVHcloud AI Endpoints, you need to create a new API key. To do so, go to the [OVHcloud manager](https://ovh.com/manager), then in Public Cloud > AI Endpoints > API keys. Click on `Create a new API key` and copy your new key.

You can explore the [catalog](https://endpoints.ai.cloud.ovh.net/catalog) to find which models are available.

You can set the `OVHCLOUD_API_KEY` environment variable and use [`OVHcloudProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.ovhcloud.OVHcloudProvider) by name:

```python
from pydantic_ai import Agent

agent = Agent('ovhcloud:gpt-oss-120b')
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

If you need to configure the provider, you can use the [`OVHcloudProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.ovhcloud.OVHcloudProvider) class:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.ovhcloud import OVHcloudProvider

model = OpenAIChatModel(
    'gpt-oss-120b',
    provider=OVHcloudProvider(api_key='your-api-key'),
)
agent = Agent(model)
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

### SambaNova

To use [SambaNova Cloud](https://cloud.sambanova.ai/), you need to obtain an API key from the [SambaNova Cloud dashboard](https://cloud.sambanova.ai/dashboard).

SambaNova provides access to multiple model families including Meta Llama, DeepSeek, Qwen, and Mistral models with fast inference speeds.

You can set the `SAMBANOVA_API_KEY` environment variable and use [`SambaNovaProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.sambanova.SambaNovaProvider) by name:

```python
from pydantic_ai import Agent

agent = Agent('sambanova:Meta-Llama-3.1-8B-Instruct')
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

Or initialise the model and provider directly:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.sambanova import SambaNovaProvider

model = OpenAIChatModel(
    'Meta-Llama-3.1-8B-Instruct',
    provider=SambaNovaProvider(api_key='your-api-key'),
)
agent = Agent(model)
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

For a complete list of available models, see the [SambaNova supported models documentation](https://docs.sambanova.ai/docs/en/models/sambacloud-models).

You can customize the base URL if needed:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.sambanova import SambaNovaProvider

model = OpenAIChatModel(
    'DeepSeek-R1-0528',
    provider=SambaNovaProvider(
        api_key='your-api-key',
        base_url='https://custom.endpoint.com/v1',
    ),
)
agent = Agent(model)
...
```

---

# [OpenRouter](https://pydantic.dev/docs/ai/models/openrouter/)

# OpenRouter

## Install

To use `OpenRouterModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `openrouter` optional group:

-   [pip](#tab-panel-80)
-   [uv](#tab-panel-81)

Terminal

```bash
pip install "pydantic-ai-slim[openrouter]"
```

Terminal

```bash
uv add "pydantic-ai-slim[openrouter]"
```

## Configuration

To use [OpenRouter](https://openrouter.ai), first create an API key at [openrouter.ai/keys](https://openrouter.ai/keys).

You can set the `OPENROUTER_API_KEY` environment variable and use [`OpenRouterProvider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.openrouter.OpenRouterProvider) by name:

```python
from pydantic_ai import Agent

agent = Agent('openrouter:anthropic/claude-sonnet-4-5')
...
```

Or initialise the model and provider directly:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openrouter import OpenRouterModel
from pydantic_ai.providers.openrouter import OpenRouterProvider

model = OpenRouterModel(
    'anthropic/claude-sonnet-4-5',
    provider=OpenRouterProvider(api_key='your-openrouter-api-key'),
)
agent = Agent(model)
...
```

## App Attribution

OpenRouter has an [app attribution](https://openrouter.ai/docs/app-attribution) feature to track your application in their public ranking and analytics.

You can pass in an `app_url` and `app_title` when initializing the provider to enable app attribution.

```python
from pydantic_ai.providers.openrouter import OpenRouterProvider

provider=OpenRouterProvider(
    api_key='your-openrouter-api-key',
    app_url='https://your-app.com',
    app_title='Your App',
),
...
```

## Model Settings

You can customize model behavior using [`OpenRouterModelSettings`](/docs/ai/api/models/openrouter/#pydantic_ai.models.openrouter.OpenRouterModelSettings):

```python
from pydantic_ai import Agent
from pydantic_ai.models.openrouter import OpenRouterModel, OpenRouterModelSettings

settings = OpenRouterModelSettings(
    openrouter_reasoning={
        'effort': 'high',
    },
    openrouter_usage={
        'include': True,
    }
)
model = OpenRouterModel('openai/gpt-5.2')
agent = Agent(model, model_settings=settings)
...
```

## Web Search

OpenRouter supports web search via its [plugins](https://openrouter.ai/docs/guides/features/plugins/web-search). You can enable it using the [`WebSearchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.WebSearchTool).

### Web Search Parameters

You can customize the web search behavior using the `search_context_size` parameter on [`WebSearchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.WebSearchTool):

```python
from pydantic_ai import Agent
from pydantic_ai.builtin_tools import WebSearchTool
from pydantic_ai.models.openrouter import OpenRouterModel

tool = WebSearchTool(search_context_size='high')
model = OpenRouterModel('openai/gpt-4.1')
agent = Agent(
    model,
    builtin_tools=[tool]
)
result = agent.run_sync('What is the latest news in AI?')
```

---

# [Outlines](https://pydantic.dev/docs/ai/models/outlines/)

# Outlines

## Install

As Outlines is a library allowing you to run models from various different providers, it does not include the necessary dependencies for any provider by default. As a result, to use the [`OutlinesModel`](/docs/ai/api/models/outlines/#pydantic_ai.models.outlines.OutlinesModel), you must install `pydantic-ai-slim` with an optional group composed of outlines, a dash, and the name of the specific model provider you would use through Outlines. For instance:

-   [pip](#tab-panel-82)
-   [uv](#tab-panel-83)

Terminal

```bash
pip install "pydantic-ai-slim[outlines-transformers]"
```

Terminal

```bash
uv add "pydantic-ai-slim[outlines-transformers]"
```

Or

-   [pip](#tab-panel-84)
-   [uv](#tab-panel-85)

Terminal

```bash
pip install "pydantic-ai-slim[outlines-mlxlm]"
```

Terminal

```bash
uv add "pydantic-ai-slim[outlines-mlxlm]"
```

There are 5 optional groups for the 5 model providers supported through Outlines:

-   `outlines-transformers`
-   `outlines-llamacpp`
-   `outlines-mlxlm`
-   `outlines-sglang`
-   `outlines-vllm-offline`

## Model Initialization

As Outlines is not an inference provider, but instead a library allowing you to run both local and API-based models, instantiating the model is a bit different from the other models available on Pydantic AI.

To initialize the `OutlinesModel` through the `__init__` method, the first argument you must provide has to be an `outlines.Model` or an `outlines.AsyncModel` instance.

For instance:

```python
import outlines
from transformers import AutoModelForCausalLM, AutoTokenizer

from pydantic_ai.models.outlines import OutlinesModel

outlines_model = outlines.from_transformers(
    AutoModelForCausalLM.from_pretrained('erwanf/gpt2-mini'),
    AutoTokenizer.from_pretrained('erwanf/gpt2-mini')
)
model = OutlinesModel(outlines_model)
```

As you already providing an Outlines model instance, there is no need to provide an `OutlinesProvider` yourself.

### Model Loading Methods

Alternatively, you can use some `OutlinesModel` class methods made to load a specific type of Outlines model directly. To do so, you must provide as arguments the same arguments you would have given to the associated Outlines model loading function (except in the case of SGLang).

There are methods for the 5 Outlines models that are officially supported in the integration into Pydantic AI:

-   [`from_transformers`](/docs/ai/api/models/outlines/#pydantic_ai.models.outlines.OutlinesModel.from_transformers)
-   [`from_llamacpp`](/docs/ai/api/models/outlines/#pydantic_ai.models.outlines.OutlinesModel.from_llamacpp)
-   [`from_mlxlm`](/docs/ai/api/models/outlines/#pydantic_ai.models.outlines.OutlinesModel.from_mlxlm)
-   [`from_sglang`](/docs/ai/api/models/outlines/#pydantic_ai.models.outlines.OutlinesModel.from_sglang)
-   [`from_vllm_offline`](/docs/ai/api/models/outlines/#pydantic_ai.models.outlines.OutlinesModel.from_vllm_offline)

#### Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

from pydantic_ai.models.outlines import OutlinesModel

model = OutlinesModel.from_transformers(
    AutoModelForCausalLM.from_pretrained('microsoft/Phi-3-mini-4k-instruct'),
    AutoTokenizer.from_pretrained('microsoft/Phi-3-mini-4k-instruct')
)
```

#### LlamaCpp

```python
from llama_cpp import Llama

from pydantic_ai.models.outlines import OutlinesModel

model = OutlinesModel.from_llamacpp(
    Llama.from_pretrained(
        repo_id='TheBloke/Mistral-7B-Instruct-v0.2-GGUF',
        filename='mistral-7b-instruct-v0.2.Q5_K_M.gguf',
    )
)
```

#### MLXLM

```python
from mlx_lm import load

from pydantic_ai.models.outlines import OutlinesModel

model = OutlinesModel.from_mlxlm(
    *load('mlx-community/TinyLlama-1.1B-Chat-v1.0-4bit')
)
```

#### SGLang

```python
from pydantic_ai.models.outlines import OutlinesModel

model = OutlinesModel.from_sglang(
    'http://localhost:11434',
    'api_key',
    'meta-llama/Llama-3.1-8B'
)
```

#### vLLM Offline

```python
from vllm import LLM

from pydantic_ai.models.outlines import OutlinesModel

model = OutlinesModel.from_vllm_offline(
    LLM('microsoft/Phi-3-mini-4k-instruct')
)
```

## Running the model

Once you have initialized an `OutlinesModel`, you can use it with an Agent as with all other Pydantic AI models.

As Outlines is focused on structured output, this provider supports the `output_type` component through the [`NativeOutput`](/docs/ai/api/pydantic-ai/output/#pydantic_ai.output.NativeOutput) format. There is not need to include information on the required output format in your prompt, instructions based on the `output_type` will be included automatically.

```python
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer

from pydantic_ai import Agent, ModelSettings
from pydantic_ai.models.outlines import OutlinesModel


class Box(BaseModel):
    """Class representing a box"""
    width: int
    height: int
    depth: int
    units: str

model = OutlinesModel.from_transformers(
    AutoModelForCausalLM.from_pretrained('microsoft/Phi-3-mini-4k-instruct'),
    AutoTokenizer.from_pretrained('microsoft/Phi-3-mini-4k-instruct')
)
agent = Agent(model, output_type=Box)

result = agent.run_sync(
    'Give me the dimensions of a box',
    model_settings=ModelSettings(extra_body={'max_new_tokens': 100})
)
print(result.output) # width=20 height=30 depth=40 units='cm'
```

Outlines does not support tools yet, but support for that feature will be added in the near future.

## Multimodal models

If the model you are running through Outlines and the provider selected supports it, you can include images in your prompts using [`ImageUrl`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ImageUrl) or [`BinaryImage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryImage). In that case, the prompt you provide when running the agent should be a list containing a string and one or several images. See the [input documentation](/docs/ai/advanced-features/input) for details and examples on using assets in model inputs.

This feature is supported in Outlines for the `SGLang` and `Transformers` models. If you want to run a multimodal model through `transformers`, you must provide a processor instead of a tokenizer as the second argument when initializing the model with the `OutlinesModel.from_transformers` method.

```python
from datetime import date
from typing import Literal

import torch
from pydantic import BaseModel
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

from pydantic_ai import Agent, ModelSettings
from pydantic_ai.messages import ImageUrl
from pydantic_ai.models.outlines import OutlinesModel

MODEL_NAME = 'Qwen/Qwen2-VL-7B-Instruct'

class Item(BaseModel):
    name: str
    quantity: int | None
    price_per_unit: float | None
    total_price: float | None

class ReceiptSummary(BaseModel):
    store_name: str
    store_address: str
    store_number: int | None
    items: list[Item]
    tax: float | None
    total: float | None
    date: date
    payment_method: Literal['cash', 'credit', 'debit', 'check', 'other']

tf_model = Qwen2VLForConditionalGeneration.from_pretrained(
    MODEL_NAME,
    device_map='auto',
    dtype=torch.bfloat16
)
tf_processor = AutoProcessor.from_pretrained(
    MODEL_NAME,
    device_map='auto'
)
model = OutlinesModel.from_transformers(tf_model, tf_processor)

agent = Agent(model, output_type=ReceiptSummary)

result = agent.run_sync(
    [
        'You are an expert at extracting information from receipts. Please extract the information from the receipt. Be as detailed as possible, do not miss any information',
        ImageUrl('https://raw.githubusercontent.com/dottxt-ai/outlines/refs/heads/main/docs/examples/images/trader-joes-receipt.jpg')
    ],
    model_settings=ModelSettings(extra_body={'max_new_tokens': 1000})
)
print(result.output)
# store_name="Trader Joe's"
# store_address='401 Bay Street, San Francisco, CA 94133'
# store_number=0
# items=[
#   Item(name='BANANA EACH', quantity=7, price_per_unit=0.23, total_price=1.61),
#   Item(name='BAREBELLS CHOCOLATE DOUG',quantity=1, price_per_unit=2.29, total_price=2.29),
#   Item(name='BAREBELLS CREAMY CRISP', quantity=1, price_per_unit=2.29, total_price=2.29),
#   Item(name='BAREBELLS CHOCOLATE DOUG', quantity=1, price_per_unit=2.29, total_price=2.29),
#   Item(name='BAREBELLS CARAMEL CASHEW', quantity=2, price_per_unit=2.29, total_price=4.58),
#   Item(name='BAREBELLS CREAMY CRISP', quantity=1, price_per_unit=2.29, total_price=2.29),
#   Item(name='T SPINDRIFT ORANGE MANGO 8', quantity=1, price_per_unit=7.49, total_price=7.49),
#   Item(name='T Bottle Deposit', quantity=8, price_per_unit=0.05, total_price=0.4),
#   Item(name='MILK ORGANIC GALLON WHOL', quantity=1, price_per_unit=6.79, total_price=6.79),
#   Item(name='CLASSIC GREEK SALAD', quantity=1, price_per_unit=3.49, total_price=3.49),
#   Item(name='COBB SALAD', quantity=1, price_per_unit=5.99, total_price=5.99),
#   Item(name='PEPPER BELL RED XL EACH', quantity=1, price_per_unit=1.29, total_price=1.29),
#   Item(name='BAG FEE.', quantity=1, price_per_unit=0.25, total_price=0.25),
#   Item(name='BAG FEE.', quantity=1, price_per_unit=0.25, total_price=0.25)]
# tax=7.89
# total=41.98
# date='2023-04-01'
# payment_method='credit'

```

---

# [Overview](https://pydantic.dev/docs/ai/models/overview/)

# Overview

Pydantic AI is model-agnostic and has built-in support for multiple model providers:

-   [OpenAI](/docs/ai/models/openai)
-   [Anthropic](/docs/ai/models/anthropic)
-   [Gemini](/docs/ai/models/google) (via two different APIs: Generative Language API and VertexAI API)
-   [xAI](/docs/ai/models/xai)
-   [Bedrock](/docs/ai/models/bedrock)
-   [Cerebras](/docs/ai/models/cerebras)
-   [Cohere](/docs/ai/models/cohere)
-   [Groq](/docs/ai/models/groq)
-   [Hugging Face](/docs/ai/models/huggingface)
-   [Mistral](/docs/ai/models/mistral)
-   [OpenRouter](/docs/ai/models/openrouter)
-   [Outlines](/docs/ai/models/outlines)

## OpenAI-compatible Providers

In addition, many providers are compatible with the OpenAI API, and can be used with `OpenAIChatModel` in Pydantic AI:

-   [Alibaba Cloud Model Studio (DashScope)](/docs/ai/models/openai#alibaba-cloud-model-studio-dashscope)
-   [Azure AI Foundry](/docs/ai/models/openai#azure-ai-foundry)
-   [DeepSeek](/docs/ai/models/openai#deepseek)
-   [Fireworks AI](/docs/ai/models/openai#fireworks-ai)
-   [GitHub Models](/docs/ai/models/openai#github-models)
-   [Heroku](/docs/ai/models/openai#heroku-ai)
-   [LiteLLM](/docs/ai/models/openai#litellm)
-   [Nebius AI Studio](/docs/ai/models/openai#nebius-ai-studio)
-   [Ollama](/docs/ai/models/openai#ollama)
-   [OVHcloud AI Endpoints](/docs/ai/models/openai#ovhcloud-ai-endpoints)
-   [Perplexity](/docs/ai/models/openai#perplexity)
-   [SambaNova](/docs/ai/models/openai#sambanova)
-   [Together AI](/docs/ai/models/openai#together-ai)
-   [Vercel AI Gateway](/docs/ai/models/openai#vercel-ai-gateway)

Pydantic AI also comes with [`TestModel`](/docs/ai/models/test) and [`FunctionModel`](/docs/ai/models/function) for testing and development.

To use each model provider, you need to configure your local environment and make sure you have the right packages installed. If you try to use the model without having done so, you'll be told what to install.

## Models and Providers

Pydantic AI uses a few key terms to describe how it interacts with different LLMs:

-   **Model**: This refers to the Pydantic AI class used to make requests following a specific LLM API (generally by wrapping a vendor-provided SDK, like the `openai` python SDK). These classes implement a vendor-SDK-agnostic API, ensuring a single Pydantic AI agent is portable to different LLM vendors without any other code changes just by swapping out the Model it uses. Model classes are named roughly in the format `<VendorSdk>Model`, for example, we have `OpenAIChatModel`, `AnthropicModel`, `GoogleModel`, etc. When using a Model class, you specify the actual LLM model name (e.g., `gpt-5`, `claude-sonnet-4-5`, `gemini-3-flash-preview`) as a parameter.
-   **Provider**: This refers to provider-specific classes which handle the authentication and connections to an LLM vendor. Passing a non-default _Provider_ as a parameter to a Model is how you can ensure that your agent will make requests to a specific endpoint, or make use of a specific approach to authentication (e.g., you can use Azure auth with the `OpenAIChatModel` by way of the `AzureProvider`). In particular, this is how you can make use of an AI gateway, or an LLM vendor that offers API compatibility with the vendor SDK used by an existing Model (such as `OpenAIChatModel`).
-   **Profile**: This refers to a description of how requests to a specific model or family of models need to be constructed to get the best results, independent of the model and provider classes used. For example, different models have different restrictions on the JSON schemas that can be used for tools, and the same schema transformer needs to be used for Gemini models whether you're using `GoogleModel` with model name `gemini-3-pro-preview`, or `OpenAIChatModel` with `OpenRouterProvider` and model name `google/gemini-3-pro-preview`.

When you instantiate an [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) with just a name formatted as `<provider>:<model>`, e.g. `openai:gpt-5.2` or `openrouter:google/gemini-3-pro-preview`, Pydantic AI will automatically select the appropriate model class, provider, and profile. If you want to use a different provider or profile, you can instantiate a model class directly and pass in `provider` and/or `profile` arguments.

## HTTP Client Lifecycle

When a [`Provider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.Provider) creates its own HTTP client (i.e. you don't pass a custom `http_client`), it owns that client's lifecycle. Using the [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) as an async context manager ensures the HTTP client is closed cleanly on exit:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

async def main():
    async with agent:
        result = await agent.run('What is the capital of France?')
        print(result.output)
        #> The capital of France is Paris.
```

You can also use a [`Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) or [`Provider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.Provider) directly as an async context manager for the same effect.

If you provide your own `http_client`, you are responsible for closing it yourself.

## Custom Models

Note

If a model API is compatible with the OpenAI API, you do not need a custom model class and can provide your own [custom provider](/docs/ai/models/openai#openai-compatible-models) instead.

To implement support for a model API that's not already supported, you will need to subclass the [`Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) abstract base class. For streaming, you'll also need to implement the [`StreamedResponse`](/docs/ai/api/models/base/#pydantic_ai.models.StreamedResponse) abstract base class.

The best place to start is to review the source code for existing implementations, e.g. [`OpenAIChatModel`](https://github.com/pydantic/pydantic-ai/blob/main/pydantic_ai_slim/pydantic_ai/models/openai.py).

For details on when we'll accept contributions adding new models to Pydantic AI, see the [contributing guidelines](/docs/ai/project/contributing#new-model-rules).

## HTTP Request Concurrency

You can limit the number of concurrent HTTP requests to a model using the [`ConcurrencyLimitedModel`](/docs/ai/api/pydantic-ai/concurrency/#pydantic_ai.ConcurrencyLimitedModel) wrapper. This is useful for respecting rate limits or managing resource usage when running many agents in parallel.

model\_concurrency.py

```python
import asyncio

from pydantic_ai import Agent, ConcurrencyLimitedModel

# Wrap a model with concurrency limiting
model = ConcurrencyLimitedModel('openai:gpt-4o', limiter=5)

# Multiple agents can share this rate-limited model
agent = Agent(model)


async def main():
    # These will be rate-limited to 5 concurrent HTTP requests
    results = await asyncio.gather(
        *[agent.run(f'Question {i}') for i in range(20)]
    )
    print(len(results))
    #> 20
```

The `limiter` parameter accepts:

-   An integer for simple limiting (e.g., `limiter=5`)
-   A [`ConcurrencyLimit`](/docs/ai/api/pydantic-ai/concurrency/#pydantic_ai.ConcurrencyLimit) for advanced configuration with backpressure control
-   A [`ConcurrencyLimiter`](/docs/ai/api/pydantic-ai/concurrency/#pydantic_ai.ConcurrencyLimiter) for sharing limits across multiple models

### Shared Concurrency Limits

To share a concurrency limit across multiple models (e.g., different models from the same provider), you can create a [`ConcurrencyLimiter`](/docs/ai/api/pydantic-ai/concurrency/#pydantic_ai.ConcurrencyLimiter) and pass it to multiple `ConcurrencyLimitedModel` instances:

shared\_concurrency.py

```python
import asyncio

from pydantic_ai import Agent, ConcurrencyLimitedModel, ConcurrencyLimiter

# Create a shared limiter with a descriptive name
shared_limiter = ConcurrencyLimiter(max_running=10, name='openai-pool')

# Both models share the same concurrency limit
model1 = ConcurrencyLimitedModel('openai:gpt-4o', limiter=shared_limiter)
model2 = ConcurrencyLimitedModel('openai:gpt-4o-mini', limiter=shared_limiter)

agent1 = Agent(model1)
agent2 = Agent(model2)


async def main():
    # Total concurrent requests across both agents limited to 10
    results = await asyncio.gather(
        *[agent1.run(f'Question {i}') for i in range(10)],
        *[agent2.run(f'Question {i}') for i in range(10)],
    )
    print(len(results))
    #> 20
```

When instrumentation is enabled, requests waiting for a concurrency slot appear as spans with attributes showing the queue depth and configured limits. The `name` parameter on `ConcurrencyLimiter` helps identify shared limiters in traces.

## Fallback Model

You can use [`FallbackModel`](/docs/ai/api/models/fallback/#pydantic_ai.models.fallback.FallbackModel) to attempt multiple models in sequence until one succeeds. Pydantic AI can switch to the next model when the current model raises an exception (like a 4xx/5xx API error) **or** when the response content indicates a semantic failure (like a truncated response or a failed built-in tool call).

By default, fallback triggers on [`ModelAPIError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelAPIError) (4xx/5xx API errors), so you don't need to configure anything for the most common use case.

This behavior is controlled by the `fallback_on` parameter (see [`FallbackModel`](/docs/ai/api/models/fallback/#pydantic_ai.models.fallback.FallbackModel)), which accepts exception types, exception handlers, and response handlers -- all of which can be sync or async.

Note

The provider SDKs on which Models are based (like OpenAI, Anthropic, etc.) often have built-in retry logic that can delay the `FallbackModel` from activating.

When using `FallbackModel`, it's recommended to disable provider SDK retries to ensure immediate fallback, for example by setting `max_retries=0` on a [custom OpenAI client](/docs/ai/models/openai#custom-openai-client).

In the following example, the agent first makes a request to the OpenAI model (which fails due to an invalid API key), and then falls back to the Anthropic model.

fallback\_model.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.fallback import FallbackModel
from pydantic_ai.models.openai import OpenAIChatModel

openai_model = OpenAIChatModel('gpt-5.2')
anthropic_model = AnthropicModel('claude-sonnet-4-5')
fallback_model = FallbackModel(openai_model, anthropic_model)

agent = Agent(fallback_model)
response = agent.run_sync('What is the capital of France?')
print(response.data)
#> Paris

print(response.all_messages())
"""
[
    ModelRequest(
        parts=[
            UserPromptPart(
                content='What is the capital of France?',
                timestamp=datetime.datetime(...),
                part_kind='user-prompt',
            )
        ],
        kind='request',
    ),
    ModelResponse(
        parts=[TextPart(content='Paris', part_kind='text')],
        model_name='claude-sonnet-4-5',
        timestamp=datetime.datetime(...),
        kind='response',
        provider_response_id=None,
    ),
]
"""
```

The `ModelResponse` message above indicates in the `model_name` field that the output was returned by the Anthropic model, which is the second model specified in the `FallbackModel`.

Note

Each model's options should be configured individually. For example, `base_url`, `api_key`, and custom clients should be set on each model itself, not on the `FallbackModel`.

### Per-Model Settings

You can configure different [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) for each model in a fallback chain by passing the `settings` parameter when creating each model. This is particularly useful when different providers have different optimal configurations:

fallback\_model\_per\_settings.py

```python
from pydantic_ai import Agent, ModelSettings
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.fallback import FallbackModel
from pydantic_ai.models.openai import OpenAIChatModel

# Configure each model with provider-specific optimal settings
openai_model = OpenAIChatModel(
    'gpt-5.2',
    settings=ModelSettings(temperature=0.7, max_tokens=1000)  # Higher creativity for OpenAI
)
anthropic_model = AnthropicModel(
    'claude-sonnet-4-5',
    settings=ModelSettings(temperature=0.2, max_tokens=1000)  # Lower temperature for consistency
)

fallback_model = FallbackModel(openai_model, anthropic_model)
agent = Agent(fallback_model)

result = agent.run_sync('Write a creative story about space exploration')
print(result.output)
"""
In the year 2157, Captain Maya Chen piloted her spacecraft through the vast expanse of the Andromeda Galaxy. As she discovered a planet with crystalline mountains that sang in harmony with the cosmic winds, she realized that space exploration was not just about finding new worlds, but about finding new ways to understand the universe and our place within it.
"""
```

In this example, if the OpenAI model fails, the agent will automatically fall back to the Anthropic model with its own configured settings. The `FallbackModel` itself doesn't have settings - it uses the individual settings of whichever model successfully handles the request.

### Exception Handling

The next example demonstrates the exception-handling capabilities of `FallbackModel`. If all models fail, a [`FallbackExceptionGroup`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.FallbackExceptionGroup) is raised, which contains all the exceptions encountered during the `run` execution.

-   [Python >=3.11](#tab-panel-54)
-   [Python <3.11](#tab-panel-55)

fallback\_model\_failure.py

```python
from pydantic_ai import Agent, ModelAPIError
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.fallback import FallbackModel
from pydantic_ai.models.openai import OpenAIChatModel

openai_model = OpenAIChatModel('gpt-5.2')
anthropic_model = AnthropicModel('claude-sonnet-4-5')
fallback_model = FallbackModel(openai_model, anthropic_model)

agent = Agent(fallback_model)
try:
    response = agent.run_sync('What is the capital of France?')
except* ModelAPIError as exc_group:
    for exc in exc_group.exceptions:
        print(exc)
```

Since [`except*`](https://docs.python.org/3/reference/compound_stmts.html#except-star) is only supported in Python 3.11+, we use the [`exceptiongroup`](https://github.com/agronholm/exceptiongroup) backport package for earlier Python versions:

fallback\_model\_failure.py

```python
from exceptiongroup import catch

from pydantic_ai import Agent, ModelAPIError
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.fallback import FallbackModel
from pydantic_ai.models.openai import OpenAIChatModel


def model_status_error_handler(exc_group: BaseExceptionGroup) -> None:
    for exc in exc_group.exceptions:
        print(exc)


openai_model = OpenAIChatModel('gpt-5.2')
anthropic_model = AnthropicModel('claude-sonnet-4-5')
fallback_model = FallbackModel(openai_model, anthropic_model)

agent = Agent(fallback_model)
with catch({ModelAPIError: model_status_error_handler}):
    response = agent.run_sync('What is the capital of France?')
```

By default, the `FallbackModel` only moves on to the next model if the current model raises a [`ModelAPIError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelAPIError), which includes [`ModelHTTPError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelHTTPError). You can customize this behavior by passing a custom `fallback_on` argument to the `FallbackModel` constructor.

Note

Validation errors (from [structured output](/docs/ai/core-concepts/output#structured-output) or [tool parameters](/docs/ai/tools-toolsets/tools)) do **not** trigger fallback. These errors use the [retry mechanism](/docs/ai/core-concepts/agent#reflection-and-self-correction) instead, which re-prompts the same model to try again. This is intentional: validation errors stem from the non-deterministic nature of LLMs and may succeed on retry, whereas API errors (4xx/5xx) generally indicate issues that won't resolve by retrying the same request.

### Response-Based Fallback

In addition to exception-based fallback, you can also trigger fallback based on the **content** of a model's response. This is useful when a model returns a successful HTTP response (no exception), but the response content indicates a semantic failure -- for example, an unexpected finish reason or a built-in tool reporting failure.

Non-streaming only

Response-based fallback currently only works with non-streaming requests (`agent.run()` and `agent.run_sync()`). For streaming requests (`agent.run_stream()`), only exception-based fallback is supported.

The `fallback_on` parameter accepts:

-   A tuple of exception types: `(ModelAPIError, ModelHTTPError)`
-   An exception handler (sync or async): `lambda exc: isinstance(exc, MyError)`
-   A response handler (sync or async): `def check(r: ModelResponse) -> bool`
-   A list mixing all of the above: `[ModelAPIError, exc_handler, response_handler]`

Handler type is auto-detected by inspecting type hints on the first parameter. If the first parameter is hinted as [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse), it's a response handler. Otherwise (including untyped handlers and lambdas), it's an exception handler.

#### Finish Reason Example

A simple use case is checking the model's finish reason -- for example, falling back if the response was truncated due to length limits:

fallback\_on\_finish\_reason.py

```python
from pydantic_ai import Agent
from pydantic_ai.messages import FinishReason, ModelResponse
from pydantic_ai.models.fallback import FallbackModel


def bad_finish_reason(response: ModelResponse) -> bool:
    """Fallback if the model stopped due to length limit, content filter, or error."""
    reason: FinishReason | None = response.finish_reason
    # Trigger fallback for problematic finish reasons
    return reason in ('length', 'content_filter', 'error')


fallback_model = FallbackModel(
    'openai:gpt-5.2',
    'anthropic:claude-sonnet-4-5',
    fallback_on=bad_finish_reason,
)

agent = Agent(fallback_model)
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

Solo response handlers replace default exception fallback

When you pass a single response handler as `fallback_on` (as above), it **replaces** the default `(ModelAPIError,)` exception fallback entirely. This means API errors (4xx/5xx) will propagate as exceptions instead of triggering fallback to the next model.

To keep exception-based fallback alongside a response handler, pass them together as a list -- see the [mixed example below](#combining-handlers).

Note

Note that Pydantic AI already handles some finish reasons automatically in the [agent loop](/docs/ai/core-concepts/agent): responses with a `'length'` or `'content_filter'` finish reason raise exceptions (which `FallbackModel` catches by default), and empty responses are retried. A response handler is useful for custom checks beyond these built-in behaviors.

#### Built-in Tool Failure Example

A more complex use case is when using built-in tools like web search or URL fetching. For example, Google's [`WebFetchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.WebFetchTool) may return a successful response with a status indicating the URL fetch failed:

fallback\_on\_builtin\_tool.py

```python
from pydantic_ai import Agent
from pydantic_ai.messages import ModelResponse
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.fallback import FallbackModel
from pydantic_ai.models.google import GoogleModel


def web_fetch_failed(response: ModelResponse) -> bool:
    """Check if a web_fetch built-in tool failed to retrieve content."""
    for call, result in response.builtin_tool_calls:
        if call.tool_name != 'web_fetch':
            continue
        if not isinstance(result.content, list):
            continue
        for item in result.content:
            if isinstance(item, dict):
                status = item.get('url_retrieval_status', '')
                if status and status != 'URL_RETRIEVAL_STATUS_SUCCESS':
                    return True
    return False


google_model = GoogleModel('gemini-2.5-flash')
anthropic_model = AnthropicModel('claude-sonnet-4-5')

# Auto-detected as response handler via type hint
fallback_model = FallbackModel(
    google_model,
    anthropic_model,
    fallback_on=web_fetch_failed,
)

agent = Agent(fallback_model)

# If Google's web_fetch fails, automatically falls back to Anthropic
result = agent.run_sync('Summarize https://ai.pydantic.dev')
print(result.output)
"""
Pydantic AI is a Python agent framework for building production-grade LLM applications.
"""
```

Response handlers receive the [`ModelResponse`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse) returned by the model and should return `True` to trigger fallback to the next model, or `False` to accept the response.

#### Combining Handlers

You can combine exception types, exception handlers, and response handlers in a single list:

fallback\_on\_mixed.py

```python
from pydantic_ai.exceptions import ModelAPIError
from pydantic_ai.models.fallback import FallbackModel

from fallback_on_builtin_tool import anthropic_model, google_model, web_fetch_failed

fallback_model = FallbackModel(
    google_model,
    anthropic_model,
    fallback_on=[
        ModelAPIError,  # Exception type
        lambda exc: 'rate limit' in str(exc).lower(),  # Exception handler (untyped lambda)
        web_fetch_failed,  # Response handler (auto-detected via type hint)
    ],
)
```

### Exception Handling in Middleware and Decorators

When using `FallbackModel`, it's important to understand that [`FallbackExceptionGroup`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.FallbackExceptionGroup) inherits from Python's [`ExceptionGroup`](https://docs.python.org/3/library/exceptions.html#ExceptionGroup). This means that existing exception handling code that catches specific exceptions (like `ModelAPIError`) won't automatically catch the individual exceptions wrapped inside the group.

For example, if you have middleware or a decorator that catches `ModelAPIError`:

middleware\_without\_fallback.py

```python
from collections.abc import Callable
from functools import wraps
from typing import TypeVar

from pydantic_ai import ModelAPIError

T = TypeVar('T')


# This handler will NOT catch ModelAPIError when using FallbackModel!
def handle_api_errors(func: Callable[..., T]) -> Callable[..., T]:
    @wraps(func)
    def wrapper(*args, **kwargs) -> T:
        try:
            return func(*args, **kwargs)
        except ModelAPIError as e:  # Won't catch FallbackExceptionGroup
            print(f'API error: {e}')
            raise

    return wrapper
```

This decorator will miss `ModelAPIError` exceptions when using `FallbackModel`, because they're wrapped in a `FallbackExceptionGroup` containing one exception per failed model, in the order the models were tried.

To handle both cases, you can use Python 3.11+ `except*` syntax, which catches matching exceptions from exception groups as well as bare exceptions. Note that `except*` always delivers the caught exceptions as an `ExceptionGroup` (even if the original was a bare exception), so re-raising will propagate an `ExceptionGroup` rather than the original exception type:

-   [Python >=3.11](#tab-panel-56)
-   [Python <3.11](#tab-panel-57)

middleware\_with\_fallback.py

```python
from collections.abc import Callable
from functools import wraps
from typing import TypeVar

from pydantic_ai import ModelAPIError

T = TypeVar('T')


def handle_api_errors(func: Callable[..., T]) -> Callable[..., T]:
    @wraps(func)
    def wrapper(*args, **kwargs) -> T:
        try:
            return func(*args, **kwargs)
        except* ModelAPIError as exc_group:
            for exc in exc_group.exceptions:
                print(f'API error: {exc}')
            raise

    return wrapper
```

middleware\_with\_fallback.py

```python
from collections.abc import Callable
from functools import wraps
from typing import TypeVar

from pydantic_ai import FallbackExceptionGroup, ModelAPIError

T = TypeVar('T')


def handle_api_errors(func: Callable[..., T]) -> Callable[..., T]:
    @wraps(func)
    def wrapper(*args, **kwargs) -> T:
        try:
            return func(*args, **kwargs)
        except FallbackExceptionGroup as exc_group:
            for exc in exc_group.exceptions:
                if isinstance(exc, ModelAPIError):
                    print(f'API error from fallback: {exc}')
            raise
        except ModelAPIError as e:
            print(f'API error: {e}')
            raise

    return wrapper
```

You can also catch `FallbackExceptionGroup` directly if you want to handle it specifically:

catch\_fallback\_exception\_group.py

```python
from pydantic_ai import Agent, FallbackExceptionGroup
from pydantic_ai.models.fallback import FallbackModel

agent = Agent(FallbackModel('openai:gpt-5-mini', 'anthropic:claude-sonnet-4-6'))

try:
    response = agent.run_sync('What is the capital of France?')
except FallbackExceptionGroup as exc_group:
    print(f'All {len(exc_group.exceptions)} models failed:')
    for exc in exc_group.exceptions:
        print(f'  - {exc}')
```

---

# [xAI](https://pydantic.dev/docs/ai/models/xai/)

# xAI

## Install

To use [`XaiModel`](/docs/ai/api/models/xai/#pydantic_ai.models.xai.XaiModel), you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `xai` optional group:

-   [pip](#tab-panel-64)
-   [uv](#tab-panel-65)

Terminal

```bash
pip install "pydantic-ai-slim[xai]"
```

Terminal

```bash
uv add "pydantic-ai-slim[xai]"
```

## Streaming cancellation

Cancellation limitations

The `xai-sdk` SDK exposes streaming responses only as an async iterator, with no separate handle for cancelling the underlying gRPC call. Because of a [Python language rule on async generators](https://peps.python.org/pep-0525/), [`cancel()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.cancel) cannot interrupt an in-flight chunk read while another coroutine is iterating the stream. Pydantic AI marks the response with `state='interrupted'`, but upstream generation may continue until the surrounding `async with agent.run_stream(...)` block exits.

For reliable cancellation, either pass `debounce_by=None` to [`stream_text()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_text), [`stream_output()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_output), or [`stream_responses()`](/docs/ai/api/pydantic-ai/result/#pydantic_ai.result.StreamedRunResult.stream_responses) and call `cancel()` from the same task that's iterating:

cancel\_xai.py

```python
from pydantic_ai import Agent

agent = Agent('xai:grok-4-1-fast-non-reasoning')


def should_stop(chunk: str) -> bool:
    return len(chunk) > 100


async def main():
    async with agent.run_stream('Write a long essay about Python') as result:
        async for chunk in result.stream_text(debounce_by=None):
            if should_stop(chunk):
                await result.cancel()
                break
```

Or, if you need to keep debouncing, wrap the stream with [`contextlib.aclosing`](https://docs.python.org/3/library/contextlib.html#contextlib.aclosing) so the iterator is closed before `cancel()` runs:

cancel\_xai\_aclosing.py

```python
from contextlib import aclosing

from pydantic_ai import Agent

agent = Agent('xai:grok-4-1-fast-non-reasoning')


def should_stop(chunk: str) -> bool:
    return len(chunk) > 100


async def main():
    async with agent.run_stream('Write a long essay about Python') as result:
        async with aclosing(result.stream_text()) as stream:
            async for chunk in stream:
                if should_stop(chunk):
                    break
        await result.cancel()
```

Calling `cancel()` from a different task while iteration is in progress is not currently reliable on this provider.

## Configuration

To use xAI models from [xAI](https://x.ai/api) through their API, go to [console.x.ai](https://console.x.ai/team/default/api-keys) to create an API key.

[docs.x.ai](https://docs.x.ai/docs/models) contains a list of available xAI models.

## Environment variable

Once you have the API key, you can set it as an environment variable:

Terminal

```bash
export XAI_API_KEY='your-api-key'
```

You can then use [`XaiModel`](/docs/ai/api/models/xai/#pydantic_ai.models.xai.XaiModel) by name:

```python
from pydantic_ai import Agent

agent = Agent('xai:grok-4-1-fast-non-reasoning')
...
```

Or initialise the model directly:

```python
from pydantic_ai import Agent
from pydantic_ai.models.xai import XaiModel

# Uses XAI_API_KEY environment variable
model = XaiModel('grok-4-1-fast-non-reasoning')
agent = Agent(model)
...
```

You can also customize the [`XaiModel`](/docs/ai/api/models/xai/#pydantic_ai.models.xai.XaiModel) with a custom provider:

```python
from pydantic_ai import Agent
from pydantic_ai.models.xai import XaiModel
from pydantic_ai.providers.xai import XaiProvider

# Custom API key
provider = XaiProvider(api_key='your-api-key')
model = XaiModel('grok-4-1-fast-non-reasoning', provider=provider)
agent = Agent(model)
...
```

Or with a custom `xai_sdk.AsyncClient`:

```python
from xai_sdk import AsyncClient

from pydantic_ai import Agent
from pydantic_ai.models.xai import XaiModel
from pydantic_ai.providers.xai import XaiProvider

xai_client = AsyncClient(api_key='your-api-key')
provider = XaiProvider(xai_client=xai_client)
model = XaiModel('grok-4-1-fast-non-reasoning', provider=provider)
agent = Agent(model)
...
```

## X Search

xAI models support searching X (formerly Twitter) for real-time posts and content. The recommended way to enable it is with the [`XSearch`](/docs/ai/api/models/xai/#pydantic_ai.models.xai.XSearch) capability, which configures the underlying `x_search` builtin tool and can be passed alongside any other capabilities on the agent. See the [xAI X Search documentation](https://docs.x.ai/developers/tools/x-search) for the full list of supported options.

xai\_x\_search.py

```py
from datetime import datetime

from pydantic_ai import Agent
from pydantic_ai.models.xai import XSearch

agent = Agent(
    'xai:grok-4-1-fast',
    capabilities=[
        XSearch(
            allowed_x_handles=['OpenAI', 'AnthropicAI', 'dasfacc'],
            from_date=datetime(2024, 1, 1),
            to_date=datetime(2024, 12, 31),
            enable_image_understanding=True,
            enable_video_understanding=True,
            include_output=True,
        )
    ],
)

result = agent.run_sync('What have AI companies been posting about?')
print(result.output)
"""
OpenAI announced their latest model updates, while Anthropic shared research on AI safety...
"""
```

_(This example is complete, it can be run "as is")_

The `XSearch` capability accepts:

-   **`allowed_x_handles`** / **`excluded_x_handles`**: filter results to (or away from) up to 10 X handles. These are mutually exclusive.
-   **`from_date`** / **`to_date`**: restrict results to posts created within the given datetime range (naive datetimes are interpreted as UTC).
-   **`enable_image_understanding`** (default: `False`): analyze images attached to posts.
-   **`enable_video_understanding`** (default: `False`): analyze video content attached to posts.
-   **`include_output`** (default: `False`): include the raw X search results on the [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart) available via [`ModelResponse.builtin_tool_calls`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse.builtin_tool_calls). Without this, the model uses the search results internally but only returns its text summary; enabling it gives programmatic access to the searched posts, sources, and metadata.

As an alternative to the capability, you can pass the lower-level [`XSearchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.XSearchTool) directly via `builtin_tools=[...]` -- see the [X Search Tool documentation](/docs/ai/tools-toolsets/builtin-tools#x-search-tool) -- or enable raw output globally via the [`XaiModelSettings.xai_include_x_search_output`](/docs/ai/api/models/xai/#pydantic_ai.models.xai.XaiModelSettings.xai_include_x_search_output) [model setting](/docs/ai/core-concepts/agent#model-run-settings).

---

# [Coding Agent Skills](https://pydantic.dev/docs/ai/overview/coding-agent-skills/)

# Coding Agent Skills

If you're building Pydantic AI applications with a coding agent, you can install the Pydantic AI skill from the [`pydantic/skills`](https://github.com/pydantic/skills) repository to give your agent up-to-date framework knowledge.

[Agent skills](https://agentskills.io) are packages of instructions and reference material that coding agents load on demand. With the skill installed, coding agents have access to Pydantic AI patterns, architecture guidance, and common task references covering [tools](/docs/ai/tools-toolsets/tools), [capabilities](/docs/ai/core-concepts/capabilities), [structured output](/docs/ai/core-concepts/output), [streaming](/docs/ai/core-concepts/agent#streaming-events-and-final-output), [testing](/docs/ai/guides/testing), [multi-agent delegation](/docs/ai/guides/multi-agent-applications), [hooks](/docs/ai/core-concepts/hooks), and [agent specs](/docs/ai/core-concepts/agent-spec).

Note

If you want to build agent skills for your Pydantic AI agent, see the [Agent Skills](/docs/ai/core-concepts/capabilities#agent-skills) entry in the Third-party capabilities section on the Capabilities page.

## Installation

### Claude Code

Add the Pydantic marketplace and install the plugin:

Terminal

```bash
claude plugin marketplace add pydantic/skills
claude plugin install ai@pydantic-skills
```

### Cross-Agent (agentskills.io)

Install the Pydantic AI skill using the [skills CLI](https://github.com/vercel-labs/skills):

Terminal

```bash
npx skills add pydantic/skills
```

This works with 30+ agents via the [agentskills.io](https://agentskills.io) standard, including Claude Code, Codex, Cursor, and Gemini CLI.

### Library Skills

Pydantic AI also ships its skill bundled with the package, so you can install it directly from your project's dependencies via [library-skills.io](https://library-skills.io):

Terminal

```bash
uvx library-skills --all
```

The `--all` flag is required because the skill is bundled in `pydantic-ai-slim`, which is a transitive dependency of the `pydantic-ai` meta-package. Without it, `library-skills` only scans direct dependencies and won't discover the skill.

Add `--claude` to install into `.claude/skills` instead of the default `.agents` directory.

## See Also

-   [`pydantic/skills`](https://github.com/pydantic/skills): source repository
-   [agentskills.io](https://agentskills.io): the open standard for agent skills
-   [library-skills.io](https://library-skills.io): install agent skills bundled with your project's dependencies

---

# [Pydantic AI Gateway](https://pydantic.dev/docs/ai/overview/gateway/)

# Pydantic AI Gateway

**[Pydantic AI Gateway](https://logfire.pydantic.dev/)** is a unified interface for accessing multiple AI providers with a single key, managed through [Pydantic Logfire](https://logfire.pydantic.dev/). Features include built-in OpenTelemetry observability, real-time cost monitoring, failover management, and native integration with the other tools in the [Pydantic stack](https://pydantic.dev/).

Migrated to Pydantic Logfire

The AI Gateway has moved from `gateway.pydantic.dev` to [Pydantic Logfire](https://logfire.pydantic.dev/). If you were using the standalone gateway, see [Pydantic AI Gateway is Moving to Pydantic Logfire](https://logfire.pydantic.dev/docs/gateway-migration/).

Sign up at [logfire.pydantic.dev](https://logfire.pydantic.dev/).

Questions?

For questions and feedback, contact us on [Slack](https://logfire.pydantic.dev/docs/join-slack/).

## Documentation Integration

To help you get started with Pydantic AI Gateway, some code examples on the Pydantic AI documentation include a "Via Pydantic AI Gateway" tab, alongside a "Direct to Provider API" tab with the standard Pydantic AI model string. The main difference between them is that when using Gateway, model strings use the `gateway/` prefix.

## Key features

-   **API key management**: Access multiple LLM providers with a single Gateway key.
-   **Cost Limits**: Set spending limits at project, user, and API key levels with daily, weekly, and monthly caps.
-   **BYOK and managed providers:** Bring your own API keys (BYOK) from LLM providers, or pay for inference directly through the platform.
-   **Multi-provider support:** Access models from OpenAI, Anthropic, Google Vertex, Groq, and AWS Bedrock. _More providers coming soon_.
-   **Routing groups:** Configure [routing groups](#routing-groups) to fail over between providers serving the same model, or load-balance traffic across them by weight.
-   **Backend observability:** Log every request through [Pydantic Logfire](https://pydantic.dev/logfire) or any OpenTelemetry backend (_coming soon_).
-   **Zero translation**: Unlike traditional AI gateways that translate everything to one common schema, **Pydantic AI Gateway** allows requests to flow through directly in each provider's native format. This gives you immediate access to new model features as soon as they are released.
-   **Enterprise ready**: Inherits Logfire's enterprise features -- including SSO, custom roles and permissions.

hello\_world.py

```python
from pydantic_ai import Agent

agent = Agent('gateway/openai:gpt-5.2')

result = agent.run_sync('Where does "hello world" come from?')
print(result.output)
"""
The first known use of "hello, world" was in a 1974 textbook about the C programming language.
"""
```

## Quick Start

This section contains instructions on how to set up your account and run your app with Pydantic AI Gateway credentials.

### Create an account

1.  Sign up at [logfire.pydantic.dev](https://logfire.pydantic.dev/)
2.  Choose a region and create an account.
3.  Activate the gateway in your organizations settings.

### Create Gateway API keys

Go to your organization's Gateway settings in Logfire and create an API key.

## Usage

After setting up your account with the instructions above, you will be able to make an AI model request with the Pydantic AI Gateway. The code snippets below show how you can use Pydantic AI Gateway with different frameworks and SDKs.

To use different models, change the model string `gateway/<api_format>:<model_name>` to other models offered by the supported providers.

Examples of providers and models that can be used are:

**Provider**

**API Format**

**Example Model**

OpenAI

`openai`

`gateway/openai:gpt-5.2`

Anthropic

`anthropic`

`gateway/anthropic:claude-sonnet-4-6`

Google Vertex

`google-vertex`

`gateway/google-vertex:gemini-3-flash-preview`

Groq

`groq`

`gateway/groq:openai/gpt-oss-120b`

AWS Bedrock

`bedrock`

`gateway/bedrock:amazon.nova-micro-v1:0`

### Pydantic AI

Before you start, make sure you are on version 1.16 or later of `pydantic-ai`. To update to the latest version run:

-   [uv](#tab-panel-94)
-   [pip](#tab-panel-95)

Terminal

```bash
uv sync -P pydantic-ai
```

Terminal

```bash
pip install -U pydantic-ai
```

Set the `PYDANTIC_AI_GATEWAY_API_KEY` environment variable to your Gateway API key:

Terminal

```bash
export PYDANTIC_AI_GATEWAY_API_KEY="pylf_v..."
```

You can access multiple models with the same API key, as shown in the code snippet below.

hello\_world.py

```python
from pydantic_ai import Agent

agent = Agent('gateway/openai:gpt-5.2')

result = agent.run_sync('Where does "hello world" come from?')
print(result.output)
"""
The first known use of "hello, world" was in a 1974 textbook about the C programming language.
"""
```

#### Passing API Key directly

Pass your API key directly using the [`gateway_provider`](/docs/ai/api/pydantic-ai/providers/#pydantic_ai.providers.gateway.gateway_provider):

passing\_api\_key.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.gateway import gateway_provider

provider = gateway_provider('openai', api_key='pylf_v...')
model = OpenAIChatModel('gpt-5.2', provider=provider)
agent = Agent(model)

result = agent.run_sync('Where does "hello world" come from?')
print(result.output)
"""
The first known use of "hello, world" was in a 1974 textbook about the C programming language.
"""
```

#### Using a different upstream provider

To use an alternate provider or routing group, you can specify it in the route parameter:

routing\_via\_provider.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.gateway import gateway_provider

provider = gateway_provider(
    'openai',
    api_key='pylf_v...',
    route='builtin-openai'
)
model = OpenAIChatModel('gpt-5.2', provider=provider)
agent = Agent(model)

result = agent.run_sync('Where does "hello world" come from?')
print(result.output)
"""
The first known use of "hello, world" was in a 1974 textbook about the C programming language.
"""
```

### Claude Code

Before you start, log out of Claude Code using `/logout`.

Set your gateway credentials as environment variables, using the base URL that matches your Logfire region:

-   [US](#tab-panel-96)
-   [EU](#tab-panel-97)

Terminal

```bash
export ANTHROPIC_BASE_URL="https://gateway-us.pydantic.dev/proxy/anthropic"
export ANTHROPIC_AUTH_TOKEN="YOUR_GATEWAY_API_KEY"
```

Terminal

```bash
export ANTHROPIC_BASE_URL="https://gateway-eu.pydantic.dev/proxy/anthropic"
export ANTHROPIC_AUTH_TOKEN="YOUR_GATEWAY_API_KEY"
```

Replace `YOUR_GATEWAY_API_KEY` with the API key from your Logfire organization's Gateway settings.

Launch Claude Code by typing `claude`. All requests will now route through the Pydantic AI Gateway.

### Codex

Codex uses the OpenAI Responses API, so it should use the Gateway's `openai-responses` route.

Set your gateway API key as an environment variable:

Terminal

```bash
export PYDANTIC_AI_GATEWAY_API_KEY="YOUR_GATEWAY_API_KEY"
```

Then add the following configuration to `~/.codex/config.toml`, using the base URL that matches your Logfire region:

-   [US](#tab-panel-98)
-   [EU](#tab-panel-99)

```toml
model = "gpt-5.4"
model_provider = "pydantic_gateway"

[model_providers.pydantic_gateway]
name = "Pydantic AI Gateway"
base_url = "https://gateway-us.pydantic.dev/proxy/openai-responses"
env_key = "PYDANTIC_AI_GATEWAY_API_KEY"
env_key_instructions = "Create a Gateway API key in your Logfire organization's Gateway settings."
wire_api = "responses"
```

```toml
model = "gpt-5.4"
model_provider = "pydantic_gateway"

[model_providers.pydantic_gateway]
name = "Pydantic AI Gateway"
base_url = "https://gateway-eu.pydantic.dev/proxy/openai-responses"
env_key = "PYDANTIC_AI_GATEWAY_API_KEY"
env_key_instructions = "Create a Gateway API key in your Logfire organization's Gateway settings."
wire_api = "responses"
```

For more details on configuring custom providers in Codex, see the [Codex custom model providers docs](https://developers.openai.com/codex/config-advanced#custom-model-providers) and the [Codex configuration reference](https://developers.openai.com/codex/config-reference/).

If you already have a `~/.codex/config.toml`, add the `[model_providers.pydantic_gateway]` block and update `model_provider` instead of replacing the whole file. Replace `gpt-5.4` with whichever OpenAI Responses model you want Codex to use.

Launch Codex by typing `codex`. All requests will now route through the Pydantic AI Gateway.

### SDKs

#### OpenAI SDK

Use the base URL that matches your Logfire region (`gateway-us` or `gateway-eu`).

-   [US](#tab-panel-100)
-   [EU](#tab-panel-101)

openai\_sdk.py

```python
import openai

client = openai.Client(
    base_url='https://gateway-us.pydantic.dev/proxy/chat/',
    api_key='pylf_v...',
)

response = client.chat.completions.create(
    model='gpt-5.2',
    messages=[{'role': 'user', 'content': 'Hello world'}],
)
print(response.choices[0].message.content)
#> Hello user
```

openai\_sdk.py

```python
import openai

client = openai.Client(
    base_url='https://gateway-eu.pydantic.dev/proxy/chat/',
    api_key='pylf_v...',
)

response = client.chat.completions.create(
    model='gpt-5.2',
    messages=[{'role': 'user', 'content': 'Hello world'}],
)
print(response.choices[0].message.content)
#> Hello user
```

#### Anthropic SDK

Use the base URL that matches your Logfire region (`gateway-us` or `gateway-eu`).

-   [US](#tab-panel-102)
-   [EU](#tab-panel-103)

anthropic\_sdk.py

```python
import anthropic

client = anthropic.Anthropic(
    base_url='https://gateway-us.pydantic.dev/proxy/anthropic/',
    auth_token='pylf_v...',
)

response = client.messages.create(
    max_tokens=1000,
    model='claude-sonnet-4-5',
    messages=[{'role': 'user', 'content': 'Hello world'}],
)
print(response.content[0].text)
#> Hello user
```

anthropic\_sdk.py

```python
import anthropic

client = anthropic.Anthropic(
    base_url='https://gateway-eu.pydantic.dev/proxy/anthropic/',
    auth_token='pylf_v...',
)

response = client.messages.create(
    max_tokens=1000,
    model='claude-sonnet-4-5',
    messages=[{'role': 'user', 'content': 'Hello world'}],
)
print(response.content[0].text)
#> Hello user
```

#### Vercel AI SDK

The [Vercel AI SDK](https://ai-sdk.dev/) can route through the Gateway by pointing each provider's `baseURL` at the matching proxy path (e.g. `/proxy/openai` or `/proxy/anthropic`). Use the base URL that matches your Logfire region (`gateway-us` or `gateway-eu`).

-   [US](#tab-panel-104)
-   [EU](#tab-panel-105)

```typescript
import { createOpenAI } from "@ai-sdk/openai";
import { generateText } from "ai";

const apiKey = process.env.PYDANTIC_AI_GATEWAY_API_KEY;
if (!apiKey) throw new Error("set PYDANTIC_AI_GATEWAY_API_KEY");

const openai = createOpenAI({
  apiKey,
  baseURL: "https://gateway-us.pydantic.dev/proxy/openai",
});

async function main() {
  const openaiResult = await generateText({
    model: openai("gpt-5.2"),
    prompt: "what color is the sky? reply concisely",
  });
  console.log("openai:", openaiResult.text);
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```

```typescript
import { createOpenAI } from "@ai-sdk/openai";
import { generateText } from "ai";

const apiKey = process.env.PYDANTIC_AI_GATEWAY_API_KEY;
if (!apiKey) throw new Error("set PYDANTIC_AI_GATEWAY_API_KEY");

const openai = createOpenAI({
  apiKey,
  baseURL: "https://gateway-eu.pydantic.dev/proxy/openai",
});

async function main() {
  const openaiResult = await generateText({
    model: openai("gpt-5.2"),
    prompt: "what color is the sky? reply concisely",
  });
  console.log("openai:", openaiResult.text);
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```

## Routing groups

A **routing group** is a named collection of providers that all serve the same model. Each member has a **priority**, a **weight**, and an **active** flag, and those three values together let a single group express two different routing strategies:

-   **Failover / fallback**: Assign members different priorities. The Gateway always tries the highest-priority active member first, and only falls through to a lower-priority member when the higher one is unavailable (for example if it is down, rate-limited, or returns an error).
-   **Load balancing**: Assign two or more members the same priority and give each a weight. The Gateway splits traffic across those members in proportion to their weights.

The two strategies compose: you can have, for example, a top priority tier with two providers load-balanced 70/30, and a second priority tier that only receives traffic when both top-tier providers fail.

### Creating a routing group

Routing groups are managed from your organization's Gateway settings in Logfire:

1.  Open **Gateway -> Routing Groups** and click **Add Routing Group**.
2.  Give the group a slug (e.g. `anthropic-routing`) and an optional description.
3.  Open the group's **Members** page and add one or more providers. For each member set:
    -   **Priority** - higher values are tried first. Use different priorities across members for failover.
    -   **Weight** - load-balancing weight used between members that share the same priority.
    -   **Active** - inactive members are skipped during routing.

### Using a routing group

Point the Gateway provider at the group via the `route` parameter (the group's slug):

routing\_group.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.providers.gateway import gateway_provider

provider = gateway_provider(
  'anthropic',
  api_key='pylf_v...',
  route='anthropic-routing',  # (1)
)
model = AnthropicModel('claude-sonnet-4-6', provider=provider)
agent = Agent(model)

result = agent.run_sync('Where does "hello world" come from?')
print(result.output)
"""
The first known use of "hello, world" was in a 1974 textbook about the C programming language.
"""
```

The slug of the routing group you created in Logfire.

## Troubleshooting

### Unable to calculate spend

The gateway needs to know the cost of the request in order to provide insights about the spend, and to enforce spending limits. If it's unable to calculate the cost, it will return a 400 error with the message "Unable to calculate spend".

When configuring a provider, you need to decide if you want the gateway to block the API key if it's unable to calculate the cost. If you choose to block the API key, any further requests using that API key will fail.

We are actively working on supporting more providers, and models. If you have a specific provider that you would like to see supported, please let us know on [Slack](https://logfire.pydantic.dev/docs/join-slack/) or [open an issue on `genai-prices`](https://github.com/pydantic/genai-prices/issues/new).

---

# [Getting Help](https://pydantic.dev/docs/ai/overview/help/)

# Getting Help

If you need help getting started with Pydantic AI or with advanced usage, the following sources may be useful.

## Slack

Join the `#pydantic-ai` channel in the [Pydantic Slack](https://logfire.pydantic.dev/docs/join-slack/) to ask questions, get help, and chat about Pydantic AI. There's also channels for Pydantic, Logfire, and FastUI.

If you're on a [Logfire](https://pydantic.dev/logfire) Pro plan, you can also get a dedicated private slack collab channel with us.

## GitHub Issues

The [Pydantic AI GitHub Issues](https://github.com/pydantic/pydantic-ai/issues) are a great place to ask questions and give us feedback.

---

# [Pydantic AI](https://pydantic.dev/docs/ai/overview/)

# Pydantic AI

![Pydantic AI](/docs/ai/img/pydantic-ai-dark.svg)

![Pydantic AI](/docs/ai/img/pydantic-ai-light.svg)

_GenAI Agent Framework, the Pydantic way_

[![CI](https://github.com/pydantic/pydantic-ai/actions/workflows/ci.yml/badge.svg?event=push)](https://github.com/pydantic/pydantic-ai/actions/workflows/ci.yml?query=branch%3Amain)[![Coverage](https://coverage-badge.samuelcolvin.workers.dev/pydantic/pydantic-ai.svg)](https://coverage-badge.samuelcolvin.workers.dev/redirect/pydantic/pydantic-ai)[![PyPI](https://img.shields.io/pypi/v/pydantic-ai.svg)](https://pypi.python.org/pypi/pydantic-ai)[![versions](https://img.shields.io/pypi/pyversions/pydantic-ai.svg)](https://github.com/pydantic/pydantic-ai)[![license](https://img.shields.io/github/license/pydantic/pydantic-ai.svg)](https://github.com/pydantic/pydantic-ai/blob/main/LICENSE)[![Join Slack](https://img.shields.io/badge/Slack-Join%20Slack-4A154B?logo=slack)](https://logfire.pydantic.dev/docs/join-slack/)

Pydantic AI is a Python agent framework designed to help you quickly, confidently, and painlessly build production grade applications and workflows with Generative AI.

FastAPI revolutionized web development by offering an innovative and ergonomic design, built on the foundation of [Pydantic Validation](https://docs.pydantic.dev) and modern Python features like type hints.

Yet despite virtually every Python agent framework and LLM library using Pydantic Validation, when we began to use LLMs in [Pydantic Logfire](https://pydantic.dev/logfire), we couldn't find anything that gave us the same feeling.

We built Pydantic AI with one simple aim: to bring that FastAPI feeling to GenAI app and agent development.

## Why use Pydantic AI

1.  **Built by the Pydantic Team**: [Pydantic Validation](https://docs.pydantic.dev/latest/) is the validation layer of the OpenAI SDK, the Google ADK, the Anthropic SDK, LangChain, LlamaIndex, AutoGPT, Transformers, CrewAI, Instructor and many more. _Why use the derivative when you can go straight to the source?_ 😃
    
2.  **Model-agnostic**: Supports virtually every [model](/docs/ai/models/overview) and provider: OpenAI, Anthropic, Gemini, DeepSeek, Grok, Cohere, Mistral, and Perplexity; Azure AI Foundry, Amazon Bedrock, Google Vertex AI, Ollama, LiteLLM, Groq, OpenRouter, Together AI, Fireworks AI, Cerebras, Hugging Face, GitHub, Heroku, Vercel, Nebius, OVHcloud, Alibaba Cloud, SambaNova, and Outlines. If your favorite model or provider is not listed, you can easily implement a [custom model](/docs/ai/models/overview#custom-models).
    
3.  **Seamless Observability**: Tightly [integrates](/docs/ai/integrations/logfire) with [Pydantic Logfire](https://pydantic.dev/logfire), our general-purpose OpenTelemetry observability platform, for real-time debugging, evals-based performance monitoring, and behavior, tracing, and cost tracking. If you already have an observability platform that supports OTel, you can [use that too](/docs/ai/integrations/logfire#alternative-observability-backends).
    
4.  **Fully Type-safe**: Designed to give your IDE or AI coding agent as much context as possible for auto-completion and [type checking](/docs/ai/core-concepts/agent#static-type-checking), moving entire classes of errors from runtime to write-time for a bit of that Rust "if it compiles, it works" feel.
    
5.  **Powerful Evals**: Enables you to systematically test and [evaluate](/docs/ai/evals/) the performance and accuracy of the agentic systems you build, and monitor the performance over time in Pydantic Logfire.
    
6.  **Extensible by Design**: Build agents from composable [capabilities](/docs/ai/core-concepts/capabilities) that bundle tools, hooks, instructions, and model settings into reusable units. Use built-in capabilities for [web search](/docs/ai/core-concepts/capabilities#provider-adaptive-tools), [thinking](/docs/ai/core-concepts/capabilities#thinking), and [MCP](/docs/ai/core-concepts/capabilities#provider-adaptive-tools), pick from the [Pydantic AI Harness](/docs/ai/harness/overview) capability library, build your own, or install [third-party capability packages](/docs/ai/guides/extensibility). Define agents entirely in [YAML/JSON](/docs/ai/core-concepts/agent-spec) -- no code required.
    
7.  **MCP, A2A, and UI**: Integrates the [Model Context Protocol](/docs/ai/mcp/overview), [Agent2Agent](/docs/ai/integrations/a2a), and various [UI event stream](/docs/ai/integrations/ui/overview) standards to give your agent access to external tools and data, let it interoperate with other agents, and build interactive applications with streaming event-based communication.
    
8.  **Human-in-the-Loop Tool Approval**: Easily lets you flag that certain tool calls [require approval](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) before they can proceed, possibly depending on tool call arguments, conversation history, or user preferences.
    
9.  **Durable Execution**: Enables you to build [durable agents](/docs/ai/integrations/durable_execution/overview) that can preserve their progress across transient API failures and application errors or restarts, and handle long-running, asynchronous, and human-in-the-loop workflows with production-grade reliability.
    
10.  **Streamed Outputs**: Provides the ability to [stream](/docs/ai/core-concepts/output#streamed-results) structured output continuously, with immediate validation, ensuring real time access to generated data.
     
11.  **Graph Support**: Provides a powerful way to define [graphs](/docs/ai/graph/) using type hints, for use in complex applications where standard control flow can degrade to spaghetti code.
     

Realistically though, no list is going to be as convincing as [giving it a try](#next-steps) and seeing how it makes you feel!

**Sign up for our newsletter, _The Pydantic Stack_, with updates & tutorials on Pydantic AI, Logfire, and Pydantic:**

Subscribe

## Hello World Example

Here's a minimal example of Pydantic AI:

hello\_world.py

```python
from pydantic_ai import Agent

agent = Agent(  # (1)
  'anthropic:claude-sonnet-4-6',
  instructions='Be concise, reply with one sentence.',  # (2)
)

result = agent.run_sync('Where does "hello world" come from?')  # (3)
print(result.output)
"""
The first known use of "hello, world" was in a 1974 textbook about the C programming language.
"""
```

We configure the agent to use [Anthropic's Claude Sonnet 4.6](/docs/ai/models/anthropic) model, but you can also set the model when running the agent.

Register static [instructions](/docs/ai/core-concepts/agent#instructions) using a keyword argument to the agent.

[Run the agent](/docs/ai/core-concepts/agent#running-agents) synchronously, starting a conversation with the LLM.

_(This example is complete, it can be run "as is", assuming you've [installed the `pydantic_ai` package](/docs/ai/overview/install))_

The exchange will be very short: Pydantic AI will send the instructions and the user prompt to the LLM, and the model will return a text response.

Not very interesting yet, but we can easily add [tools](/docs/ai/tools-toolsets/tools), [dynamic instructions](/docs/ai/core-concepts/agent#instructions), [structured outputs](/docs/ai/core-concepts/output), or composable [capabilities](/docs/ai/core-concepts/capabilities) to build more powerful agents.

Here's the same agent with [thinking](/docs/ai/core-concepts/capabilities#thinking) and [web search](/docs/ai/core-concepts/capabilities#provider-adaptive-tools) capabilities:

hello\_world\_capabilities.py

```python
from pydantic_ai import Agent
from pydantic_ai.capabilities import Thinking, WebSearch

agent = Agent(
    'anthropic:claude-sonnet-4-6',
    instructions='Be concise, reply with one sentence.',
    capabilities=[Thinking(), WebSearch()],
)

result = agent.run_sync('What was the mass of the largest meteorite found this year?')
print(result.output)
"""
The largest meteorite recovered this year weighed approximately 7.6 kg, found in the Sahara Desert in January.
"""
```

## Tools & Dependency Injection Example

Here is a concise example using Pydantic AI to build a support agent for a bank:

bank\_support.py

```python
from dataclasses import dataclass

from pydantic import BaseModel, Field
from pydantic_ai import Agent, RunContext

from bank_database import DatabaseConn


@dataclass
class SupportDependencies:  # (3)
  customer_id: int
  db: DatabaseConn  # (12)


class SupportOutput(BaseModel):  # (13)
  support_advice: str = Field(description='Advice returned to the customer')
  block_card: bool = Field(description="Whether to block the customer's card")
  risk: int = Field(description='Risk level of query', ge=0, le=10)


support_agent = Agent(  # (1)
  'openai:gpt-5.2',  # (2)
  deps_type=SupportDependencies,
  output_type=SupportOutput,  # (9)
  instructions=(  # (4)
      'You are a support agent in our bank, give the '
      'customer support and judge the risk level of their query.'
  ),
)


@support_agent.instructions  # (5)
async def add_customer_name(ctx: RunContext[SupportDependencies]) -> str:
  customer_name = await ctx.deps.db.customer_name(id=ctx.deps.customer_id)
  return f"The customer's name is {customer_name!r}"


@support_agent.tool  # (6)
async def customer_balance(
  ctx: RunContext[SupportDependencies], include_pending: bool
) -> float:
  """Returns the customer's current account balance."""  # (7)
  return await ctx.deps.db.customer_balance(
      id=ctx.deps.customer_id,
      include_pending=include_pending,
  )


...  # (11)


async def main():
  deps = SupportDependencies(customer_id=123, db=DatabaseConn())
  result = await support_agent.run('What is my balance?', deps=deps)  # (8)
  print(result.output)  # (10)
  """
  support_advice='Hello John, your current account balance, including pending transactions, is $123.45.' block_card=False risk=1
  """

  result = await support_agent.run('I just lost my card!', deps=deps)
  print(result.output)
  """
  support_advice="I'm sorry to hear that, John. We are temporarily blocking your card to prevent unauthorized transactions." block_card=True risk=8
  """
```

This [agent](/docs/ai/core-concepts/agent) will act as first-tier support in a bank. Agents are generic in the type of dependencies they accept and the type of output they return. In this case, the support agent has type `Agent[SupportDependencies, SupportOutput]`.

Here we configure the agent to use [OpenAI's GPT-5 model](/docs/ai/models/openai), you can also set the model when running the agent.

The `SupportDependencies` dataclass is used to pass data, connections, and logic into the model that will be needed when running [instructions](/docs/ai/core-concepts/agent#instructions) and [tool](/docs/ai/tools-toolsets/tools) functions. Pydantic AI's system of dependency injection provides a [type-safe](/docs/ai/core-concepts/agent#static-type-checking) way to customise the behavior of your agents, and can be especially useful when running [unit tests](/docs/ai/guides/testing) and evals.

Static [instructions](/docs/ai/core-concepts/agent#instructions) can be registered with the [`instructions` keyword argument](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) to the agent.

Dynamic [instructions](/docs/ai/core-concepts/agent#instructions) can be registered with the [`@agent.instructions`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.instructions) decorator, and can make use of dependency injection. Dependencies are carried via the [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) argument, which is parameterized with the `deps_type` from above. If the type annotation here is wrong, static type checkers will catch it.

The [`@agent.tool`](/docs/ai/tools-toolsets/tools) decorator let you register functions which the LLM may call while responding to a user. Again, dependencies are carried via [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext), any other arguments become the tool schema passed to the LLM. Pydantic is used to validate these arguments, and errors are passed back to the LLM so it can retry.

The docstring of a tool is also passed to the LLM as the description of the tool. Parameter descriptions are [extracted](/docs/ai/tools-toolsets/tools#function-tools-and-schema) from the docstring and added to the parameter schema sent to the LLM.

[Run the agent](/docs/ai/core-concepts/agent#running-agents) asynchronously, conducting a conversation with the LLM until a final response is reached. Even in this fairly simple case, the agent will exchange multiple messages with the LLM as tools are called to retrieve an output.

The response from the agent will be guaranteed to be a `SupportOutput`. If validation fails [reflection](/docs/ai/core-concepts/agent#reflection-and-self-correction), the agent is prompted to try again.

The output will be validated with Pydantic to guarantee it is a `SupportOutput`, since the agent is generic, it'll also be typed as a `SupportOutput` to aid with static type checking.

In a real use case, you'd add more tools and longer instructions to the agent to extend the context it's equipped with and support it can provide.

This is a simple sketch of a database connection, used to keep the example short and readable. In reality, you'd be connecting to an external database (e.g. PostgreSQL) to get information about customers.

This [Pydantic](https://docs.pydantic.dev) model is used to constrain the structured data returned by the agent. From this simple definition, Pydantic builds the JSON Schema that tells the LLM how to return the data, and performs validation to guarantee the data is correct at the end of the run.

Complete `bank_support.py` example

The code included here is incomplete for the sake of brevity (the definition of `DatabaseConn` is missing); you can find the complete `bank_support.py` example [here](/docs/ai/examples/bank-support).

## Instrumentation with Pydantic Logfire

Even a simple agent with just a handful of tools can result in a lot of back-and-forth with the LLM, making it nearly impossible to be confident of what's going on just from reading the code. To understand the flow of the above runs, we can watch the agent in action using Pydantic Logfire.

To do this, we need to [set up Logfire](/docs/ai/integrations/logfire#using-logfire), and add the following to our code:

bank\_support\_with\_logfire.py

```python
...
from pydantic_ai import Agent, RunContext

from bank_database import DatabaseConn

import logfire

logfire.configure()  # (1)
logfire.instrument_pydantic_ai()  # (2)
logfire.instrument_sqlite3()  # (3)

...

support_agent = Agent(
  'openai:gpt-5.2',
  deps_type=SupportDependencies,
  output_type=SupportOutput,
  instructions=(
      'You are a support agent in our bank, give the '
      'customer support and judge the risk level of their query.'
  ),
)
```

Configure the Logfire SDK, this will fail if project is not set up.

This will instrument all Pydantic AI agents used from here on out. If you want to instrument only a specific agent, you can pass the [`instrument=True` keyword argument](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) to the agent.

In our demo, `DatabaseConn` uses [`sqlite3`](https://docs.python.org/3/library/sqlite3.html#module-sqlite3) to connect to a PostgreSQL database, so [`logfire.instrument_sqlite3()`](https://logfire.pydantic.dev/docs/integrations/databases/sqlite3/) is used to log the database queries.

That's enough to get the following view of your agent in action:

Logfire instrumentation for the bank agent -- [View in Logfire](https://logfire-eu.pydantic.dev/public-trace/a2957caa-b7b7-4883-a529-777742649004?spanId=31aade41ab896144)

See [Monitoring and Performance](/docs/ai/integrations/logfire) to learn more.

## `llms.txt`

The Pydantic AI documentation is available in the [llms.txt](https://llmstxt.org/) format. This format is defined in Markdown and suited for LLMs and AI coding assistants and agents.

Two formats are available:

-   [`llms.txt`](https://ai.pydantic.dev/llms.txt): a file containing a brief description of the project, along with links to the different sections of the documentation. The structure of this file is described in details [here](https://llmstxt.org/#format).
-   [`llms-full.txt`](https://ai.pydantic.dev/llms-full.txt): Similar to the `llms.txt` file, but every link content is included. Note that this file may be too large for some LLMs.

As of today, these files are not automatically leveraged by IDEs or coding agents, but they will use it if you provide a link or the full text.

## Next Steps

To try Pydantic AI for yourself, [install it](/docs/ai/overview/install) and follow the instructions [in the examples](/docs/ai/examples/setup).

Read the [docs](/docs/ai/core-concepts/agent) to learn more about building applications with Pydantic AI.

Read the [API Reference](/docs/ai/api/pydantic-ai/agent) to understand Pydantic AI's interface.

Join [Slack](https://logfire.pydantic.dev/docs/join-slack/) or file an issue on [GitHub](https://github.com/pydantic/pydantic-ai/issues) if you have any questions.

---

# [Installation](https://pydantic.dev/docs/ai/overview/install/)

# Installation

Pydantic AI is available on PyPI as [`pydantic-ai`](https://pypi.org/project/pydantic-ai/) so installation is as simple as:

-   [pip](#tab-panel-86)
-   [uv](#tab-panel-87)

Terminal

```bash
pip install pydantic-ai
```

Terminal

```bash
uv add pydantic-ai
```

(Requires Python 3.10+)

This installs the `pydantic_ai` package, core dependencies, and libraries required to use all the models included in Pydantic AI. If you want to install only those dependencies required to use a specific model, you can install the ["slim"](#slim-install) version of Pydantic AI.

## Use with Pydantic Logfire

Pydantic AI has an excellent (but completely optional) integration with [Pydantic Logfire](https://pydantic.dev/logfire) to help you view and understand agent runs.

Logfire comes included with `pydantic-ai` (but not the ["slim" version](#slim-install)), so you can typically start using it immediately by following the [Logfire setup docs](/docs/ai/integrations/logfire#using-logfire).

## Running Examples

We distribute the [`pydantic_ai_examples`](https://github.com/pydantic/pydantic-ai/tree/main/examples/pydantic_ai_examples) directory as a separate PyPI package ([`pydantic-ai-examples`](https://pypi.org/project/pydantic-ai-examples/)) to make examples extremely easy to customize and run.

To install examples, use the `examples` optional group:

-   [pip](#tab-panel-88)
-   [uv](#tab-panel-89)

Terminal

```bash
pip install "pydantic-ai[examples]"
```

Terminal

```bash
uv add "pydantic-ai[examples]"
```

To run the examples, follow instructions in the [examples docs](/docs/ai/examples/setup).

## Slim Install

If you know which model you're going to use and want to avoid installing superfluous packages, you can use the [`pydantic-ai-slim`](https://pypi.org/project/pydantic-ai-slim/) package. For example, if you're using just [`OpenAIChatModel`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIChatModel), you would run:

-   [pip](#tab-panel-90)
-   [uv](#tab-panel-91)

Terminal

```bash
pip install "pydantic-ai-slim[openai]"
```

Terminal

```bash
uv add "pydantic-ai-slim[openai]"
```

`pydantic-ai-slim` has the following optional groups:

-   `logfire` -- installs [Pydantic Logfire](/docs/ai/integrations/logfire) dependency `logfire` [PyPI ↗](https://pypi.org/project/logfire)
-   `evals` -- installs [Pydantic Evals](/docs/ai/evals/) dependency `pydantic-evals` [PyPI ↗](https://pypi.org/project/pydantic-evals)
-   `openai` -- installs [OpenAI Model](/docs/ai/models/openai) dependency `openai` [PyPI ↗](https://pypi.org/project/openai)
-   `vertexai` -- installs `GoogleVertexProvider` dependencies `google-auth` [PyPI ↗](https://pypi.org/project/google-auth) and `requests` [PyPI ↗](https://pypi.org/project/requests)
-   `google` -- installs [Google Model](/docs/ai/models/google) dependency `google-genai` [PyPI ↗](https://pypi.org/project/google-genai)
-   `anthropic` -- installs [Anthropic Model](/docs/ai/models/anthropic) dependency `anthropic` [PyPI ↗](https://pypi.org/project/anthropic)
-   `groq` -- installs [Groq Model](/docs/ai/models/groq) dependency `groq` [PyPI ↗](https://pypi.org/project/groq)
-   `mistral` -- installs [Mistral Model](/docs/ai/models/mistral) dependency `mistralai` [PyPI ↗](https://pypi.org/project/mistralai)
-   `cohere` - installs [Cohere Model](/docs/ai/models/cohere) dependency `cohere` [PyPI ↗](https://pypi.org/project/cohere)
-   `bedrock` - installs [Bedrock Model](/docs/ai/models/bedrock) dependency `boto3` [PyPI ↗](https://pypi.org/project/boto3)
-   `huggingface` - installs [Hugging Face Model](/docs/ai/models/huggingface) dependency `huggingface-hub` [PyPI ↗](https://pypi.org/project/huggingface-hub)
-   `sentence-transformers` - installs [Sentence Transformers Embedding Model](/docs/ai/guides/embeddings#sentence-transformers-local) dependency `sentence-transformers` [PyPI ↗](https://pypi.org/project/sentence-transformers)
-   `voyageai` - installs [VoyageAI Embedding Model](/docs/ai/guides/embeddings#voyageai) dependency `voyageai` [PyPI ↗](https://pypi.org/project/voyageai)
-   `outlines-transformers` - installs [Outlines Model](/docs/ai/models/outlines) dependency `outlines[transformers]` [PyPI ↗](https://pypi.org/project/outlines)
-   `outlines-llamacpp` - installs [Outlines Model](/docs/ai/models/outlines) dependency `outlines[llamacpp]` [PyPI ↗](https://pypi.org/project/outlines)
-   `outlines-mlxlm` - installs [Outlines Model](/docs/ai/models/outlines) dependency `outlines[mlxlm]` [PyPI ↗](https://pypi.org/project/outlines)
-   `outlines-sglang` - installs [Outlines Model](/docs/ai/models/outlines) dependency `outlines[sglang]` [PyPI ↗](https://pypi.org/project/outlines)
-   `outlines-vllm-offline` - installs [Outlines Model](/docs/ai/models/outlines) dependencies `outlines` [PyPI ↗](https://pypi.org/project/outlines) and `vllm` [PyPI ↗](https://pypi.org/project/vllm)
-   `duckduckgo` - installs [DuckDuckGo Search Tool](/docs/ai/tools-toolsets/common-tools#duckduckgo-search-tool) dependency `ddgs` [PyPI ↗](https://pypi.org/project/ddgs)
-   `tavily` - installs [Tavily Search Tool](/docs/ai/tools-toolsets/common-tools#tavily-search-tool) dependency `tavily-python` [PyPI ↗](https://pypi.org/project/tavily-python)
-   `exa` - installs [Exa Search Tool](/docs/ai/tools-toolsets/common-tools#exa-search-tool) dependency `exa-py` [PyPI ↗](https://pypi.org/project/exa-py)
-   `web-fetch` - installs [Web Fetch Tool](/docs/ai/tools-toolsets/common-tools#web-fetch-tool) dependency `markdownify` [PyPI ↗](https://pypi.org/project/markdownify)
-   `cli` - installs [CLI](/docs/ai/integrations/cli) dependencies `rich` [PyPI ↗](https://pypi.org/project/rich), `prompt-toolkit` [PyPI ↗](https://pypi.org/project/prompt-toolkit), and `argcomplete` [PyPI ↗](https://pypi.org/project/argcomplete)
-   `mcp` - installs [MCP](/docs/ai/mcp/client) dependency `mcp` [PyPI ↗](https://pypi.org/project/mcp)
-   `fastmcp` - installs [FastMCP](/docs/ai/mcp/fastmcp-client) dependency `fastmcp` [PyPI ↗](https://pypi.org/project/fastmcp)
-   `a2a` - installs [A2A](/docs/ai/integrations/a2a) dependency `fasta2a` [PyPI ↗](https://pypi.org/project/fasta2a)
-   `ui` - installs [UI Event Streams](/docs/ai/integrations/ui/overview) dependency `starlette` [PyPI ↗](https://pypi.org/project/starlette)
-   `ag-ui` - installs [AG-UI Event Stream Protocol](/docs/ai/integrations/ui/ag-ui) dependencies `ag-ui-protocol` [PyPI ↗](https://pypi.org/project/ag-ui-protocol) and `starlette` [PyPI ↗](https://pypi.org/project/starlette)
-   `dbos` - installs [DBOS Durable Execution](/docs/ai/integrations/durable_execution/dbos) dependency `dbos` [PyPI ↗](https://pypi.org/project/dbos)
-   `prefect` - installs [Prefect Durable Execution](/docs/ai/integrations/durable_execution/prefect) dependency `prefect` [PyPI ↗](https://pypi.org/project/prefect)

You can also install dependencies for multiple models and use cases, for example:

-   [pip](#tab-panel-92)
-   [uv](#tab-panel-93)

Terminal

```bash
pip install "pydantic-ai-slim[openai,google,logfire]"
```

Terminal

```bash
uv add "pydantic-ai-slim[openai,google,logfire]"
```

---

# [Troubleshooting](https://pydantic.dev/docs/ai/overview/troubleshooting/)

# Troubleshooting

Below are suggestions on how to fix some common errors you might encounter while using Pydantic AI. If the issue you're experiencing is not listed below or addressed in the documentation, please feel free to ask in the [Pydantic Slack](/docs/ai/overview/help) or create an issue on [GitHub](https://github.com/pydantic/pydantic-ai/issues).

## Jupyter Notebook Errors

### `RuntimeError: This event loop is already running`

**Modern Jupyter/IPython (7.0+)**: This environment supports top-level `await` natively. You can use `Agent.run()` directly in notebook cells without additional setup:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')
result = await agent.run('Who let the dogs out?')
```

**Legacy environments or specific integrations**: If you encounter event loop conflicts, use [`nest-asyncio`](https://pypi.org/project/nest-asyncio/):

```python
import nest_asyncio

from pydantic_ai import Agent

nest_asyncio.apply()

agent = Agent('openai:gpt-5.2')
result = agent.run_sync('Who let the dogs out?')
```

**Note**: This also applies to Google Colab and [Marimo](https://github.com/marimo-team/marimo) environments.

## API Key Configuration

### `UserError: API key must be provided or set in the [MODEL]_API_KEY environment variable`

If you're running into issues with setting the API key for your model, visit the [Models](/docs/ai/models/overview) page to learn more about how to set an environment variable and/or pass in an `api_key` argument.

## Monitoring HTTPX Requests

You can use custom `httpx` clients in your models in order to access specific requests, responses, and headers at runtime.

It's particularly helpful to use `logfire`'s [HTTPX integration](/docs/ai/integrations/logfire#monitoring-http-requests) to monitor the above.

---

# [Upgrade Guide](https://pydantic.dev/docs/ai/project/changelog/)

# Upgrade Guide

In September 2025, Pydantic AI reached V1, which means we're committed to API stability: we will not introduce changes that break your code until V2. For more information, review our [Version Policy](/docs/ai/project/version-policy).

## Breaking Changes

Here's a filtered list of the breaking changes for each version to help you upgrade Pydantic AI.

### v1.0.1 (2025-09-05)

The following breaking change was accidentally left out of v1.0.0:

-   See [#2808](https://github.com/pydantic/pydantic-ai/pull/2808) - Remove `Python` evaluator from `pydantic_evals` for security reasons

### v1.0.0 (2025-09-04)

-   See [#2725](https://github.com/pydantic/pydantic-ai/pull/2725) - Drop support for Python 3.9
-   See [#2738](https://github.com/pydantic/pydantic-ai/pull/2738) - Make many dataclasses require keyword arguments
-   See [#2715](https://github.com/pydantic/pydantic-ai/pull/2715) - Remove `cases` and `averages` attributes from `pydantic_evals` spans
-   See [#2798](https://github.com/pydantic/pydantic-ai/pull/2798) - Change `ModelRequest.parts` and `ModelResponse.parts` types from `list` to `Sequence`
-   See [#2726](https://github.com/pydantic/pydantic-ai/pull/2726) - Default `InstrumentationSettings` version to 2
-   See [#2717](https://github.com/pydantic/pydantic-ai/pull/2717) - Remove errors when passing `AsyncRetrying` or `Retrying` object to `AsyncTenacityTransport` or `TenacityTransport` instead of `RetryConfig`

### v0.x.x

Before V1, minor versions were used to introduce breaking changes:

**v0.8.0 (2025-08-26)**

See [#2689](https://github.com/pydantic/pydantic-ai/pull/2689) - `AgentStreamEvent` was expanded to be a union of `ModelResponseStreamEvent` and `HandleResponseEvent`, simplifying the `event_stream_handler` function signature. Existing code accepting `AgentStreamEvent | HandleResponseEvent` will continue to work.

**v0.7.6 (2025-08-26)**

The following breaking change was inadvertently released in a patch version rather than a minor version:

See [#2670](https://github.com/pydantic/pydantic-ai/pull/2670) - `TenacityTransport` and `AsyncTenacityTransport` now require the use of `pydantic_ai.retries.RetryConfig` (which is just a `TypedDict` containing the kwargs to `tenacity.retry`) instead of `tenacity.Retrying` or `tenacity.AsyncRetrying`.

**v0.7.0 (2025-08-12)**

See [#2458](https://github.com/pydantic/pydantic-ai/pull/2458) - `pydantic_ai.models.StreamedResponse` now yields a `FinalResultEvent` along with the existing `PartStartEvent` and `PartDeltaEvent`. If you're using `pydantic_ai.direct.model_request_stream` or `pydantic_ai.direct.model_request_stream_sync`, you may need to update your code to account for this.

See [#2458](https://github.com/pydantic/pydantic-ai/pull/2458) - `pydantic_ai.models.Model.request_stream` now receives a `run_context` argument. If you've implemented a custom `Model` subclass, you will need to account for this.

See [#2458](https://github.com/pydantic/pydantic-ai/pull/2458) - `pydantic_ai.models.StreamedResponse` now requires a `model_request_parameters` field and constructor argument. If you've implemented a custom `Model` subclass and implemented `request_stream`, you will need to account for this.

**v0.6.0 (2025-08-06)**

This release was meant to clean some old deprecated code, so we can get a step closer to V1.

See [#2440](https://github.com/pydantic/pydantic-ai/pull/2440) - The `next` method was removed from the `Graph` class. Use `async with graph.iter(...) as run: run.next()` instead.

See [#2441](https://github.com/pydantic/pydantic-ai/pull/2441) - The `result_type`, `result_tool_name` and `result_tool_description` arguments were removed from the `Agent` class. Use `output_type` instead.

See [#2441](https://github.com/pydantic/pydantic-ai/pull/2441) - The `result_retries` argument was also removed from the `Agent` class. Use `output_retries` instead.

See [#2443](https://github.com/pydantic/pydantic-ai/pull/2443) - The `data` property was removed from the `FinalResult` class. Use `output` instead.

See [#2445](https://github.com/pydantic/pydantic-ai/pull/2445) - The `get_data` and `validate_structured_result` methods were removed from the `StreamedRunResult` class. Use `get_output` and `validate_structured_output` instead.

See [#2446](https://github.com/pydantic/pydantic-ai/pull/2446) - The `format_as_xml` function was moved to the `pydantic_ai.format_as_xml` module. Import it via `from pydantic_ai import format_as_xml` instead.

See [#2451](https://github.com/pydantic/pydantic-ai/pull/2451) - Removed deprecated `Agent.result_validator` method, `Agent.last_run_messages` property, `AgentRunResult.data` property, and `result_tool_return_content` parameters from result classes.

**v0.5.0 (2025-08-04)**

See [#2388](https://github.com/pydantic/pydantic-ai/pull/2388) - The `source` field of an `EvaluationResult` is now of type `EvaluatorSpec` rather than the actual source `Evaluator` instance, to help with serialization/deserialization.

See [#2163](https://github.com/pydantic/pydantic-ai/pull/2163) - The `EvaluationReport.print` and `EvaluationReport.console_table` methods now require most arguments be passed by keyword.

**v0.4.0 (2025-07-08)**

See [#1799](https://github.com/pydantic/pydantic-ai/pull/1799) - Pydantic Evals `EvaluationReport` and `ReportCase` are now generic dataclasses instead of Pydantic models. If you were serializing them using `model_dump()`, you will now need to use the `EvaluationReportAdapter` and `ReportCaseAdapter` type adapters instead.

See [#1507](https://github.com/pydantic/pydantic-ai/pull/1507) - The `ToolDefinition` `description` argument is now optional and the order of positional arguments has changed from `name, description, parameters_json_schema, ...` to `name, parameters_json_schema, description, ...` to account for this.

**v0.3.0 (2025-06-18)**

See [#1142](https://github.com/pydantic/pydantic-ai/pull/1142) -- Adds support for thinking parts.

We now convert the thinking blocks (`"<think>..."</think>"`) in provider specific text parts to Pydantic AI `ThinkingPart`s. Also, as part of this release, we made the choice to not send back the `ThinkingPart`s to the provider - the idea is to save costs on behalf of the user. In the future, we intend to add a setting to customize this behavior.

**v0.2.0 (2025-05-12)**

See [#1647](https://github.com/pydantic/pydantic-ai/pull/1647) -- usage makes sense as part of `ModelResponse`, and could be really useful in "messages" (really a sequence of requests and response). In this PR:

-   Adds `usage` to `ModelResponse` (field has a default factory of `Usage()` so it'll work to load data that doesn't have usage)
-   changes the return type of `Model.request` to just `ModelResponse` instead of `tuple[ModelResponse, Usage]`

**v0.1.0 (2025-04-15)**

See [#1248](https://github.com/pydantic/pydantic-ai/pull/1248) -- the attribute/parameter name `result` was renamed to `output` in many places. Hopefully all changes keep a deprecated attribute or parameter with the old name, so you should get many deprecation warnings.

See [#1484](https://github.com/pydantic/pydantic-ai/pull/1484) -- `format_as_xml` was moved and made available to import from the package root, e.g. `from pydantic_ai import format_as_xml`.

## Full Changelog

For the full changelog, see [GitHub Releases](https://github.com/pydantic/pydantic-ai/releases).

---

# [Contributing](https://pydantic.dev/docs/ai/project/contributing/)

# Contributing

We'd love you to contribute to Pydantic AI!

## How we work -- the short version

Pydantic AI is maintained by a small team. We set our own priorities based on what benefits the most users, and we work through issues and PRs in that order -- not in the order they arrive.

-   **Found a bug?** Open an issue with a clear description and a minimal reproducible example. Including a [Logfire](https://logfire.pydantic.dev/) trace link helps us debug dramatically faster.
-   **Want a feature or API change?** Open an issue describing the problem you're solving. Do not start with code.
-   **Want to help build a feature?** Comment on the issue explaining why you need it and what context you bring. We call this being a "champion" -- more on that below.
-   **Have a fix or code to share?** Make sure a maintainer has agreed to the approach on the issue and assigned you. Then open a PR.

The rest of this page explains why we work this way and what to expect.

## Before you write code

For anything non-trivial, align with a maintainer on the approach before writing code. A pre-aligned PR is much faster to land than one we're seeing cold.

### Trivial fixes

Typos, broken links, small doc improvements, obvious one-line fixes: just open a PR. No issue needed.

### Bug fixes

If the fix could reasonably go more than one way, or you're unsure it's actually a bug: open an issue first. Include a minimal reproducible example and ideally a [Logfire trace link](https://logfire.pydantic.dev/) showing the problem. For well-scoped bugs, we may generate a fix internally -- the most valuable thing you can do is file a clear report and then validate that the fix works for your use case.

### Features, integrations, or API changes

Before writing code, ask whether the change needs to live in core at all. Most new agent behaviors belong in [**Pydantic AI Harness**](https://github.com/pydantic/pydantic-ai-harness), the official capability library -- not in this repo. Pydantic AI core is for the agent loop, model providers, and capabilities that require model-specific support or are fundamental to the agent experience. Standalone capabilities -- guardrails, memory, context management, file system access, etc. -- belong in the harness, where they can iterate faster. See [What goes where?](/docs/ai/harness/overview#what-goes-where) for the full distinction.

**If your idea is a capability**, open an issue on [pydantic-ai-harness](https://github.com/pydantic/pydantic-ai-harness/issues) instead. You can also publish capabilities as your own package using the `pydantic-ai-<name>` convention -- see [Publishing capability packages](/docs/ai/guides/extensibility#publishing-capability-packages). Once a capability has real users and a stable shape, we can talk about upstreaming to harness or core.

If it does belong in core:

1.  **Search first.** If an existing issue covers your need, comment there. If the closest match is only related, open a new issue and link it.
2.  **Describe the problem, not just the solution.** Tell us what you're building, what's blocking you, and what you've tried. This context matters more than code.
3.  **Propose a plan before building.** Post the shape of the solution on the issue, or open a draft PR with just a `PLAN.md`. For larger features, we do short video calls with contributors to iterate on the design -- a 20-minute call often saves weeks of async review cycles.
4.  **Wait for assignment.** A maintainer needs to agree on the approach and assign the issue to you before you open a PR. Unassigned PRs may be auto-closed.

Caution

Writing a large feature PR without prior alignment is the most common way for a contribution to stall or be closed.

## Champions

A "champion" is someone who needs a feature, has context on the problem, and is willing to invest time to help us get it right. If you want to champion a feature:

-   Comment on the issue explaining: what you're building, why you need this, and what you can contribute (domain knowledge, testing, validation).
-   We prioritize features where one or more champions with production use cases have stepped up. A feature with no champion stays in the backlog until either we prioritize it ourselves or someone with real context shows up.
-   Being a champion doesn't mean writing the code. It means shaping the plan and validating the result. For significant features, we'll set up a call to iterate on the design together.

Champions are credited as co-authors when the feature ships.

## What to expect during review

### We review PRs in our priority order, not submission order

We do not automatically triage every new PR. PRs on issues we have not pre-aligned on are not in our review queue, regardless of how well written they are. If no maintainer has agreed to the change on an issue and assigned it to you, assume we have not seen your PR.

Even for PRs with code we've previously engaged with: we treat all contributed code as a starting point, not a finished product. We review and prioritize PRs based on the feature's importance to the project, not on how much effort went into the code. This is a change from how open source traditionally worked, and we'd rather be honest about it than leave PRs sitting with no signal.

**If you want to know where your PR stands**, the best thing to do is ping `#pydantic-ai` on [Pydantic Slack](https://logfire.pydantic.dev/docs/join-slack/).

### We may rewrite or supersede your code

We treat contributed code as illustrative: a starting point that shows the shape of the change and proves the approach works, not the final form we merge. The most useful thing you can give us for a non-trivial change is a plan plus a working example -- not a polished, merge-ready implementation.

On any PR, we may push commits to your branch, open a follow-up PR that supersedes yours, or rewrite from scratch. For security reasons, we lean toward rewriting contributed code rather than merging as-is. You will still be credited as the original author.

Please don't spend effort chasing green CI, addressing every automated review comment, or rebasing for merge conflicts on a PR we haven't pre-aligned on. If we take the change forward, that polish gets thrown away when we rewrite. Get the approach working, then stop and ping us on Slack.

### Automated review is advisory, not a gate

PRs are automatically reviewed by Devin and our own tooling. These reviews are advisory:

-   A bot approval does not mean your PR is ready to merge. Only a human maintainer's review counts.
-   A bot finding does not mean you must act on it. If you disagree, say so.
-   If automated review is generating noise on your PR, tell us. We use that feedback to retune the tooling.

### Priority

We receive far more contributions than we can review, and we focus where it has the most impact. We cannot promise to get to every PR, even good ones, and we'd rather say so up front than leave your work open indefinitely with no signal.

How we weigh priorities:

-   **User demand** -- features that more users need get priority. Champion-backed features with production use cases outrank speculative additions.
-   **Provider significance** -- work that affects frontier providers (Anthropic, OpenAI, Google) or providers we know are heavily used gets priority. A model integration for a niche provider will wait; a fix for Anthropic won't.
-   **Roadmap alignment** -- features that align with our current focus areas get priority. Right now that includes the capabilities/hooks API, provider-adaptive tools, and the [Pydantic AI Harness](https://github.com/pydantic/pydantic-ai-harness) capability library.
-   **Capabilities over core** -- features that could live as a [capability](/docs/ai/core-concepts/capabilities) should go to [Pydantic AI Harness](https://github.com/pydantic/pydantic-ai-harness) or ship as your own package -- that's often the fastest path. Once it has traction, come back and we can talk about upstreaming.

## If your PR or issue has gone quiet

1.  Ping `#pydantic-ai` on [Pydantic Slack](https://logfire.pydantic.dev/docs/join-slack/) with a link.
2.  Say what you need: "Can you take a look?", "I'm blocked -- is this on your radar?", or "Should I close this?" are all fine.
3.  If you've been waiting weeks without any human response, flag it. That's a process failure on our side and we want to know.

## Installation and Setup

Clone your fork and cd into the repo directory

Terminal

```bash
git clone git@github.com:<your username>/pydantic-ai.git
cd pydantic-ai
```

Install `uv` (version 0.4.30 or later) and `pre-commit`:

-   [`uv` install docs](https://docs.astral.sh/uv/getting-started/installation/)
-   [`pre-commit` install docs](https://pre-commit.com/#install)

To install `pre-commit` you can run the following command:

Terminal

```bash
uv tool install pre-commit
```

Install `pydantic-ai`, all dependencies and pre-commit hooks

Terminal

```bash
make install
```

## Running Tests etc.

We use `make` to manage most commands you'll need to run.

For details on available commands, run:

Terminal

```bash
make help
```

To run code formatting, linting, static type checks, and tests with coverage report generation, run:

Terminal

```bash
make
```

## Documentation Changes

To run the documentation page locally, run:

Terminal

```bash
uv run mkdocs serve
```

## Rules for adding new models to Pydantic AI

To avoid an excessive workload for the maintainers of Pydantic AI, we can't accept all model contributions, so we're setting the following rules for when we'll accept new models and when we won't. This should hopefully reduce the chances of disappointment and wasted work.

-   To add a new model with an extra dependency, that dependency needs > 500k monthly downloads from PyPI consistently over 3 months or more
-   To add a new model which uses another models logic internally and has no extra dependencies, that model's GitHub org needs > 20k stars in total
-   For any other model that's just a custom URL and API key, we're happy to add a one-paragraph description with a link and instructions on the URL to use
-   For any other model that requires more logic, we recommend you release your own Python package `pydantic-ai-xxx`, which depends on [`pydantic-ai-slim`](/docs/ai/overview/install#slim-install) and implements a model that inherits from our [`Model`](/docs/ai/api/models/base/#pydantic_ai.models.Model) ABC

If you're unsure about adding a model, please [create an issue](https://github.com/pydantic/pydantic-ai/issues).

---

# [Version Policy](https://pydantic.dev/docs/ai/project/version-policy/)

# Version Policy

## Version Policy

We will not intentionally make breaking changes in minor releases of V1. V2 will be released in April 2026 at the earliest, 6 months after the release of V1 in September 2025.

Once we release V2, we'll continue to provide security fixes for V1 for another 6 months minimum, so you have time to upgrade your applications.

Functionality marked as deprecated will not be removed until V2.

Of course, some apparently safe changes and bug fixes will inevitably break some users' code -- obligatory link to [xkcd](https://xkcd.com/1172/).

The following changes will **NOT** be considered breaking changes, and may occur in minor releases:

-   Bug fixes that may result in existing code breaking, provided that such code was relying on undocumented features/constructs/assumptions.
-   Adding new [message parts](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages), [stream events](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.AgentStreamEvent), or optional fields (including fields with default values) on existing message (part) and event types. Always code defensively when consuming message parts or event streams, and use the [`ModelMessagesTypeAdapter`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelMessagesTypeAdapter) to (de)serialize message histories.
-   Changing OpenTelemetry span attributes. Because different [observability platforms](/docs/ai/integrations/logfire#using-opentelemetry) support different versions of the [OpenTelemetry Semantic Conventions for Generative AI systems](https://opentelemetry.io/docs/specs/semconv/gen-ai/), Pydantic AI lets you configure the [instrumentation version](/docs/ai/integrations/logfire#configuring-data-format), but the default version may change in a minor release. Span attributes for [Pydantic Evals](/docs/ai/evals/) may also change as we iterate on Evals support in [Pydantic Logfire](https://logfire.pydantic.dev/docs/guides/web-ui/evals/).
-   Changing how `__repr__` behaves, even of public classes.

In all cases we will aim to minimize churn and do so only when justified by the increase of quality of Pydantic AI for users.

## Beta Features

At Pydantic, we like to move quickly and innovate! To that end, minor releases may introduce beta features (indicated by a `beta` module) that are active works in progress. While in its beta phase, a feature's API and behaviors may not be stable, and it's very possible that changes made to the feature will not be backward-compatible. We aim to move beta features out of beta within a few months after initial release, once users have had a chance to provide feedback and test the feature in production.

## Support for Python versions

Pydantic will drop support for a Python version when the following conditions are met:

-   The Python version has reached its [expected end of life](https://devguide.python.org/versions/).
-   less than 5% of downloads of the most recent minor release are using that version.

---

# [Built-in Tools](https://pydantic.dev/docs/ai/tools-toolsets/builtin-tools/)

# Built-in Tools

Built-in tools are native tools provided by LLM providers that can be used to enhance your agent's capabilities. Unlike [common tools](/docs/ai/tools-toolsets/common-tools), which are custom implementations that Pydantic AI executes, built-in tools are executed directly by the model provider.

Pydantic AI supports the following built-in tools:

-   **[`WebSearchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.WebSearchTool)**: Allows agents to search the web
-   **[`XSearchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.XSearchTool)**: Allows agents to search X/Twitter (xAI only)
-   **[`CodeExecutionTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.CodeExecutionTool)**: Enables agents to execute code in a secure environment
-   **[`ImageGenerationTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.ImageGenerationTool)**: Enables agents to generate images
-   **[`WebFetchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.WebFetchTool)**: Enables agents to fetch web pages
-   **[`MemoryTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.MemoryTool)**: Enables agents to use memory
-   **[`MCPServerTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.MCPServerTool)**: Enables agents to use remote MCP servers with communication handled by the model provider
-   **[`FileSearchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.FileSearchTool)**: Enables agents to search through uploaded files using vector search (RAG)

These tools are passed to the agent via the `builtin_tools` parameter and are executed by the model provider's infrastructure.

Provider Support

Not all model providers support built-in tools. If you use a built-in tool with an unsupported provider, Pydantic AI will raise a [`UserError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UserError) when you try to run the agent.

If a provider supports a built-in tool that is not currently supported by Pydantic AI, please file an issue.

Provider-adaptive capabilities

For a higher-level, model-agnostic approach, consider the [provider-adaptive tool capabilities](/docs/ai/core-concepts/capabilities#provider-adaptive-tools): [`WebSearch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebSearch), [`WebFetch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebFetch), [`ImageGeneration`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ImageGeneration), and [`MCP`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.MCP). These automatically use the model's native tool when supported and fall back to a local implementation, so your agent works across providers without code changes.

## Dynamic Configuration

Sometimes you need to configure a built-in tool dynamically based on the [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) (e.g., user dependencies), or conditionally omit it. You can achieve this by passing a function to `builtin_tools` that takes [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as an argument and returns an [`AbstractBuiltinTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.AbstractBuiltinTool) or `None`.

This is particularly useful for tools like [`WebSearchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.WebSearchTool) where you might want to set the user's location based on the current request, or disable the tool if the user provides no location.

dynamic\_builtin\_tool.py

```python
from pydantic_ai import Agent, RunContext, WebSearchTool


async def prepared_web_search(ctx: RunContext[dict]) -> WebSearchTool | None:
    if not ctx.deps.get('location'):
        return None

    return WebSearchTool(
        user_location={'city': ctx.deps['location']},
    )

agent = Agent(
    'openai-responses:gpt-5.2',
    builtin_tools=[prepared_web_search],
    deps_type=dict,
)

# Run with location
result = agent.run_sync(
    'What is the weather like?',
    deps={'location': 'London'},
)
print(result.output)
#> It's currently raining in London.

# Run without location (tool will be omitted)
result = agent.run_sync(
    'What is the capital of France?',
    deps={'location': None},
)
print(result.output)
#> The capital of France is Paris.
```

## Web Search Tool

Tip

For a model-agnostic approach with automatic local fallback, see the [`WebSearch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebSearch) [capability](/docs/ai/core-concepts/capabilities#provider-adaptive-tools).

The [`WebSearchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.WebSearchTool) allows your agent to search the web, making it ideal for queries that require up-to-date data.

### Provider Support

Provider

Supported

Notes

OpenAI Responses

✅

Full feature support. To include search results on the [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart) that's available via [`ModelResponse.builtin_tool_calls`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse.builtin_tool_calls), enable the [`OpenAIResponsesModelSettings.openai_include_web_search_sources`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings.openai_include_web_search_sources) [model setting](/docs/ai/core-concepts/agent#model-run-settings).

Anthropic

✅

Full feature support

Google

✅

No parameter support. No [`BuiltinToolCallPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolCallPart) or [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart) is generated when streaming. Using built-in tools and function tools (including [output tools](/docs/ai/core-concepts/output#tool-output)) at the same time is not supported; to use structured output, use [`PromptedOutput`](/docs/ai/core-concepts/output#prompted-output) instead.

xAI

✅

Supports `blocked_domains` and `allowed_domains` parameters.

Groq

✅

Limited parameter support. To use web search capabilities with Groq, you need to use the [compound models](https://console.groq.com/docs/compound).

OpenRouter

✅

Web search via [plugins](https://openrouter.ai/docs/features/web-search). Supports `search_context_size`. Uses native search for supported providers (OpenAI, Anthropic, Perplexity, xAI), Exa for others.

OpenAI Chat Completions

❌

Not supported

Bedrock

❌

Not supported

Mistral

❌

Not supported

Cohere

❌

Not supported

HuggingFace

❌

Not supported

Outlines

❌

Not supported

### Usage

web\_search\_anthropic.py

```python
from pydantic_ai import Agent, WebSearchTool

agent = Agent('anthropic:claude-sonnet-4-6', builtin_tools=[WebSearchTool()])

result = agent.run_sync('Give me a sentence with the biggest news in AI this week.')
print(result.output)
#> Scientists have developed a universal AI detector that can identify deepfake videos.
```

_(This example is complete, it can be run "as is")_

With OpenAI, you must use their Responses API to access the web search tool.

web\_search\_openai.py

```python
from pydantic_ai import Agent, WebSearchTool

agent = Agent('openai-responses:gpt-5.2', builtin_tools=[WebSearchTool()])

result = agent.run_sync('Give me a sentence with the biggest news in AI this week.')
print(result.output)
#> Scientists have developed a universal AI detector that can identify deepfake videos.
```

_(This example is complete, it can be run "as is")_

### Configuration Options

The `WebSearchTool` supports several configuration parameters:

web\_search\_configured.py

```python
from pydantic_ai import Agent, WebSearchTool, WebSearchUserLocation

agent = Agent(
    'anthropic:claude-sonnet-4-6',
    builtin_tools=[
        WebSearchTool(
            search_context_size='high',
            user_location=WebSearchUserLocation(
                city='San Francisco',
                country='US',
                region='CA',
                timezone='America/Los_Angeles',
            ),
            blocked_domains=['example.com', 'spam-site.net'],
            allowed_domains=None,  # Cannot use both blocked_domains and allowed_domains with Anthropic
            max_uses=5,  # Anthropic only: limit tool usage
        )
    ],
)

result = agent.run_sync('Use the web to get the current time.')
print(result.output)
#> In San Francisco, it's 8:21:41 pm PDT on Wednesday, August 6, 2025.
```

_(This example is complete, it can be run "as is")_

#### Provider Support

Parameter

OpenAI

Anthropic

xAI

Groq

OpenRouter

`search_context_size`

✅

❌

❌

❌

✅

`user_location`

✅

✅

❌

❌

❌

`blocked_domains`

❌

✅

✅

✅

❌

`allowed_domains`

✅

✅

✅

✅

❌

`max_uses`

❌

✅

❌

❌

❌

Anthropic Domain Filtering

With Anthropic, you can only use either `blocked_domains` or `allowed_domains`, not both.

## X Search Tool

The [`XSearchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.XSearchTool) allows your agent to search X/Twitter for real-time posts and content. This tool is exclusive to xAI models. See the [xAI X Search documentation](https://docs.x.ai/developers/tools/x-search) for more details.

### Usage

x\_search\_xai.py

```py
from pydantic_ai import Agent, XSearchTool

agent = Agent('xai:grok-4-1-fast', builtin_tools=[XSearchTool()])

result = agent.run_sync('What are people saying about AI on X today?')
print(result.output)
#> There's a lot of excitement about new AI models being released...
```

_(This example is complete, it can be run "as is")_

### Configuration Options

The `XSearchTool` supports several configuration parameters:

x\_search\_configured.py

```py
from datetime import datetime

from pydantic_ai import Agent, XSearchTool

agent = Agent(
    'xai:grok-4-1-fast',
    builtin_tools=[
        XSearchTool(
            allowed_x_handles=['OpenAI', 'AnthropicAI', 'dasfacc'],
            from_date=datetime(2024, 1, 1),
            to_date=datetime(2024, 12, 31),
            enable_image_understanding=True,
            enable_video_understanding=True,
        )
    ],
)

result = agent.run_sync('What have AI companies been posting about?')
print(result.output)
"""
OpenAI announced their latest model updates, while Anthropic shared research on AI safety...
"""
```

_(This example is complete, it can be run "as is")_

Handle Filtering

You can only use one of `allowed_x_handles` or `excluded_x_handles`, not both. Each list is limited to 10 handles maximum.

Including raw search results

By default, xAI only returns the model's text summary of the search. To get programmatic access to the underlying posts, sources, and metadata, set `include_x_search_output=True` on [`XSearchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.XSearchTool) (analogous to [`OpenAIResponsesModelSettings.openai_include_web_search_sources`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings.openai_include_web_search_sources) for OpenAI web search). The raw results are then available on the [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart) exposed via [`ModelResponse.builtin_tool_calls`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse.builtin_tool_calls). As an alternative, you can enable it globally via the [`XaiModelSettings.xai_include_x_search_output`](/docs/ai/api/models/xai/#pydantic_ai.models.xai.XaiModelSettings.xai_include_x_search_output) [model setting](/docs/ai/core-concepts/agent#model-run-settings). See the [xAI docs](/docs/ai/models/xai#x-search) for the recommended `XSearch` capability-based approach.

## Code Execution Tool

The [`CodeExecutionTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.CodeExecutionTool) enables your agent to execute code in a secure environment, making it perfect for computational tasks, data analysis, and mathematical operations.

### Provider Support

Provider

Supported

Notes

OpenAI

✅

To include code execution output on the [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart) that's available via [`ModelResponse.builtin_tool_calls`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse.builtin_tool_calls), enable the [`OpenAIResponsesModelSettings.openai_include_code_execution_outputs`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings.openai_include_code_execution_outputs) [model setting](/docs/ai/core-concepts/agent#model-run-settings). If the code execution generated images, like charts, they will be available on [`ModelResponse.images`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse.images) as [`BinaryImage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryImage) objects. The generated image can also be used as [image output](/docs/ai/core-concepts/output#image-output) for the agent run.

Google

✅

Using built-in tools and function tools (including [output tools](/docs/ai/core-concepts/output#tool-output)) at the same time is not supported; to use structured output, use [`PromptedOutput`](/docs/ai/core-concepts/output#prompted-output) instead.

Anthropic

✅

Available on compatible Anthropic models. Pydantic AI selects a compatible code execution tool version automatically; see [Anthropic code execution tool version](/docs/ai/models/anthropic#code-execution-tool-version) to override it.

xAI

✅

Full feature support.

Groq

❌

Bedrock

✅

Only available for Nova 2.0 models.

Mistral

❌

Cohere

❌

HuggingFace

❌

Outlines

❌

### Usage

code\_execution\_basic.py

```python
from pydantic_ai import Agent, CodeExecutionTool

agent = Agent('anthropic:claude-sonnet-4-6', builtin_tools=[CodeExecutionTool()])

result = agent.run_sync('Calculate the factorial of 15.')
print(result.output)
#> The factorial of 15 is **1,307,674,368,000**.
print(result.response.builtin_tool_calls)
"""
[
    (
        BuiltinToolCallPart(
            tool_name='code_execution',
            args={'command': 'python3 -c "import math; print(math.factorial(15))"'},
            tool_call_id='srvtoolu_017qRH1J3XrhnpjP2XtzPCmJ',
            provider_name='anthropic',
            provider_details={'anthropic_tool_name': 'bash_code_execution'},
        ),
        BuiltinToolReturnPart(
            tool_name='code_execution',
            content={
                'content': [],
                'return_code': 0,
                'stderr': '',
                'stdout': '1307674368000\n',
                'type': 'bash_code_execution_result',
            },
            tool_call_id='srvtoolu_017qRH1J3XrhnpjP2XtzPCmJ',
            timestamp=datetime.datetime(...),
            provider_name='anthropic',
            provider_details={'anthropic_tool_name': 'bash_code_execution'},
        ),
    )
]
"""
```

_(This example is complete, it can be run "as is")_

In addition to text output, code execution with OpenAI can generate images as part of their response. Accessing this image via [`ModelResponse.images`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse.images) or [image output](/docs/ai/core-concepts/output#image-output) requires the [`OpenAIResponsesModelSettings.openai_include_code_execution_outputs`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings.openai_include_code_execution_outputs) [model setting](/docs/ai/core-concepts/agent#model-run-settings) to be enabled.

code\_execution\_openai.py

```python
from pydantic_ai import Agent, BinaryImage, CodeExecutionTool
from pydantic_ai.models.openai import OpenAIResponsesModelSettings

agent = Agent(
    'openai-responses:gpt-5.2',
    builtin_tools=[CodeExecutionTool()],
    output_type=BinaryImage,
    model_settings=OpenAIResponsesModelSettings(openai_include_code_execution_outputs=True),
)

result = agent.run_sync('Generate a chart of y=x^2 for x=-5 to 5.')
assert isinstance(result.output, BinaryImage)
```

_(This example is complete, it can be run "as is")_

## Image Generation Tool

Tip

For a model-agnostic approach with automatic local fallback, see the [`ImageGeneration`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ImageGeneration) [capability](/docs/ai/core-concepts/capabilities#provider-adaptive-tools).

The [`ImageGenerationTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.ImageGenerationTool) enables your agent to generate images.

### Provider Support

Provider

Supported

Notes

OpenAI Responses

✅

Full feature support. Only supported by models newer than `gpt-5.2`. Metadata about the generated image, like the [`revised_prompt`](https://platform.openai.com/docs/guides/tools-image-generation#revised-prompt) sent to the underlying image model, is available on the [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart) that's available via [`ModelResponse.builtin_tool_calls`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse.builtin_tool_calls).

Google

✅

Limited parameter support. Only supported by [image generation models](https://ai.google.dev/gemini-api/docs/image-generation) like `gemini-3-pro-image-preview` and `gemini-3-pro-image-preview`. These models do not support [function tools](/docs/ai/tools-toolsets/tools) and will always have the option of generating images, even if this built-in tool is not explicitly specified.

Anthropic

❌

xAI

❌

Groq

❌

Bedrock

❌

Mistral

❌

Cohere

❌

HuggingFace

❌

### Usage

Generated images are available on [`ModelResponse.images`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse.images) as [`BinaryImage`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BinaryImage) objects:

image\_generation\_openai.py

```python
from pydantic_ai import Agent, BinaryImage, ImageGenerationTool

agent = Agent('openai-responses:gpt-5.2', builtin_tools=[ImageGenerationTool()])

result = agent.run_sync('Tell me a two-sentence story about an axolotl with an illustration.')
print(result.output)
"""
Once upon a time, in a hidden underwater cave, lived a curious axolotl named Pip who loved to explore. One day, while venturing further than usual, Pip discovered a shimmering, ancient coin that granted wishes!
"""

assert isinstance(result.response.images[0], BinaryImage)
```

_(This example is complete, it can be run "as is")_

Image generation with Google [image generation models](https://ai.google.dev/gemini-api/docs/image-generation) does not require the `ImageGenerationTool` built-in tool to be explicitly specified:

image\_generation\_google.py

```python
from pydantic_ai import Agent, BinaryImage

agent = Agent('google-gla:gemini-3-pro-image-preview')

result = agent.run_sync('Tell me a two-sentence story about an axolotl with an illustration.')
print(result.output)
"""
Once upon a time, in a hidden underwater cave, lived a curious axolotl named Pip who loved to explore. One day, while venturing further than usual, Pip discovered a shimmering, ancient coin that granted wishes!
"""

assert isinstance(result.response.images[0], BinaryImage)
```

_(This example is complete, it can be run "as is")_

The `ImageGenerationTool` can be used together with `output_type=BinaryImage` to get [image output](/docs/ai/core-concepts/output#image-output). If the `ImageGenerationTool` built-in tool is not explicitly specified, it will be enabled automatically:

image\_generation\_output.py

```python
from pydantic_ai import Agent, BinaryImage

agent = Agent('openai-responses:gpt-5.2', output_type=BinaryImage)

result = agent.run_sync('Generate an image of an axolotl.')
assert isinstance(result.output, BinaryImage)
```

_(This example is complete, it can be run "as is")_

### Configuration Options

The `ImageGenerationTool` supports several configuration parameters:

image\_generation\_configured.py

```python
from pydantic_ai import Agent, BinaryImage, ImageGenerationTool

agent = Agent(
    'openai-responses:gpt-5.2',
    builtin_tools=[
        ImageGenerationTool(
            action='generate',
            background='transparent',
            input_fidelity='high',
            model='gpt-image-2',
            moderation='low',
            output_compression=100,
            output_format='png',
            partial_images=3,
            quality='high',
            size='1024x1024',
        )
    ],
    output_type=BinaryImage,
)

result = agent.run_sync('Generate an image of an axolotl.')
assert isinstance(result.output, BinaryImage)
```

_(This example is complete, it can be run "as is")_

OpenAI Responses models also respect the `aspect_ratio` parameter. Because the OpenAI API only exposes discrete image sizes, Pydantic AI maps `'1:1'` -> `1024x1024`, `'2:3'` -> `1024x1536`, and `'3:2'` -> `1536x1024`. Providing any other aspect ratio results in an error, and if you also set `size` it must match the computed value.

The OpenAI Responses image generation tool defaults to `action='auto'`, where the model decides whether to generate a new image or edit one already in context. Use `action='generate'` or `action='edit'` to force either behavior. You can also set `model` to select the underlying image generation model used by the tool, for example `model='gpt-image-2'`; this does not change the agent's conversational model.

To control the aspect ratio when using Gemini image models, include the `ImageGenerationTool` explicitly:

image\_generation\_google\_aspect\_ratio.py

```python
from pydantic_ai import Agent, BinaryImage, ImageGenerationTool

agent = Agent(
    'google-gla:gemini-3-pro-image-preview',
    builtin_tools=[ImageGenerationTool(aspect_ratio='16:9')],
    output_type=BinaryImage,
)

result = agent.run_sync('Generate a wide illustration of an axolotl city skyline.')
assert isinstance(result.output, BinaryImage)
```

_(This example is complete, it can be run "as is")_

To control the image resolution with Google image generation models (starting with Gemini 3 Pro Image), use the `size` parameter:

image\_generation\_google\_resolution.py

```python
from pydantic_ai import Agent, BinaryImage, ImageGenerationTool

agent = Agent(
    'google-gla:gemini-3-pro-image-preview',
    builtin_tools=[ImageGenerationTool(aspect_ratio='16:9', size='4K')],
    output_type=BinaryImage,
)

result = agent.run_sync('Generate a high-resolution wide landscape illustration of an axolotl.')
assert isinstance(result.output, BinaryImage)
```

_(This example is complete, it can be run "as is")_

For more details, check the [API documentation](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.ImageGenerationTool).

#### Provider Support

Parameter

OpenAI

Google

`action`

✅ (auto (default), generate, edit)

❌

`background`

✅

❌

`input_fidelity`

✅

❌

`moderation`

✅

❌

`model`

✅ (gpt-image-2, gpt-image-1.5, gpt-image-1, gpt-image-1-mini, or another OpenAI image model ID)

❌

`output_compression`

✅ (100 (default), jpeg or webp only)

✅ (75 (default), jpeg only, Vertex AI only)

`output_format`

✅

✅ (Vertex AI only)

`partial_images`

✅

❌

`quality`

✅

❌

`size`

✅ (auto (default), 1024x1024, 1024x1536, 1536x1024)

✅ (512, 1K (default), 2K, 4K)

`aspect_ratio`

✅ (1:1, 2:3, 3:2)

✅ (1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9)

Notes

-   **OpenAI**: `auto` lets the model select the value.
-   **Google (Vertex AI)**: Setting `output_compression` will default `output_format` to `jpeg` if not specified.

## Web Fetch Tool

Tip

For a model-agnostic approach with automatic local fallback, see the [`WebFetch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebFetch) [capability](/docs/ai/core-concepts/capabilities#provider-adaptive-tools).

The [`WebFetchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.WebFetchTool) enables your agent to pull URL contents into its context, allowing it to pull up-to-date information from the web.

### Provider Support

Provider

Supported

Notes

Anthropic

✅

Full feature support. Uses Anthropic's [Web Fetch Tool](https://docs.claude.com/en/docs/agents-and-tools/tool-use/web-fetch-tool) internally to retrieve URL contents.

Google

✅

No parameter support. The limits are fixed at 20 URLs per request with a maximum of 34MB per URL. Using built-in tools and function tools (including [output tools](/docs/ai/core-concepts/output#tool-output)) at the same time is not supported; to use structured output, use [`PromptedOutput`](/docs/ai/core-concepts/output#prompted-output) instead.

xAI

❌

Web browsing is implemented as part of [`WebSearchTool`](#web-search-tool) with xAI.

OpenAI

❌

Groq

❌

Bedrock

❌

Mistral

❌

Cohere

❌

HuggingFace

❌

Outlines

❌

### Usage

web\_fetch\_basic.py

```python
from pydantic_ai import Agent, WebFetchTool

agent = Agent('google-gla:gemini-3-flash-preview', builtin_tools=[WebFetchTool()])

result = agent.run_sync('What is this? https://ai.pydantic.dev')
print(result.output)
#> A Python agent framework for building Generative AI applications.
```

_(This example is complete, it can be run "as is")_

### Configuration Options

The `WebFetchTool` supports several configuration parameters:

web\_fetch\_configured.py

```python
from pydantic_ai import Agent, WebFetchTool

agent = Agent(
    'anthropic:claude-sonnet-4-6',
    builtin_tools=[
        WebFetchTool(
            allowed_domains=['ai.pydantic.dev', 'docs.pydantic.dev'],
            max_uses=10,
            enable_citations=True,
            max_content_tokens=50000,
        )
    ],
)

result = agent.run_sync(
    'Compare the documentation at https://ai.pydantic.dev and https://docs.pydantic.dev'
)
print(result.output)
"""
Both sites provide comprehensive documentation for Pydantic projects. ai.pydantic.dev focuses on PydanticAI, a framework for building AI agents, while docs.pydantic.dev covers Pydantic, the data validation library. They share similar documentation styles and both emphasize type safety and developer experience.
"""
```

_(This example is complete, it can be run "as is")_

#### Provider Support

Parameter

Anthropic

Google

`max_uses`

✅

❌

`allowed_domains`

✅

❌

`blocked_domains`

✅

❌

`enable_citations`

✅

❌

`max_content_tokens`

✅

❌

Anthropic Domain Filtering

With Anthropic, you can only use either `blocked_domains` or `allowed_domains`, not both.

## Memory Tool

The [`MemoryTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.MemoryTool) enables your agent to use memory.

### Provider Support

Provider

Supported

Notes

Anthropic

✅

Requires a tool named `memory` to be defined that implements [specific sub-commands](https://docs.claude.com/en/docs/agents-and-tools/tool-use/memory-tool#tool-commands). You can use a subclass of [`anthropic.lib.tools.BetaAbstractMemoryTool`](https://github.com/anthropics/anthropic-sdk-python/blob/main/src/anthropic/lib/tools/_beta_builtin_memory_tool.py) as documented below.

Google

❌

OpenAI

❌

Groq

❌

Bedrock

❌

Mistral

❌

Cohere

❌

HuggingFace

❌

### Usage

The Anthropic SDK provides an abstract [`BetaAbstractMemoryTool`](https://github.com/anthropics/anthropic-sdk-python/blob/main/src/anthropic/lib/tools/_beta_builtin_memory_tool.py) class that you can subclass to create your own memory storage solution (e.g., database, cloud storage, encrypted files, etc.). Their [`LocalFilesystemMemoryTool`](https://github.com/anthropics/anthropic-sdk-python/blob/main/examples/memory/basic.py) example can serve as a starting point.

The following example uses a subclass that hard-codes a specific memory. The bits specific to Pydantic AI are the `MemoryTool` built-in tool and the `memory` tool definition that forwards commands to the `call` method of the `BetaAbstractMemoryTool` subclass.

anthropic\_memory.py

```python
from typing import Any

from anthropic.lib.tools import BetaAbstractMemoryTool
from anthropic.types.beta import (
    BetaMemoryTool20250818CreateCommand,
    BetaMemoryTool20250818DeleteCommand,
    BetaMemoryTool20250818InsertCommand,
    BetaMemoryTool20250818RenameCommand,
    BetaMemoryTool20250818StrReplaceCommand,
    BetaMemoryTool20250818ViewCommand,
)

from pydantic_ai import Agent, MemoryTool


class FakeMemoryTool(BetaAbstractMemoryTool):
    def view(self, command: BetaMemoryTool20250818ViewCommand) -> str:
        return 'The user lives in Mexico City.'

    def create(self, command: BetaMemoryTool20250818CreateCommand) -> str:
        return f'File created successfully at {command.path}'

    def str_replace(self, command: BetaMemoryTool20250818StrReplaceCommand) -> str:
        return f'File {command.path} has been edited'

    def insert(self, command: BetaMemoryTool20250818InsertCommand) -> str:
        return f'Text inserted at line {command.insert_line} in {command.path}'

    def delete(self, command: BetaMemoryTool20250818DeleteCommand) -> str:
        return f'File deleted: {command.path}'

    def rename(self, command: BetaMemoryTool20250818RenameCommand) -> str:
        return f'Renamed {command.old_path} to {command.new_path}'

    def clear_all_memory(self) -> str:
        return 'All memory cleared'

fake_memory = FakeMemoryTool()

agent = Agent('anthropic:claude-sonnet-4-6', builtin_tools=[MemoryTool()])


@agent.tool_plain
def memory(**command: Any) -> Any:
    return fake_memory.call(command)


result = agent.run_sync('Remember that I live in Mexico City')
print(result.output)
"""
Got it! I've recorded that you live in Mexico City. I'll remember this for future reference.
"""

result = agent.run_sync('Where do I live?')
print(result.output)
#> You live in Mexico City.
```

_(This example is complete, it can be run "as is")_

## MCP Server Tool

Tip

For a model-agnostic approach with automatic local fallback, see the [`MCP`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.MCP) [capability](/docs/ai/core-concepts/capabilities#provider-adaptive-tools).

The [`MCPServerTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.MCPServerTool) allows your agent to use remote MCP servers with communication handled by the model provider.

This requires the MCP server to live at a public URL the provider can reach and does not support many of the advanced features of Pydantic AI's agent-side [MCP support](/docs/ai/mcp/client), but can result in optimized context use and caching, and faster performance due to the lack of a round-trip back to Pydantic AI.

### Provider Support

Provider

Supported

Notes

OpenAI Responses

✅

Full feature support. [Connectors](https://platform.openai.com/docs/guides/tools-connectors-mcp#connectors) can be used by specifying a special `x-openai-connector:<connector_id>` URL.

Anthropic

✅

Full feature support

xAI

✅

Full feature support

Google

❌

Not supported

Groq

❌

Not supported

OpenAI Chat Completions

❌

Not supported

Bedrock

❌

Not supported

Mistral

❌

Not supported

Cohere

❌

Not supported

HuggingFace

❌

Not supported

### Usage

mcp\_server\_anthropic.py

```python
from pydantic_ai import Agent, MCPServerTool

agent = Agent(
    'anthropic:claude-sonnet-4-6',
    builtin_tools=[
        MCPServerTool(
            id='deepwiki',
            url='https://mcp.deepwiki.com/mcp',  # (1)
        )
    ]
)

result = agent.run_sync('Tell me about the pydantic/pydantic-ai repo.')
print(result.output)
"""
The pydantic/pydantic-ai repo is a Python agent framework for building Generative AI applications.
"""
```

1.  The [DeepWiki MCP server](https://docs.devin.ai/work-with-devin/deepwiki-mcp) does not require authorization.

_(This example is complete, it can be run "as is")_

With OpenAI, you must use their Responses API to access the MCP server tool:

mcp\_server\_openai.py

```python
from pydantic_ai import Agent, MCPServerTool

agent = Agent(
    'openai-responses:gpt-5.2',
    builtin_tools=[
        MCPServerTool(
            id='deepwiki',
            url='https://mcp.deepwiki.com/mcp',  # (1)
        )
    ]
)

result = agent.run_sync('Tell me about the pydantic/pydantic-ai repo.')
print(result.output)
"""
The pydantic/pydantic-ai repo is a Python agent framework for building Generative AI applications.
"""
```

1.  The [DeepWiki MCP server](https://docs.devin.ai/work-with-devin/deepwiki-mcp) does not require authorization.

_(This example is complete, it can be run "as is")_

### Configuration Options

The `MCPServerTool` supports several configuration parameters for custom MCP servers:

mcp\_server\_configured\_url.py

```python
import os

from pydantic_ai import Agent, MCPServerTool

agent = Agent(
    'openai-responses:gpt-5.2',
    builtin_tools=[
        MCPServerTool(
            id='github',
            url='https://api.githubcopilot.com/mcp/',
            authorization_token=os.getenv('GITHUB_ACCESS_TOKEN', 'mock-access-token'),  # (1)
            allowed_tools=['search_repositories', 'list_commits'],
            description='GitHub MCP server',
            headers={'X-Custom-Header': 'custom-value'},
        )
    ]
)

result = agent.run_sync('Tell me about the pydantic/pydantic-ai repo.')
print(result.output)
"""
The pydantic/pydantic-ai repo is a Python agent framework for building Generative AI applications.
"""
```

1.  The [GitHub MCP server](https://github.com/github/github-mcp-server) requires an authorization token.

For OpenAI Responses, you can use a [connector](https://platform.openai.com/docs/guides/tools-connectors-mcp#connectors) by specifying a special `x-openai-connector:` URL:

_(This example is complete, it can be run "as is")_

mcp\_server\_configured\_connector\_id.py

```python
import os

from pydantic_ai import Agent, MCPServerTool

agent = Agent(
    'openai-responses:gpt-5.2',
    builtin_tools=[
        MCPServerTool(
            id='google-calendar',
            url='x-openai-connector:connector_googlecalendar',
            authorization_token=os.getenv('GOOGLE_API_KEY', 'mock-api-key'), # (1)
        )
    ]
)

result = agent.run_sync('What do I have on my calendar today?')
print(result.output)
#> You're going to spend all day playing with Pydantic AI.
```

1.  OpenAI's Google Calendar connector requires an [authorization token](https://platform.openai.com/docs/guides/tools-connectors-mcp#authorizing-a-connector).

_(This example is complete, it can be run "as is")_

#### Provider Support

Parameter

OpenAI

Anthropic

xAI

`authorization_token`

✅

✅

✅

`allowed_tools`

✅

✅

✅

`description`

✅

❌

✅

`headers`

✅

❌

✅

## File Search Tool

The [`FileSearchTool`](/docs/ai/api/pydantic-ai/builtin_tools/#pydantic_ai.builtin_tools.FileSearchTool) enables your agent to search through uploaded files using vector search, providing a fully managed Retrieval-Augmented Generation (RAG) system. This tool handles file storage, chunking, embedding generation, and context injection into prompts.

### Provider Support

Provider

Supported

Notes

OpenAI Responses

✅

Full feature support. Requires files to be uploaded to vector stores via the [OpenAI Files API](https://platform.openai.com/docs/api-reference/files). To include search results on the [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart) available via [`ModelResponse.builtin_tool_calls`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ModelResponse.builtin_tool_calls), enable the [`OpenAIResponsesModelSettings.openai_include_file_search_results`](/docs/ai/api/models/openai/#pydantic_ai.models.openai.OpenAIResponsesModelSettings.openai_include_file_search_results) [model setting](/docs/ai/core-concepts/agent#model-run-settings).

Google (Gemini)

✅

Requires files to be uploaded via the [Gemini Files API](https://ai.google.dev/gemini-api/docs/files). Files are automatically deleted after 48 hours. Supports up to 2 GB per file and 20 GB per project. Using built-in tools and function tools (including [output tools](/docs/ai/core-concepts/output#tool-output)) at the same time is not supported; to use structured output, use [`PromptedOutput`](/docs/ai/core-concepts/output#prompted-output) instead.

xAI

✅

Mapped to xAI collections search. Requires collection IDs. To include search results on the [`BuiltinToolReturnPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.BuiltinToolReturnPart), enable the [`XaiModelSettings.xai_include_collections_search_output`](/docs/ai/api/models/xai/#pydantic_ai.models.xai.XaiModelSettings.xai_include_collections_search_output) [model setting](/docs/ai/core-concepts/agent#model-run-settings).

Google (Vertex AI)

❌

Anthropic

❌

Not supported

Groq

❌

Not supported

OpenAI Chat Completions

❌

Not supported

Bedrock

❌

Not supported

Mistral

❌

Not supported

Cohere

❌

Not supported

HuggingFace

❌

Not supported

Outlines

❌

Not supported

### Usage

#### OpenAI Responses

With OpenAI, you need to first [upload files to a vector store](https://platform.openai.com/docs/assistants/tools/file-search), then reference the vector store IDs when using the `FileSearchTool`.

file\_search\_openai\_upload.py

```py
import asyncio

from pydantic_ai import Agent, FileSearchTool
from pydantic_ai.models.openai import OpenAIResponsesModel


async def main():
    model = OpenAIResponsesModel('gpt-5.2')

    with open('my_document.txt', 'rb') as f:
        file = await model.client.files.create(file=f, purpose='assistants')

    vector_store = await model.client.vector_stores.create(name='my-docs')
    await model.client.vector_stores.files.create(
        vector_store_id=vector_store.id,
        file_id=file.id
    )

    agent = Agent(
        model,
        builtin_tools=[FileSearchTool(file_store_ids=[vector_store.id])]
    )

    result = await agent.run('What information is in my documents about pydantic?')
    print(result.output)
    #> Based on your documents, Pydantic is a data validation library for Python...

asyncio.run(main())
```

#### Google (Gemini)

With Gemini, you need to first [create a file search store via the Files API](https://ai.google.dev/gemini-api/docs/files), then reference the file search store names.

file\_search\_google\_upload.py

```py
import asyncio

from pydantic_ai import Agent, FileSearchTool
from pydantic_ai.models.google import GoogleModel


async def main():
    model = GoogleModel('gemini-3-flash-preview')

    store = await model.client.aio.file_search_stores.create(
        config={'display_name': 'my-docs'}
    )

    with open('my_document.txt', 'rb') as f:
        await model.client.aio.file_search_stores.upload_to_file_search_store(
            file_search_store_name=store.name,
            file=f,
            config={'mime_type': 'text/plain'}
        )

    agent = Agent(
        model,
        builtin_tools=[FileSearchTool(file_store_ids=[store.name])]
    )

    result = await agent.run('Summarize the key points from my uploaded documents.')
    print(result.output)
    #> The documents discuss the following key points: ...

asyncio.run(main())
```

#### xAI

With xAI, `FileSearchTool` maps to the [collections search](https://docs.x.ai/developers/tools/collection-search) tool. Pass collection IDs as `file_store_ids`.

file\_search\_xai.py

```py
import asyncio

from pydantic_ai import Agent, FileSearchTool


async def main():
    agent = Agent(
        'xai:grok-4-1-fast',
        builtin_tools=[FileSearchTool(file_store_ids=['collection_abc123'])]
    )

    result = await agent.run('What does the collection say about pydantic?')
    print(result.output)
    #> Based on the collection, Pydantic is ...

asyncio.run(main())
```

## API Reference

For complete API documentation, see the [API Reference](/docs/ai/api/pydantic-ai/builtin_tools).

---

# [Common Tools](https://pydantic.dev/docs/ai/tools-toolsets/common-tools/)

# Common Tools

Pydantic AI ships with native tools that can be used to enhance your agent's capabilities.

## DuckDuckGo Search Tool

The DuckDuckGo search tool allows you to search the web for information. It is built on top of the [DuckDuckGo API](https://github.com/deedy5/ddgs).

### Installation

To use [`duckduckgo_search_tool`](/docs/ai/api/pydantic-ai/common_tools/#pydantic_ai.common_tools.duckduckgo.duckduckgo_search_tool), you need to install [`pydantic-ai-slim`](/docs/ai/overview/install#slim-install) with the `duckduckgo` optional group:

-   [pip](#tab-panel-106)
-   [uv](#tab-panel-107)

Terminal

```bash
pip install "pydantic-ai-slim[duckduckgo]"
```

Terminal

```bash
uv add "pydantic-ai-slim[duckduckgo]"
```

### Usage

Here's an example of how you can use the DuckDuckGo search tool with an agent:

duckduckgo\_search.py

```python
from pydantic_ai import Agent
from pydantic_ai.common_tools.duckduckgo import duckduckgo_search_tool

agent = Agent(
    'openai:gpt-5.2',
    tools=[duckduckgo_search_tool()],
    instructions='Search DuckDuckGo for the given query and return the results.',
)

result = agent.run_sync(
    'Can you list the top five highest-grossing animated films of 2025?'
)
print(result.output)
"""
I looked into several sources on animated box‐office performance in 2025, and while detailed
rankings can shift as more money is tallied, multiple independent reports have already
highlighted a couple of record‐breaking shows. For example:

• Ne Zha 2 - News outlets (Variety, Wikipedia's "List of animated feature films of 2025", and others)
    have reported that this Chinese title not only became the highest‑grossing animated film of 2025
    but also broke records as the highest‑grossing non‑English animated film ever. One article noted
    its run exceeded US$1.7 billion.
• Inside Out 2 - According to data shared on Statista and in industry news, this Pixar sequel has been
    on pace to set new records (with some sources even noting it as the highest‑grossing animated film
    ever, as of January 2025).

Beyond those two, some entertainment trade sites (for example, a Just Jared article titled
"Top 10 Highest-Earning Animated Films at the Box Office Revealed") have begun listing a broader
top‑10. Although full consolidated figures can sometimes differ by source and are updated daily during
a box‑office run, many of the industry trackers have begun to single out five films as the biggest
earners so far in 2025.

Unfortunately, although multiple articles discuss the "top animated films" of 2025, there isn't yet a
single, universally accepted list with final numbers that names the complete top five. (Box‑office
rankings, especially mid‑year, can be fluid as films continue to add to their totals.)

Based on what several sources note so far, the two undisputed leaders are:
1. Ne Zha 2
2. Inside Out 2

The remaining top spots (3-5) are reported by some outlets in their "Top‑10 Animated Films"
lists for 2025 but the titles and order can vary depending on the source and the exact cut‑off
date of the data. For the most up‑to‑date and detailed ranking (including the 3rd, 4th, and 5th
highest‑grossing films), I recommend checking resources like:
• Wikipedia's "List of animated feature films of 2025" page
• Box‑office tracking sites (such as Box Office Mojo or The Numbers)
• Trade articles like the one on Just Jared

To summarize with what is clear from the current reporting:
1. Ne Zha 2
2. Inside Out 2
3-5. Other animated films (yet to be definitively finalized across all reporting outlets)

If you're looking for a final, consensus list of the top five, it may be best to wait until
the 2025 year‑end box‑office tallies are in or to consult a regularly updated entertainment industry source.

Would you like help finding a current source or additional details on where to look for the complete updated list?
"""
```

## Web Fetch Tool

The web fetch tool allows your agent to fetch the content of web pages and convert them to markdown. It uses [SSRF protection](https://owasp.org/www-community/attacks/Server_Side_Request_Forgery) to prevent server-side request forgery attacks.

### Installation

To use `web_fetch_tool`, you need to install [`pydantic-ai-slim`](/docs/ai/overview/install#slim-install) with the `web-fetch` optional group:

-   [pip](#tab-panel-108)
-   [uv](#tab-panel-109)

Terminal

```bash
pip install "pydantic-ai-slim[web-fetch]"
```

Terminal

```bash
uv add "pydantic-ai-slim[web-fetch]"
```

### Usage

Here's an example of how you can use the web fetch tool with an agent:

web\_fetch.py

```python
from pydantic_ai import Agent
from pydantic_ai.common_tools.web_fetch import web_fetch_tool

agent = Agent(
    'openai:gpt-5.2',
    tools=[web_fetch_tool()],
    instructions='Fetch web pages and summarize their content.',
)

result = agent.run_sync('What is on https://ai.pydantic.dev?')
print(result.output)
```

Automatic fallback via WebFetch capability

You don't need to use `web_fetch_tool` directly -- the [`WebFetch`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WebFetch) capability automatically uses it as a local fallback when the model doesn't support builtin URL fetching.

## Tavily Search Tool

Note

Tavily is a paid service, but they have free credits to explore their product.

You need to [sign up for an account](https://app.tavily.com/home) and get an API key to use the Tavily search tool.

The Tavily search tool allows you to search the web for information. It is built on top of the [Tavily API](https://tavily.com/).

### Installation

To use [`tavily_search_tool`](/docs/ai/api/pydantic-ai/common_tools/#pydantic_ai.common_tools.tavily.tavily_search_tool), you need to install [`pydantic-ai-slim`](/docs/ai/overview/install#slim-install) with the `tavily` optional group:

-   [pip](#tab-panel-110)
-   [uv](#tab-panel-111)

Terminal

```bash
pip install "pydantic-ai-slim[tavily]"
```

Terminal

```bash
uv add "pydantic-ai-slim[tavily]"
```

### Usage

Here's an example of how you can use the Tavily search tool with an agent:

tavily\_search.py

```python
import os

from pydantic_ai import Agent
from pydantic_ai.common_tools.tavily import tavily_search_tool

api_key = os.getenv('TAVILY_API_KEY')
assert api_key is not None

agent = Agent(
    'openai:gpt-5.2',
    tools=[tavily_search_tool(api_key)],
    instructions='Search Tavily for the given query and return the results.',
)

result = agent.run_sync('Tell me the top news in the GenAI world, give me links.')
print(result.output)
"""
Here are some of the top recent news articles related to GenAI:

1. How CLEAR users can improve risk analysis with GenAI - Thomson Reuters
   Read more: https://legal.thomsonreuters.com/blog/how-clear-users-can-improve-risk-analysis-with-genai/
   (This article discusses how CLEAR's new GenAI-powered tool streamlines risk analysis by quickly summarizing key information from various public data sources.)

2. TELUS Digital Survey Reveals Enterprise Employees Are Entering Sensitive Data Into AI Assistants More Than You Think - FT.com
   Read more: https://markets.ft.com/data/announce/detail?dockey=600-202502260645BIZWIRE_USPRX____20250226_BW490609-1
   (This news piece highlights findings from a TELUS Digital survey showing that many enterprise employees use public GenAI tools and sometimes even enter sensitive data.)

3. The Essential Guide to Generative AI - Virtualization Review
   Read more: https://virtualizationreview.com/Whitepapers/2025/02/SNOWFLAKE-The-Essential-Guide-to-Generative-AI.aspx
   (This guide provides insights into how GenAI is revolutionizing enterprise strategies and productivity, with input from industry leaders.)

Feel free to click on the links to dive deeper into each story!
"""
```

### Configuring Parameters

The `tavily_search_tool` factory accepts optional parameters that control search behavior. `max_results` is always developer-controlled and never appears in the LLM tool schema. Other parameters, when provided, are fixed for all searches and hidden from the LLM's tool schema. Parameters left unset remain available for the LLM to set per-call.

For example, you can lock in `max_results` and `include_domains` at tool creation time while still letting the LLM control `exclude_domains`:

tavily\_domain\_filtering.py

```python
import os

from pydantic_ai import Agent
from pydantic_ai.common_tools.tavily import tavily_search_tool

api_key = os.getenv('TAVILY_API_KEY')
assert api_key is not None

agent = Agent(
    'openai:gpt-5.2',
    tools=[tavily_search_tool(api_key, max_results=5, include_domains=['arxiv.org'])],
    instructions='Search for information and return the results.',
)

result = agent.run_sync(
    'Find recent papers about transformer architectures'
)
print(result.output)
"""
Here are some recent papers about transformer architectures from arxiv.org:

1. "Attention Is All You Need" - The foundational paper on the Transformer model.
2. "FlashAttention: Fast and Memory-Efficient Exact Attention" - Proposes an IO-aware attention algorithm.
"""
```

## Exa Search Tool

Note

Exa is a paid service with free credits to explore their product.

You need to [sign up for an account](https://dashboard.exa.ai) and get an API key to use the Exa tools.

Exa is a neural search engine that finds high-quality, relevant results across billions of web pages. It provides several tools including web search, finding similar pages, content retrieval, and AI-powered answers.

### Installation

To use Exa tools, you need to install [`pydantic-ai-slim`](/docs/ai/overview/install#slim-install) with the `exa` optional group:

-   [pip](#tab-panel-112)
-   [uv](#tab-panel-113)

Terminal

```bash
pip install "pydantic-ai-slim[exa]"
```

Terminal

```bash
uv add "pydantic-ai-slim[exa]"
```

### Usage

You can use Exa tools individually or as a toolset. The following tools are available:

-   [`exa_search_tool`](/docs/ai/api/pydantic-ai/common_tools/#pydantic_ai.common_tools.exa.exa_search_tool): Search the web with various search types (auto, keyword, neural, fast, deep)
-   [`exa_find_similar_tool`](/docs/ai/api/pydantic-ai/common_tools/#pydantic_ai.common_tools.exa.exa_find_similar_tool): Find pages similar to a given URL
-   [`exa_get_contents_tool`](/docs/ai/api/pydantic-ai/common_tools/#pydantic_ai.common_tools.exa.exa_get_contents_tool): Get full text content from URLs
-   [`exa_answer_tool`](/docs/ai/api/pydantic-ai/common_tools/#pydantic_ai.common_tools.exa.exa_answer_tool): Get AI-powered answers with citations

#### Using Individual Tools

exa\_search.py

```python
import os

from pydantic_ai import Agent
from pydantic_ai.common_tools.exa import exa_search_tool

api_key = os.getenv('EXA_API_KEY')
assert api_key is not None

agent = Agent(
    'openai:gpt-5.2',
    tools=[exa_search_tool(api_key, num_results=5, max_characters=1000)],
    system_prompt='Search the web for information using Exa.',
)

result = agent.run_sync('What are the latest developments in quantum computing?')
print(result.output)
```

#### Using ExaToolset

For better efficiency when using multiple Exa tools, use [`ExaToolset`](/docs/ai/api/pydantic-ai/common_tools/#pydantic_ai.common_tools.exa.ExaToolset) which shares a single API client across all tools. You can configure which tools to include:

exa\_toolset.py

```python
import os

from pydantic_ai import Agent
from pydantic_ai.common_tools.exa import ExaToolset

api_key = os.getenv('EXA_API_KEY')
assert api_key is not None

toolset = ExaToolset(
    api_key,
    num_results=5,
    max_characters=1000,  # Limit text content to control token usage
    include_search=True,  # Include the search tool (default: True)
    include_find_similar=True,  # Include the find_similar tool (default: True)
    include_get_contents=False,  # Exclude the get_contents tool
    include_answer=True,  # Include the answer tool (default: True)
)

agent = Agent(
    'openai:gpt-5.2',
    toolsets=[toolset],
    system_prompt='You have access to Exa search tools to find information on the web.',
)

result = agent.run_sync('Find recent AI research papers and summarize the key findings.')
print(result.output)
```

---

# [Deferred Tools](https://pydantic.dev/docs/ai/tools-toolsets/deferred-tools/)

# Deferred Tools

There are a few scenarios where the model should be able to call a tool that should not or cannot be executed during the same agent run inside the same Python process:

-   it may need to be approved by the user first
-   it may depend on an upstream service, frontend, or user to provide the result
-   the result could take longer to generate than it's reasonable to keep the agent process running

To support these use cases, Pydantic AI provides the concept of deferred tools, which come in two flavors documented below:

-   tools that [require approval](#human-in-the-loop-tool-approval)
-   tools that are [executed externally](#external-tool-execution)

When the model calls a deferred tool, there are two ways to resolve it:

-   **Resolve it inline**, using a [`HandleDeferredToolCalls`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.HandleDeferredToolCalls) [capability](/docs/ai/core-concepts/capabilities) with a handler that resolves some or all of the pending calls. The agent run continues in a single call without needing to end and restart -- use this when the resolver (e.g. an approval gate, an external service client) lives in the same process as the agent. See [Resolving deferred calls with a handler](#resolving-deferred-calls-with-a-handler).
-   **End the run** with a [`DeferredToolRequests`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolRequests) output object containing information about the deferred tool calls; the caller gathers approvals/results and then starts a new agent run with the original run's [message history](/docs/ai/core-concepts/message-history) plus a [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) object. Use this when the resolver lives outside the agent process -- e.g. a UI adapter that surfaces pending calls to a user and starts a follow-up run once it has their response.

The two flows compose: a handler can resolve a subset of calls and let the rest bubble up as `DeferredToolRequests` output for an outer caller to handle.

The stop-the-world flow requires `DeferredToolRequests` to be in the `Agent`'s [`output_type`](/docs/ai/core-concepts/output#structured-output) so that the possible types of the agent run output are correctly inferred. If your agent can also be used in a context where no deferred tools are available and you don't want to deal with that type everywhere you use the agent, you can instead pass the `output_type` argument when you run the agent using [`agent.run()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), [`agent.run_sync()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_sync), [`agent.run_stream()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream), or [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter). Note that the run-time `output_type` overrides the one specified at construction time (for type inference reasons), so you'll need to include the original output type explicitly.

## Resolving deferred calls with a handler

The recommended way to handle deferred tool calls is to register a [`HandleDeferredToolCalls`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.HandleDeferredToolCalls) [capability](/docs/ai/core-concepts/capabilities) whose handler receives the [`DeferredToolRequests`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolRequests) and returns a [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) resolving some or all of them. The tool execution pipeline applies the results inline and the agent run continues in a single call, as if the deferred tools had returned normally.

With a handler in place, `DeferredToolRequests` no longer needs to be declared as an output type -- unless you also want unresolved calls to bubble up to the caller (see below).

[`DeferredToolRequests.build_results()`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolRequests.build_results) is a convenience constructor -- it validates that every tool call ID refers to a pending request of the correct kind, and accepts `approve_all=True` to auto-approve any approval requests not otherwise specified.

deferred\_tool\_handler.py

```python
from pydantic_ai import (
  Agent,
  ApprovalRequired,
  CallDeferred,
  DeferredToolRequests,
  DeferredToolResults,
  RunContext,
  ToolDenied,
)
from pydantic_ai.capabilities import HandleDeferredToolCalls


async def handle_deferred(
  ctx: RunContext[None], requests: DeferredToolRequests
) -> DeferredToolResults:
  approvals: dict[str, bool | ToolDenied] = {}
  for call in requests.approvals:
      if call.tool_name == 'delete_file':
          approvals[call.tool_call_id] = ToolDenied('Deleting files is not allowed')
      else:
          approvals[call.tool_call_id] = True

  calls = {call.tool_call_id: f'(external result for {call.tool_name})' for call in requests.calls}

  return requests.build_results(approvals=approvals, calls=calls)


agent = Agent(
  'openai:gpt-5.2',
  capabilities=[HandleDeferredToolCalls(handler=handle_deferred)],
)


@agent.tool_plain(requires_approval=True)
def delete_file(path: str) -> str:
  return f'File {path!r} deleted'  # (1)


@agent.tool
def update_file(ctx: RunContext, path: str) -> str:
  if path == '.env' and not ctx.tool_call_approved:
      raise ApprovalRequired
  return f'File {path!r} updated'


@agent.tool_plain
async def send_to_worker(task: str) -> str:
  raise CallDeferred  # (2)
```

Never reached here -- the handler denies this call, so the model sees the denial message instead.

The handler supplies the result for this external call, so the tool body just signals the deferral.

If the handler declines to resolve some or all of the calls (by omitting them from the returned [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) or returning `None`), the next [`HandleDeferredToolCalls`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.HandleDeferredToolCalls) (or any other capability that overrides the [`handle_deferred_tool_calls`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.handle_deferred_tool_calls) hook) gets a chance, and any still-unresolved calls bubble up as a [`DeferredToolRequests`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolRequests) output. To allow that bubble-up, include `DeferredToolRequests` in the agent's `output_type` -- so you can combine inline handling with the stop-the-world flow when it makes sense.

If you're [building a custom capability](/docs/ai/core-concepts/capabilities#building-custom-capabilities) that needs to resolve approvals or external calls itself (e.g. a sandbox that exposes deferred tools), override the [`handle_deferred_tool_calls`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.handle_deferred_tool_calls) hook directly on your capability instead of registering a separate `HandleDeferredToolCalls`. The same hook is also available via the [`Hooks`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.Hooks) capability -- see [Hooks](/docs/ai/core-concepts/hooks#deferred-tool-call-hook).

The sections below describe the two kinds of deferred tools the handler can resolve, as well as the alternative stop-the-world flow for each. See [Capabilities](/docs/ai/core-concepts/capabilities) for how multiple capabilities compose, including [`WrapperCapability`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.WrapperCapability) and the `capabilities=[...]` list.

## Human-in-the-Loop Tool Approval

If a tool function always requires approval, you can pass the `requires_approval=True` argument to the [`@agent.tool`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool) decorator, [`@agent.tool_plain`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool_plain) decorator, [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool) class, [`FunctionToolset.tool`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset.tool) decorator, or [`FunctionToolset.add_function()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset.add_function) method. Inside the function, you can then assume that the tool call has been approved.

If whether a tool function requires approval depends on the tool call arguments or the agent [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) (e.g. [dependencies](/docs/ai/core-concepts/dependencies) or message history), you can raise the [`ApprovalRequired`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ApprovalRequired) exception from the tool function. The [`RunContext.tool_call_approved`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.tool_call_approved) property will be `True` if the tool call has already been approved.

To require approval for calls to tools provided by a [toolset](/docs/ai/tools-toolsets/toolsets) (like an [MCP server](/docs/ai/mcp/client)), see the [`ApprovalRequiredToolset` documentation](/docs/ai/tools-toolsets/toolsets#requiring-tool-approval).

When the model calls a tool that requires approval, the agent run will end with a [`DeferredToolRequests`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolRequests) output object with an `approvals` list holding [`ToolCallPart`s](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart) containing the tool name, validated arguments, and a unique tool call ID.

Once you've gathered the user's approvals or denials, you can build a [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) object with an `approvals` dictionary that maps each tool call ID to a boolean, a [`ToolApproved`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolApproved) object (with optional `override_args`), or a [`ToolDenied`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDenied) object (with an optional custom `message` to provide to the model). You can also provide a `metadata` dictionary on `DeferredToolResults` that maps each tool call ID to a dictionary of metadata that will be available in the tool's [`RunContext.tool_call_metadata`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.tool_call_metadata) attribute. This `DeferredToolResults` object can then be provided to one of the agent run methods as `deferred_tool_results`, alongside the original run's [message history](/docs/ai/core-concepts/message-history).

Here's an example that shows how to require approval for all file deletions, and for updates of specific protected files:

tool\_requires\_approval.py

```python
from pydantic_ai import (
  Agent,
  ApprovalRequired,
  DeferredToolRequests,
  DeferredToolResults,
  RunContext,
  ToolDenied,
)

agent = Agent('openai:gpt-5.2', output_type=[str, DeferredToolRequests])

PROTECTED_FILES = {'.env'}


@agent.tool
def update_file(ctx: RunContext, path: str, content: str) -> str:
  if path in PROTECTED_FILES and not ctx.tool_call_approved:
      raise ApprovalRequired(metadata={'reason': 'protected'})  # (1)
  return f'File {path!r} updated: {content!r}'


@agent.tool_plain(requires_approval=True)
def delete_file(path: str) -> str:
  return f'File {path!r} deleted'


result = agent.run_sync('Delete `__init__.py`, write `Hello, world!` to `README.md`, and clear `.env`')
messages = result.all_messages()

assert isinstance(result.output, DeferredToolRequests)
requests = result.output
print(requests)
"""
DeferredToolRequests(
  calls=[],
  approvals=[
      ToolCallPart(
          tool_name='update_file',
          args={'path': '.env', 'content': ''},
          tool_call_id='update_file_dotenv',
      ),
      ToolCallPart(
          tool_name='delete_file',
          args={'path': '__init__.py'},
          tool_call_id='delete_file',
      ),
  ],
  metadata={'update_file_dotenv': {'reason': 'protected'}},
)
"""

results = DeferredToolResults()
for call in requests.approvals:
  result = False
  if call.tool_name == 'update_file':
      # Approve all updates
      result = True
  elif call.tool_name == 'delete_file':
      # deny all deletes
      result = ToolDenied('Deleting files is not allowed')

  results.approvals[call.tool_call_id] = result

result = agent.run_sync(
  'Now create a backup of README.md',  # (2)
  message_history=messages,
  deferred_tool_results=results,
)
print(result.output)
"""
Here's what I've done:
- Attempted to delete __init__.py, but deletion is not allowed.
- Updated README.md with: Hello, world!
- Cleared .env (set to empty).
- Created a backup at README.md.bak containing: Hello, world!

If you want a different backup name or format (e.g., timestamped like README_2025-11-24.bak), let me know.
"""
print(result.all_messages())
"""
[
  ModelRequest(
      parts=[
          UserPromptPart(
              content='Delete `__init__.py`, write `Hello, world!` to `README.md`, and clear `.env`',
              timestamp=datetime.datetime(...),
          )
      ],
      timestamp=datetime.datetime(...),
      run_id='...',
      conversation_id='...',
  ),
  ModelResponse(
      parts=[
          ToolCallPart(
              tool_name='delete_file',
              args={'path': '__init__.py'},
              tool_call_id='delete_file',
          ),
          ToolCallPart(
              tool_name='update_file',
              args={'path': 'README.md', 'content': 'Hello, world!'},
              tool_call_id='update_file_readme',
          ),
          ToolCallPart(
              tool_name='update_file',
              args={'path': '.env', 'content': ''},
              tool_call_id='update_file_dotenv',
          ),
      ],
      usage=RequestUsage(input_tokens=63, output_tokens=21),
      model_name='gpt-5.2',
      timestamp=datetime.datetime(...),
      run_id='...',
      conversation_id='...',
  ),
  ModelRequest(
      parts=[
          ToolReturnPart(
              tool_name='update_file',
              content="File 'README.md' updated: 'Hello, world!'",
              tool_call_id='update_file_readme',
              timestamp=datetime.datetime(...),
          )
      ],
      timestamp=datetime.datetime(...),
      run_id='...',
      conversation_id='...',
  ),
  ModelRequest(
      parts=[
          ToolReturnPart(
              tool_name='update_file',
              content="File '.env' updated: ''",
              tool_call_id='update_file_dotenv',
              timestamp=datetime.datetime(...),
          ),
          ToolReturnPart(
              tool_name='delete_file',
              content='Deleting files is not allowed',
              tool_call_id='delete_file',
              timestamp=datetime.datetime(...),
              outcome='denied',
          ),
          UserPromptPart(
              content='Now create a backup of README.md',
              timestamp=datetime.datetime(...),
          ),
      ],
      timestamp=datetime.datetime(...),
      run_id='...',
      conversation_id='...',
  ),
  ModelResponse(
      parts=[
          ToolCallPart(
              tool_name='update_file',
              args={'path': 'README.md.bak', 'content': 'Hello, world!'},
              tool_call_id='update_file_backup',
          )
      ],
      usage=RequestUsage(input_tokens=86, output_tokens=31),
      model_name='gpt-5.2',
      timestamp=datetime.datetime(...),
      run_id='...',
      conversation_id='...',
  ),
  ModelRequest(
      parts=[
          ToolReturnPart(
              tool_name='update_file',
              content="File 'README.md.bak' updated: 'Hello, world!'",
              tool_call_id='update_file_backup',
              timestamp=datetime.datetime(...),
          )
      ],
      timestamp=datetime.datetime(...),
      run_id='...',
      conversation_id='...',
  ),
  ModelResponse(
      parts=[
          TextPart(
              content="Here's what I've done:
- Attempted to delete __init__.py, but deletion is not allowed.
- Updated README.md with: Hello, world!
- Cleared .env (set to empty).
- Created a backup at README.md.bak containing: Hello, world!

If you want a different backup name or format (e.g., timestamped like README_2025-11-24.bak), let me know."
          )
      ],
      usage=RequestUsage(input_tokens=93, output_tokens=89),
      model_name='gpt-5.2',
      timestamp=datetime.datetime(...),
      run_id='...',
      conversation_id='...',
  ),
]
"""
```

The optional `metadata` parameter can attach arbitrary context to deferred tool calls, accessible in `DeferredToolRequests.metadata` keyed by `tool_call_id`.

This second agent run continues from where the first run left off, providing the tool approval results and optionally a new `user_prompt` to give the model additional instructions alongside the deferred results.

_(This example is complete, it can be run "as is")_

## External Tool Execution

When the result of a tool call cannot be generated inside the same agent run in which it was called, the tool is considered to be external. Examples of external tools are client-side tools implemented by a web or app frontend, and slow tasks that are passed off to a background worker or external service instead of keeping the agent process running.

If whether a tool call should be executed externally depends on the tool call arguments, the agent [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) (e.g. [dependencies](/docs/ai/core-concepts/dependencies) or message history), or how long the task is expected to take, you can define a tool function and conditionally raise the [`CallDeferred`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.CallDeferred) exception. Before raising the exception, the tool function would typically schedule some background task and pass along the [`RunContext.tool_call_id`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.tool_call_id) so that the result can be matched to the deferred tool call later.

If a tool is always executed externally and its definition is provided to your code along with a JSON schema for its arguments, you can use an [`ExternalToolset`](/docs/ai/tools-toolsets/toolsets#external-toolset). If the external tools are known up front and you don't have the arguments JSON schema handy, you can also define a tool function with the appropriate signature that does nothing but raise the [`CallDeferred`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.CallDeferred) exception.

When the model calls an external tool, the agent run will end with a [`DeferredToolRequests`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolRequests) output object with a `calls` list holding [`ToolCallPart`s](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart) containing the tool name, validated arguments, and a unique tool call ID.

Once the tool call results are ready, you can build a [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) object with a `calls` dictionary that maps each tool call ID to an arbitrary value to be returned to the model, a [`ToolReturn`](/docs/ai/tools-toolsets/tools-advanced#advanced-tool-returns) object, or a [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) exception in case the tool call failed and the model should [try again](/docs/ai/tools-toolsets/tools-advanced#tool-retries). This `DeferredToolResults` object can then be provided to one of the agent run methods as `deferred_tool_results`, alongside the original run's [message history](/docs/ai/core-concepts/message-history).

Here's an example that shows how to move a task that takes a while to complete to the background and return the result to the model once the task is complete:

external\_tool.py

```python
import asyncio
from dataclasses import dataclass
from typing import Any

from pydantic_ai import (
  Agent,
  CallDeferred,
  DeferredToolRequests,
  DeferredToolResults,
  ModelRetry,
  RunContext,
)


@dataclass
class TaskResult:
  task_id: str
  result: Any


async def calculate_answer_task(task_id: str, question: str) -> TaskResult:
  await asyncio.sleep(1)
  return TaskResult(task_id=task_id, result=42)


agent = Agent('openai:gpt-5.2', output_type=[str, DeferredToolRequests])

tasks: list[asyncio.Task[TaskResult]] = []


@agent.tool
async def calculate_answer(ctx: RunContext, question: str) -> str:
  task_id = f'task_{len(tasks)}'  # (1)
  task = asyncio.create_task(calculate_answer_task(task_id, question))
  tasks.append(task)

  raise CallDeferred(metadata={'task_id': task_id})  # (2)


async def main():
  result = await agent.run('Calculate the answer to the ultimate question of life, the universe, and everything')
  messages = result.all_messages()

  assert isinstance(result.output, DeferredToolRequests)
  requests = result.output
  print(requests)
  """
  DeferredToolRequests(
      calls=[
          ToolCallPart(
              tool_name='calculate_answer',
              args={
                  'question': 'the ultimate question of life, the universe, and everything'
              },
              tool_call_id='pyd_ai_tool_call_id',
          )
      ],
      approvals=[],
      metadata={'pyd_ai_tool_call_id': {'task_id': 'task_0'}},
  )
  """

  done, _ = await asyncio.wait(tasks)  # (3)
  task_results = [task.result() for task in done]
  task_results_by_task_id = {result.task_id: result.result for result in task_results}

  results = DeferredToolResults()
  for call in requests.calls:
      try:
          task_id = requests.metadata[call.tool_call_id]['task_id']
          result = task_results_by_task_id[task_id]
      except KeyError:
          result = ModelRetry('No result for this tool call was found.')

      results.calls[call.tool_call_id] = result

  result = await agent.run(message_history=messages, deferred_tool_results=results)
  print(result.output)
  #> The answer to the ultimate question of life, the universe, and everything is 42.
  print(result.all_messages())
  """
  [
      ModelRequest(
          parts=[
              UserPromptPart(
                  content='Calculate the answer to the ultimate question of life, the universe, and everything',
                  timestamp=datetime.datetime(...),
              )
          ],
          timestamp=datetime.datetime(...),
          run_id='...',
          conversation_id='...',
      ),
      ModelResponse(
          parts=[
              ToolCallPart(
                  tool_name='calculate_answer',
                  args={
                      'question': 'the ultimate question of life, the universe, and everything'
                  },
                  tool_call_id='pyd_ai_tool_call_id',
              )
          ],
          usage=RequestUsage(input_tokens=63, output_tokens=13),
          model_name='gpt-5.2',
          timestamp=datetime.datetime(...),
          run_id='...',
          conversation_id='...',
      ),
      ModelRequest(
          parts=[
              ToolReturnPart(
                  tool_name='calculate_answer',
                  content=42,
                  tool_call_id='pyd_ai_tool_call_id',
                  timestamp=datetime.datetime(...),
              )
          ],
          timestamp=datetime.datetime(...),
          run_id='...',
          conversation_id='...',
      ),
      ModelResponse(
          parts=[
              TextPart(
                  content='The answer to the ultimate question of life, the universe, and everything is 42.'
              )
          ],
          usage=RequestUsage(input_tokens=64, output_tokens=28),
          model_name='gpt-5.2',
          timestamp=datetime.datetime(...),
          run_id='...',
          conversation_id='...',
      ),
  ]
  """
```

Generate a task ID that can be tracked independently of the tool call ID.

The optional `metadata` parameter passes the `task_id` so it can be matched with results later, accessible in `DeferredToolRequests.metadata` keyed by `tool_call_id`.

In reality, this would typically happen in a separate process that polls for the task status or is notified when all pending tasks are complete.

_(This example is complete, it can be run "as is" -- you'll need to add `asyncio.run(main())` to run `main`)_

## See Also

-   [Function Tools](/docs/ai/tools-toolsets/tools) - Basic tool concepts and registration
-   [Advanced Tool Features](/docs/ai/tools-toolsets/tools-advanced) - Custom schemas, dynamic tools, and execution details
-   [Toolsets](/docs/ai/tools-toolsets/toolsets) - Managing collections of tools, including `ExternalToolset` for external tools
-   [Message History](/docs/ai/core-concepts/message-history) - Understanding how to work with message history for deferred tools

---

# [Third-Party Tools](https://pydantic.dev/docs/ai/tools-toolsets/third-party-tools/)

# Third-Party Tools

Pydantic AI supports integration with various third-party tool libraries, allowing you to leverage existing tool ecosystems in your agents. Third-party tools are also available as [capabilities](/docs/ai/core-concepts/capabilities#third-party-capabilities) -- see [Extensibility](/docs/ai/guides/extensibility) for the full ecosystem.

## MCP Tools

See the [MCP Client](/docs/ai/mcp/client) documentation for how to use MCP servers with Pydantic AI as [toolsets](/docs/ai/tools-toolsets/toolsets).

## LangChain Tools

If you'd like to use a tool from LangChain's [community tool library](https://python.langchain.com/docs/integrations/tools/) with Pydantic AI, you can use the [`tool_from_langchain`](/docs/ai/api/pydantic-ai/ext/#pydantic_ai.ext.langchain.tool_from_langchain) convenience method. Note that Pydantic AI will not validate the arguments in this case -- it's up to the model to provide arguments matching the schema specified by the LangChain tool, and up to the LangChain tool to raise an error if the arguments are invalid.

You will need to install the `langchain-community` package and any others required by the tool in question.

Here is how you can use the LangChain `DuckDuckGoSearchRun` tool, which requires the `ddgs` package:

```python
from langchain_community.tools import DuckDuckGoSearchRun

from pydantic_ai import Agent
from pydantic_ai.ext.langchain import tool_from_langchain

search = DuckDuckGoSearchRun()
search_tool = tool_from_langchain(search)

agent = Agent(
  'google-gla:gemini-3-flash-preview',
  tools=[search_tool],
)

result = agent.run_sync('What is the release date of Elden Ring Nightreign?')  # (1)
print(result.output)
#> Elden Ring Nightreign is planned to be released on May 30, 2025.
```

The release date of this game is the 30th of May 2025, which is after the knowledge cutoff for Gemini 2.0 (August 2024).

If you'd like to use multiple LangChain tools or a LangChain [toolkit](https://python.langchain.com/docs/concepts/tools/#toolkits), you can use the [`LangChainToolset`](/docs/ai/api/pydantic-ai/ext/#pydantic_ai.ext.langchain.LangChainToolset) [toolset](/docs/ai/tools-toolsets/toolsets) which takes a list of LangChain tools:

```python
from langchain_community.agent_toolkits import SlackToolkit

from pydantic_ai import Agent
from pydantic_ai.ext.langchain import LangChainToolset

toolkit = SlackToolkit()
toolset = LangChainToolset(toolkit.get_tools())

agent = Agent('openai:gpt-5.2', toolsets=[toolset])
# ...
```

## ACI.dev Tools

If you'd like to use a tool from the [ACI.dev tool library](https://www.aci.dev/tools) with Pydantic AI, you can use the [`tool_from_aci`](/docs/ai/api/pydantic-ai/ext/#pydantic_ai.ext.aci.tool_from_aci) convenience method. Note that Pydantic AI will not validate the arguments in this case -- it's up to the model to provide arguments matching the schema specified by the ACI tool, and up to the ACI tool to raise an error if the arguments are invalid.

You will need to install the `aci-sdk` package, set your ACI API key in the `ACI_API_KEY` environment variable, and pass your ACI "linked account owner ID" to the function.

Here is how you can use the ACI.dev `TAVILY__SEARCH` tool:

```python
import os

from pydantic_ai import Agent
from pydantic_ai.ext.aci import tool_from_aci

tavily_search = tool_from_aci(
  'TAVILY__SEARCH',
  linked_account_owner_id=os.getenv('LINKED_ACCOUNT_OWNER_ID'),
)

agent = Agent(
  'google-gla:gemini-3-flash-preview',
  tools=[tavily_search],
)

result = agent.run_sync('What is the release date of Elden Ring Nightreign?')  # (1)
print(result.output)
#> Elden Ring Nightreign is planned to be released on May 30, 2025.
```

The release date of this game is the 30th of May 2025, which is after the knowledge cutoff for Gemini 2.0 (August 2024).

If you'd like to use multiple ACI.dev tools, you can use the [`ACIToolset`](/docs/ai/api/pydantic-ai/ext/#pydantic_ai.ext.aci.ACIToolset) [toolset](/docs/ai/tools-toolsets/toolsets) which takes a list of ACI tool names as well as the `linked_account_owner_id`:

```python
import os

from pydantic_ai import Agent
from pydantic_ai.ext.aci import ACIToolset

toolset = ACIToolset(
    [
        'OPEN_WEATHER_MAP__CURRENT_WEATHER',
        'OPEN_WEATHER_MAP__FORECAST',
    ],
    linked_account_owner_id=os.getenv('LINKED_ACCOUNT_OWNER_ID'),
)

agent = Agent('openai:gpt-5.2', toolsets=[toolset])
```

## See Also

-   [Function Tools](/docs/ai/tools-toolsets/tools) - Basic tool concepts and registration
-   [Toolsets](/docs/ai/tools-toolsets/toolsets) - Managing collections of tools
-   [MCP Client](/docs/ai/mcp/client) - Using MCP servers with Pydantic AI
-   [LangChain Toolsets](/docs/ai/tools-toolsets/toolsets#langchain-tools) - Using LangChain toolsets
-   [ACI.dev Toolsets](/docs/ai/tools-toolsets/toolsets#aci-tools) - Using ACI.dev toolsets

---

# [Function Tools](https://pydantic.dev/docs/ai/tools-toolsets/tools/)

# Function Tools

Function tools provide a mechanism for models to perform actions and retrieve extra information to help them generate a response.

They're useful when you want to enable the model to take some action and use the result, when it is impractical or impossible to put all the context an agent might need into the instructions, or when you want to make agents' behavior more deterministic or reliable by deferring some of the logic required to generate a response to another (not necessarily AI-powered) tool.

If you want a model to be able to call a function as its final action, without the result being sent back to the model, you can use an [output function](/docs/ai/core-concepts/output#output-functions) instead.

There are a number of ways to register tools with an agent:

-   via the [`@agent.tool`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool) decorator -- for tools that need access to the agent [context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)
-   via the [`@agent.tool_plain`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool_plain) decorator -- for tools that do not need access to the agent [context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)
-   via the [`tools`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) keyword argument to `Agent` which can take either plain functions, or instances of [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)

For more advanced use cases, the [toolsets](/docs/ai/tools-toolsets/toolsets) feature lets you manage collections of tools (built by you or provided by an [MCP server](/docs/ai/mcp/client) or other [third party](/docs/ai/tools-toolsets/third-party-tools#third-party-tools)) and register them with an agent in one go via the [`toolsets`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) keyword argument to `Agent`. Internally, all `tools` and `toolsets` are gathered into a single [combined toolset](/docs/ai/tools-toolsets/toolsets#combining-toolsets) that's made available to the model.

Function tools vs. RAG

Function tools are basically the "R" of RAG (Retrieval-Augmented Generation) -- they augment what the model can do by letting it request extra information.

The main semantic difference between Pydantic AI Tools and RAG is RAG is synonymous with vector search, while Pydantic AI tools are more general-purpose. For vector search, you can use our [embeddings](/docs/ai/guides/embeddings) support to generate embeddings across multiple providers.

Function Tools vs. Structured Outputs

As the name suggests, function tools use the model's "tools" or "functions" API to let the model know what is available to call. Tools or functions are also used to define the schema(s) for [structured output](/docs/ai/core-concepts/output) when using the default [tool output mode](/docs/ai/core-concepts/output#tool-output), thus a model might have access to many tools, some of which call function tools while others end the run and produce a final output.

## Registering via Decorator

`@agent.tool` is considered the default decorator since in the majority of cases tools will need access to the agent [context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext).

Here's an example using both:

dice\_game.py

```python
import random

from pydantic_ai import Agent, RunContext

agent = Agent(
  'google-gla:gemini-3-flash-preview',  # (1)
  deps_type=str,  # (2)
  instructions=(
      "You're a dice game, you should roll the die and see if the number "
      "you get back matches the user's guess. If so, tell them they're a winner. "
      "Use the player's name in the response."
  ),
)


@agent.tool_plain  # (3)
def roll_dice() -> str:
  """Roll a six-sided die and return the result."""
  return str(random.randint(1, 6))


@agent.tool  # (4)
def get_player_name(ctx: RunContext[str]) -> str:
  """Get the player's name."""
  return ctx.deps


dice_result = agent.run_sync('My guess is 4', deps='Anne')  # (5)
print(dice_result.output)
#> Congratulations Anne, you guessed correctly! You're a winner!
```

This is a pretty simple task, so we can use the fast and cheap Gemini flash model.

We pass the user's name as the dependency, to keep things simple we use just the name as a string as the dependency.

This tool doesn't need any context, it just returns a random number. You could probably use dynamic instructions in this case.

This tool needs the player's name, so it uses `RunContext` to access dependencies which are just the player's name in this case.

Run the agent, passing the player's name as the dependency.

_(This example is complete, it can be run "as is")_

Let's print the messages from that game to see what happened:

dice\_game\_messages.py

```python
from dice_game import dice_result

print(dice_result.all_messages())
"""
[
    ModelRequest(
        parts=[
            UserPromptPart(
                content='My guess is 4',
                timestamp=datetime.datetime(...),
            )
        ],
        timestamp=datetime.datetime(...),
        instructions="You're a dice game, you should roll the die and see if the number you get back matches the user's guess. If so, tell them they're a winner. Use the player's name in the response.",
        run_id='...',
        conversation_id='...',
    ),
    ModelResponse(
        parts=[
            ToolCallPart(
                tool_name='roll_dice', args={}, tool_call_id='pyd_ai_tool_call_id'
            )
        ],
        usage=RequestUsage(input_tokens=54, output_tokens=2),
        model_name='gemini-3-flash-preview',
        timestamp=datetime.datetime(...),
        run_id='...',
        conversation_id='...',
    ),
    ModelRequest(
        parts=[
            ToolReturnPart(
                tool_name='roll_dice',
                content='4',
                tool_call_id='pyd_ai_tool_call_id',
                timestamp=datetime.datetime(...),
            )
        ],
        timestamp=datetime.datetime(...),
        instructions="You're a dice game, you should roll the die and see if the number you get back matches the user's guess. If so, tell them they're a winner. Use the player's name in the response.",
        run_id='...',
        conversation_id='...',
    ),
    ModelResponse(
        parts=[
            ToolCallPart(
                tool_name='get_player_name', args={}, tool_call_id='pyd_ai_tool_call_id'
            )
        ],
        usage=RequestUsage(input_tokens=55, output_tokens=4),
        model_name='gemini-3-flash-preview',
        timestamp=datetime.datetime(...),
        run_id='...',
        conversation_id='...',
    ),
    ModelRequest(
        parts=[
            ToolReturnPart(
                tool_name='get_player_name',
                content='Anne',
                tool_call_id='pyd_ai_tool_call_id',
                timestamp=datetime.datetime(...),
            )
        ],
        timestamp=datetime.datetime(...),
        instructions="You're a dice game, you should roll the die and see if the number you get back matches the user's guess. If so, tell them they're a winner. Use the player's name in the response.",
        run_id='...',
        conversation_id='...',
    ),
    ModelResponse(
        parts=[
            TextPart(
                content="Congratulations Anne, you guessed correctly! You're a winner!"
            )
        ],
        usage=RequestUsage(input_tokens=56, output_tokens=12),
        model_name='gemini-3-flash-preview',
        timestamp=datetime.datetime(...),
        run_id='...',
        conversation_id='...',
    ),
]
"""
```

We can represent this with a diagram:

```mermaid
sequenceDiagram
    participant Agent
    participant LLM

    Note over Agent: Send prompts
    Agent ->> LLM: System: "You're a dice game..."<br>User: "My guess is 4"
    activate LLM
    Note over LLM: LLM decides to use<br>a tool

    LLM ->> Agent: Call tool<br>roll_dice()
    deactivate LLM
    activate Agent
    Note over Agent: Rolls a six-sided die

    Agent -->> LLM: ToolReturn<br>"4"
    deactivate Agent
    activate LLM
    Note over LLM: LLM decides to use<br>another tool

    LLM ->> Agent: Call tool<br>get_player_name()
    deactivate LLM
    activate Agent
    Note over Agent: Retrieves player name
    Agent -->> LLM: ToolReturn<br>"Anne"
    deactivate Agent
    activate LLM
    Note over LLM: LLM constructs final response

    LLM ->> Agent: ModelResponse<br>"Congratulations Anne, ..."
    deactivate LLM
    Note over Agent: Game session complete
```

## Registering via Agent Argument

As well as using the decorators, we can register tools via the `tools` argument to the [`Agent` constructor](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__). This is useful when you want to reuse tools, and can also give more fine-grained control over the tools.

dice\_game\_tool\_kwarg.py

```python
import random

from pydantic_ai import Agent, RunContext, Tool

instructions = """You're a dice game, you should roll the die and see if the number
you get back matches the user's guess. If so, tell them they're a winner.
Use the player's name in the response.
"""


def roll_dice() -> str:
  """Roll a six-sided die and return the result."""
  return str(random.randint(1, 6))


def get_player_name(ctx: RunContext[str]) -> str:
  """Get the player's name."""
  return ctx.deps


agent_a = Agent(
  'google-gla:gemini-3-flash-preview',
  deps_type=str,
  tools=[roll_dice, get_player_name],  # (1)
  instructions=instructions,
)
agent_b = Agent(
  'google-gla:gemini-3-flash-preview',
  deps_type=str,
  tools=[  # (2)
      Tool(roll_dice, takes_ctx=False),
      Tool(get_player_name, takes_ctx=True),
  ],
  instructions=instructions,
)

dice_result = {}
dice_result['a'] = agent_a.run_sync('My guess is 6', deps='Yashar')
dice_result['b'] = agent_b.run_sync('My guess is 4', deps='Anne')
print(dice_result['a'].output)
#> Tough luck, Yashar, you rolled a 4. Better luck next time.
print(dice_result['b'].output)
#> Congratulations Anne, you guessed correctly! You're a winner!
```

The simplest way to register tools via the `Agent` constructor is to pass a list of functions, the function signature is inspected to determine if the tool takes [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext).

`agent_a` and `agent_b` are identical -- but we can use [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool) to reuse tool definitions and give more fine-grained control over how tools are defined, e.g. setting their name or description, or using a custom [`prepare`](/docs/ai/tools-toolsets/tools-advanced#tool-prepare) method.

_(This example is complete, it can be run "as is")_

## Tool Output

Tools can return anything that Pydantic can serialize to JSON. For advanced output options including multi-modal content and metadata, see [Advanced Tool Features](/docs/ai/tools-toolsets/tools-advanced#function-tool-output).

## Tool Schema

Function parameters are extracted from the function signature, and all parameters except `RunContext` are used to build the schema for that tool call.

Even better, Pydantic AI extracts the docstring from functions and (thanks to [griffe](https://mkdocstrings.github.io/griffe/)) extracts parameter descriptions from the docstring and adds them to the schema.

[Griffe supports](https://mkdocstrings.github.io/griffe/reference/docstrings/#docstrings) extracting parameter descriptions from `google`, `numpy`, and `sphinx` style docstrings. Pydantic AI will infer the format to use based on the docstring, but you can explicitly set it using [`docstring_format`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DocstringFormat). You can also enforce parameter requirements by setting `require_parameter_descriptions=True`. This will raise a [`UserError`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UserError) if a parameter description is missing.

To demonstrate a tool's schema, here we use [`FunctionModel`](/docs/ai/api/models/function/#pydantic_ai.models.function.FunctionModel) to print the schema a model would receive:

tool\_schema.py

```python
from pydantic_ai import Agent, ModelMessage, ModelResponse, TextPart
from pydantic_ai.models.function import AgentInfo, FunctionModel

agent = Agent()


@agent.tool_plain(docstring_format='google', require_parameter_descriptions=True)
def foobar(a: int, b: str, c: dict[str, list[float]]) -> str:
    """Get me foobar.

    Args:
        a: apple pie
        b: banana cake
        c: carrot smoothie
    """
    return f'{a} {b} {c}'


def print_schema(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse:
    tool = info.function_tools[0]
    print(tool.description)
    #> Get me foobar.
    print(tool.parameters_json_schema)
    """
    {
        'additionalProperties': False,
        'properties': {
            'a': {'description': 'apple pie', 'type': 'integer'},
            'b': {'description': 'banana cake', 'type': 'string'},
            'c': {
                'additionalProperties': {'items': {'type': 'number'}, 'type': 'array'},
                'description': 'carrot smoothie',
                'type': 'object',
            },
        },
        'required': ['a', 'b', 'c'],
        'type': 'object',
    }
    """
    return ModelResponse(parts=[TextPart('foobar')])


agent.run_sync('hello', model=FunctionModel(print_schema))
```

_(This example is complete, it can be run "as is")_

If a tool has a single parameter that can be represented as an object in JSON schema (e.g. dataclass, TypedDict, pydantic model), the schema for the tool is simplified to be just that object.

Here's an example where we use [`TestModel.last_model_request_parameters`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel.last_model_request_parameters) to inspect the tool schema that would be passed to the model.

single\_parameter\_tool.py

```python
from pydantic import BaseModel

from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

agent = Agent()


class Foobar(BaseModel):
    """This is a Foobar"""

    x: int
    y: str
    z: float = 3.14


@agent.tool_plain
def foobar(f: Foobar) -> str:
    return str(f)


test_model = TestModel()
result = agent.run_sync('hello', model=test_model)
print(result.output)
#> {"foobar":"x=0 y='a' z=3.14"}
print(test_model.last_model_request_parameters.function_tools)
"""
[
    ToolDefinition(
        name='foobar',
        parameters_json_schema={
            'properties': {
                'x': {'type': 'integer'},
                'y': {'type': 'string'},
                'z': {'default': 3.14, 'type': 'number'},
            },
            'required': ['x', 'y'],
            'title': 'Foobar',
            'type': 'object',
        },
        description='This is a Foobar',
    )
]
"""
```

_(This example is complete, it can be run "as is")_

Debugging Tool Calls

Understanding tool behavior is crucial for agent development. By instrumenting your agent with [Logfire](/docs/ai/integrations/logfire), you can see:

-   What arguments were passed to each tool
-   What each tool returned
-   How long each tool took to execute
-   Any errors that occurred

This visibility helps you understand why an agent made specific decisions and identify issues in tool implementations.

## See Also

For more tool features and integrations, see:

-   [Advanced Tool Features](/docs/ai/tools-toolsets/tools-advanced) - Custom schemas, dynamic tools, tool execution and retries
-   [Toolsets](/docs/ai/tools-toolsets/toolsets) - Managing collections of tools
-   [Builtin Tools](/docs/ai/tools-toolsets/builtin-tools) - Native tools provided by LLM providers
-   [Common Tools](/docs/ai/tools-toolsets/common-tools) - Ready-to-use tool implementations
-   [Third-Party Tools](/docs/ai/tools-toolsets/third-party-tools) - Integrations with MCP, LangChain, ACI.dev and other tool libraries
-   [Deferred Tools](/docs/ai/tools-toolsets/deferred-tools) - Tools requiring approval or external execution

---

# [Advanced Tool Features](https://pydantic.dev/docs/ai/tools-toolsets/tools-advanced/)

# Advanced Tool Features

This page covers advanced features for function tools in Pydantic AI. For basic tool usage, see the [Function Tools](/docs/ai/tools-toolsets/tools) documentation.

## Tool Output

Tools can return anything that Pydantic can serialize to JSON, as well as audio, video, image or document content depending on the types of [multi-modal input](/docs/ai/advanced-features/input) the model supports:

function\_tool\_output.py

```python
from datetime import datetime

from pydantic import BaseModel

from pydantic_ai import Agent, DocumentUrl, ImageUrl
from pydantic_ai.models.openai import OpenAIResponsesModel


class User(BaseModel):
    name: str
    age: int


agent = Agent(model=OpenAIResponsesModel('gpt-5.2'))


@agent.tool_plain
def get_current_time() -> datetime:
    return datetime.now()


@agent.tool_plain
def get_user() -> User:
    return User(name='John', age=30)


@agent.tool_plain
def get_company_logo() -> ImageUrl:
    return ImageUrl(url='https://iili.io/3Hs4FMg.png')


@agent.tool_plain
def get_document() -> DocumentUrl:
    return DocumentUrl(url='https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf')


result = agent.run_sync('What time is it?')
print(result.output)
#> The current time is 10:45 PM on April 17, 2025.

result = agent.run_sync('What is the user name?')
print(result.output)
#> The user's name is John.

result = agent.run_sync('What is the company name in the logo?')
print(result.output)
#> The company name in the logo is "Pydantic."

result = agent.run_sync('What is the main content of the document?')
print(result.output)
#> The document contains just the text "Dummy PDF file."
```

_(This example is complete, it can be run "as is")_

Some models (e.g. Gemini) natively support semi-structured return values, while some expect text (OpenAI) but seem to be just as good at extracting meaning from the data. If a Python object is returned and the model expects a string, the value will be serialized to JSON.

### Advanced Tool Returns

For scenarios where you need more control over both the tool's return value and the content sent to the model, you can use [`ToolReturn`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolReturn). This is particularly useful when you want to:

-   Separate the structured return value from additional content sent to the model
-   Explicitly send content as a separate user message (rather than in the tool result)
-   Include additional metadata that shouldn't be sent to the LLM

Here's an example of a computer automation tool that captures screenshots and provides visual feedback:

advanced\_tool\_return.py

```python
from pydantic_ai import Agent, BinaryContent, ToolReturn
from pydantic_ai.models.test import TestModel

agent = Agent(TestModel())

@agent.tool_plain
def click_and_capture(x: int, y: int) -> ToolReturn:
    """Click at coordinates and show before/after screenshots."""
    before_screenshot = BinaryContent(data=b'\x89PNG', media_type='image/png')
    # perform_click(x, y)
    after_screenshot = BinaryContent(data=b'\x89PNG', media_type='image/png')
    return ToolReturn(
        return_value=f'Successfully clicked at ({x}, {y})',
        content=[
            'Before:',
            before_screenshot,
            'After:',
            after_screenshot,
        ],
        metadata={
            'coordinates': {'x': x, 'y': y},
            'action_type': 'click_and_capture',
        },
    )

# The model receives the rich visual content for analysis
# while your application can access the structured return_value and metadata
result = agent.run_sync('Click on the submit button and tell me what happened')
print(result.output)
#> {"click_and_capture":"Successfully clicked at (0, 0)"}
```

-   **`return_value`**: The actual return value used in the tool response. This is what gets serialized and sent back to the model as the tool's result. Can include multimodal content directly (see [Tool Output](#function-tool-output) above).
-   **`content`**: Content sent as a **separate user message** after the tool result. Use this when you explicitly want content to appear outside the tool result, or when combining structured return values with rich content.
-   **`metadata`**: Optional metadata that your application can access but is not sent to the LLM. Useful for logging, debugging, or additional processing. Some other AI frameworks call this feature 'artifacts'.

This separation allows you to provide rich context to the model while maintaining clean, structured return values for your application logic. For multimodal content that should be sent natively in the tool result (when supported by the model), return it directly from the tool function or include it in `return_value` (see [Tool Output](#function-tool-output) above).

## Custom Tool Schema

If you have a function that lacks appropriate documentation (i.e. poorly named, no type information, poor docstring, use of \*args or \*\*kwargs and suchlike) then you can still turn it into a tool that can be effectively used by the agent with the [`Tool.from_schema`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool.from_schema) function. With this you provide the name, description, JSON schema, and whether the function takes a `RunContext` for the function directly:

```python
from pydantic_ai import Agent, Tool
from pydantic_ai.models.test import TestModel


def foobar(**kwargs) -> str:
    return kwargs['a'] + kwargs['b']

tool = Tool.from_schema(
    function=foobar,
    name='sum',
    description='Sum two numbers.',
    json_schema={
        'additionalProperties': False,
        'properties': {
            'a': {'description': 'the first number', 'type': 'integer'},
            'b': {'description': 'the second number', 'type': 'integer'},
        },
        'required': ['a', 'b'],
        'type': 'object',
    },
    takes_ctx=False,
)

test_model = TestModel()
agent = Agent(test_model, tools=[tool])

result = agent.run_sync('testing...')
print(result.output)
#> {"sum":0}
```

Please note that validation of the tool arguments will not be performed, and this will pass all arguments as keyword arguments.

## Dynamic Tools

Tools can optionally be defined with another function: `prepare`, which is called at each step of a run to customize the definition of the tool passed to the model, or omit the tool completely from that step.

A `prepare` method can be registered via the `prepare` kwarg to any of the tool registration mechanisms:

-   [`@agent.tool`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool) decorator
-   [`@agent.tool_plain`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool_plain) decorator
-   [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool) dataclass

The `prepare` method, should be of type [`ToolPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolPrepareFunc), a function which takes [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and a pre-built [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition), and should either return that `ToolDefinition` with or without modifying it, return a new `ToolDefinition`, or return `None` to indicate this tools should not be registered for that step.

Here's a simple `prepare` method that only includes the tool if the value of the dependency is `42`.

As with the previous example, we use [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) to demonstrate the behavior without calling a real model.

tool\_only\_if\_42.py

```python

from pydantic_ai import Agent, RunContext, ToolDefinition

agent = Agent('test')


async def only_if_42(
    ctx: RunContext[int], tool_def: ToolDefinition
) -> ToolDefinition | None:
    if ctx.deps == 42:
        return tool_def


@agent.tool(prepare=only_if_42)
def hitchhiker(ctx: RunContext[int], answer: str) -> str:
    return f'{ctx.deps} {answer}'


result = agent.run_sync('testing...', deps=41)
print(result.output)
#> success (no tool calls)
result = agent.run_sync('testing...', deps=42)
print(result.output)
#> {"hitchhiker":"42 a"}
```

_(This example is complete, it can be run "as is")_

Here's a more complex example where we change the description of the `name` parameter to based on the value of `deps`

For the sake of variation, we create this tool using the [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool) dataclass.

customize\_name.py

```python
from __future__ import annotations

from typing import Literal

from pydantic_ai import Agent, RunContext, Tool, ToolDefinition
from pydantic_ai.models.test import TestModel


def greet(name: str) -> str:
    return f'hello {name}'


async def prepare_greet(
    ctx: RunContext[Literal['human', 'machine']], tool_def: ToolDefinition
) -> ToolDefinition | None:
    d = f'Name of the {ctx.deps} to greet.'
    tool_def.parameters_json_schema['properties']['name']['description'] = d
    return tool_def


greet_tool = Tool(greet, prepare=prepare_greet)
test_model = TestModel()
agent = Agent(test_model, tools=[greet_tool], deps_type=Literal['human', 'machine'])

result = agent.run_sync('testing...', deps='human')
print(result.output)
#> {"greet":"hello a"}
print(test_model.last_model_request_parameters.function_tools)
"""
[
    ToolDefinition(
        name='greet',
        parameters_json_schema={
            'additionalProperties': False,
            'properties': {
                'name': {'type': 'string', 'description': 'Name of the human to greet.'}
            },
            'required': ['name'],
            'type': 'object',
        },
    )
]
"""
```

_(This example is complete, it can be run "as is")_

### Agent-wide Dynamic Tools

In addition to per-tool `prepare` methods, you can also define an agent-wide `prepare_tools` function. This function is called at each step of a run and allows you to filter or modify the list of all tool definitions available to the agent for that step. This is especially useful if you want to enable or disable multiple tools at once, or apply global logic based on the current context.

The `prepare_tools` function should be of type [`ToolsPrepareFunc`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolsPrepareFunc), which takes the [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and a list of [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition), and returns a new list of tool definitions (or `None` to disable all tools for that step).

Note

The list of tool definitions passed to `prepare_tools` includes both regular function tools and tools from any [toolsets](/docs/ai/tools-toolsets/toolsets) registered on the agent, but not [output tools](/docs/ai/core-concepts/output#tool-output).

To modify output tools, you can set a `prepare_output_tools` function instead.

Here's an example that makes all tools strict if the model is an OpenAI model:

agent\_prepare\_tools\_customize.py

```python
from dataclasses import replace

from pydantic_ai import Agent, RunContext, ToolDefinition
from pydantic_ai.models.test import TestModel


async def turn_on_strict_if_openai(
    ctx: RunContext[None], tool_defs: list[ToolDefinition]
) -> list[ToolDefinition] | None:
    if ctx.model.system == 'openai':
        return [replace(tool_def, strict=True) for tool_def in tool_defs]
    return tool_defs


test_model = TestModel()
agent = Agent(test_model, prepare_tools=turn_on_strict_if_openai)


@agent.tool_plain
def echo(message: str) -> str:
    return message


agent.run_sync('testing...')
assert test_model.last_model_request_parameters.function_tools[0].strict is None

# Set the system attribute of the test_model to 'openai'
test_model._system = 'openai'

agent.run_sync('testing with openai...')
assert test_model.last_model_request_parameters.function_tools[0].strict
```

_(This example is complete, it can be run "as is")_

Here's another example that conditionally filters out the tools by name if the dependency (`ctx.deps`) is `True`:

agent\_prepare\_tools\_filter\_out.py

```python

from pydantic_ai import Agent, RunContext, Tool, ToolDefinition


def launch_potato(target: str) -> str:
    return f'Potato launched at {target}!'


async def filter_out_tools_by_name(
    ctx: RunContext[bool], tool_defs: list[ToolDefinition]
) -> list[ToolDefinition] | None:
    if ctx.deps:
        return [tool_def for tool_def in tool_defs if tool_def.name != 'launch_potato']
    return tool_defs


agent = Agent(
    'test',
    tools=[Tool(launch_potato)],
    prepare_tools=filter_out_tools_by_name,
    deps_type=bool,
)

result = agent.run_sync('testing...', deps=False)
print(result.output)
#> {"launch_potato":"Potato launched at a!"}
result = agent.run_sync('testing...', deps=True)
print(result.output)
#> success (no tool calls)
```

_(This example is complete, it can be run "as is")_

You can use `prepare_tools` to:

-   Dynamically enable or disable tools based on the current model, dependencies, or other context
-   Modify tool definitions globally (e.g., set all tools to strict mode, change descriptions, etc.)

If both per-tool `prepare` and agent-wide `prepare_tools` are used, the per-tool `prepare` is applied first to each tool, and then `prepare_tools` is called with the resulting list of tool definitions.

## Tool Choice

The `tool_choice` setting in [`ModelSettings`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ModelSettings) controls which tools the model can use during a request. This is useful for disabling tools, forcing tool use, or restricting which tools are available.

Pydantic AI distinguishes between **[function tools](/docs/ai/tools-toolsets/tools)** (tools you register via `@agent.tool`, [toolsets](/docs/ai/tools-toolsets/toolsets), or [MCP](/docs/ai/mcp/client)), and **output tools** (internal tools used for [structured output](/docs/ai/core-concepts/output#tool-output)).

### Options

Value

Description

`'auto'` (default)

Model decides whether to use tools. All tools available.

`'none'`

Disable function tools. Model can respond with text or use output tools.

`'required'`

Force the model to use a function tool. Excludes output tools, so set dynamically via a [capability](#dynamic-tool-choice-via-capabilities) or use [direct model requests](/docs/ai/core-concepts/direct); raises an error when set statically in `agent.run()`.

`['tool_a', ...]`

Restrict to specific tools by name. Excludes output tools -- same dynamic/direct requirement as `'required'`.

[`ToolOrOutput`](/docs/ai/api/pydantic-ai/settings/#pydantic_ai.settings.ToolOrOutput)`(function_tools=['...'])`

Restrict function tools while auto-including all output tools.

### Example

```python
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel
from pydantic_ai.settings import ToolOrOutput

agent = Agent(TestModel())


@agent.tool_plain
def get_weather(city: str) -> str:
    return f'Sunny in {city}'


@agent.tool_plain
def get_time(city: str) -> str:
    return f'12:00 in {city}'


# Pass tool_choice via model_settings
result = agent.run_sync('Hello', model_settings={'tool_choice': 'none'})

# Use ToolOrOutput to restrict to specific function tools while allowing output
result = agent.run_sync(
    'Hello', model_settings={'tool_choice': ToolOrOutput(function_tools=['get_weather'])}
)
```

### Dynamic tool choice via capabilities

`tool_choice='required'` and `['tool_a', ...]` exclude output tools, so setting either one _statically_ would force a tool call on every step and leave the agent unable to produce a final response. `agent.run()` raises a `UserError` when it detects these values on the static baseline (the `model_settings` argument of `Agent.run`, the agent's own `model_settings`, or the underlying model's defaults).

To vary `tool_choice` _per step_ -- for example, to force a specific tool on the first step and then let the model decide -- return a callable from a capability's [`get_model_settings`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.AbstractCapability.get_model_settings). The callable receives a [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) with full access to `ctx.messages` and `ctx.run_step`, so it can inspect what has already happened in the run and adapt.

force\_first\_call.py

```python
from pydantic_ai import Agent, ModelSettings, RunContext
from pydantic_ai.capabilities import AbstractCapability
from pydantic_ai.messages import ModelRequest, ToolReturnPart


class RequireFirstCall(AbstractCapability[None]):
    """Force `tool_name` to be called successfully before anything else."""

    def __init__(self, tool_name: str) -> None:
        self.tool_name = tool_name

    def get_model_settings(self):
        def settings(ctx: RunContext[None]) -> ModelSettings:
            called = any(
                isinstance(part, ToolReturnPart) and part.tool_name == self.tool_name
                for message in ctx.messages
                if isinstance(message, ModelRequest)
                for part in message.parts
            )
            if called:
                return ModelSettings()
            return ModelSettings(tool_choice=[self.tool_name])

        return settings


agent = Agent('openai:gpt-5.2', capabilities=[RequireFirstCall('get_weather')])


@agent.tool_plain
def get_weather(city: str) -> str:
    return f'Sunny in {city}'
```

Because capability-supplied settings are resolved per step, the callable's returned `tool_choice` is trusted to change across steps and is not rejected by the baseline validator. For a single model request without an agent loop, use [`pydantic_ai.direct.model_request`](/docs/ai/api/pydantic-ai/direct/#pydantic_ai.direct.model_request) instead.

### Provider Support

All providers support `'auto'` and `'none'`. Key differences for other options:

Provider

`'required'`

Specific tools

Notes

OpenAI

✓

✓

Full support

Anthropic

⚠️

⚠️

Not supported with thinking enabled

Google

✓

✓

Bedrock

✓

Single only

Multiple tools fall back to 'any' mode

Groq/HuggingFace

✓

Single only

Multiple tools fall back to 'required' mode

Mistral

✓

✓

Maps `'required'` to `'any'` mode

xAI

✓

✓

Some models may not support forcing; falls back to 'auto'

### Prompt caching implications

Restricting the available tool set via `tool_choice` can invalidate provider prompt caches because most provider APIs cache on the full tools array. Pydantic AI restricts the tool set in two ways:

-   **API-level filtering** (cache-preserving): the full tools array is sent and the provider is told to only allow a subset. Used by OpenAI Responses (`allowed_tools`), Google (`allowed_function_names`), and Bedrock when forcing a single tool.
-   **Client-side filtering** (breaks cache): the tools array is trimmed before the request. Used when the provider API has no native filter for the given case.

The table below covers the cases where Pydantic AI must filter client-side and therefore breaks cache:

Provider

Cache-breaking case

Anthropic

`tool_choice` is a list of multiple tools, OR a single tool with thinking enabled

OpenAI Chat

`tool_choice` is a list of multiple tools, OR a single tool on a model that doesn't support forcing

Bedrock

`tool_choice` is a list of multiple tools

Groq / HuggingFace

`tool_choice` is a list of multiple tools

Mistral

`tool_choice` is a list (any size) -- the API doesn't accept specific tool names

xAI

`tool_choice` is a list of multiple tools, OR a single tool on a model that doesn't support forcing

OpenAI Responses

Never -- `allowed_tools` handles all cases natively

Google

Never -- `allowed_function_names` handles all cases natively

If preserving cache hits matters, prefer providers/cases marked "Never", or use `ToolOrOutput` (which keeps the full set) instead of a restrictive list.

## Tool Execution and Retries

When a tool is executed, its arguments (provided by the LLM) are first validated against the function's signature using Pydantic (with optional [validation context](/docs/ai/core-concepts/output#validation-context)). If validation fails (e.g., due to incorrect types or missing required arguments), a `ValidationError` is raised, and the framework automatically generates a [`RetryPromptPart`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.RetryPromptPart) containing the validation details. This prompt is sent back to the LLM, informing it of the error and allowing it to correct the parameters and retry the tool call.

Beyond automatic validation errors, the tool's own internal logic can also explicitly request a retry by raising the [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) exception. This is useful for situations where the parameters were technically valid, but an issue occurred during execution (like a transient network error, or the tool determining the initial attempt needs modification).

```python
from pydantic_ai import ModelRetry


def my_flaky_tool(query: str) -> str:
    if query == 'bad':
        # Tell the LLM the query was bad and it should try again
        raise ModelRetry("The query 'bad' is not allowed. Please provide a different query.")
    # ... process query ...
    return 'Success!'
```

Raising `ModelRetry` also generates a `RetryPromptPart` containing the exception message, which is sent back to the LLM to guide its next attempt. Both `ValidationError` and `ModelRetry` respect the configured retry limit -- set per-tool via [`Tool(max_retries=N)`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool) (or `@agent.tool(retries=N)`), per-toolset via [`FunctionToolset(max_retries=N)`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset), or agent-wide via [`Agent(tool_retries=N)`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__), applied in that order of precedence.

Tool retries are tracked **per tool**: every function tool has its own counter, with no global 'tool call' budget shared across the run. When a tool raises `ModelRetry` or its arguments fail validation, only that tool's counter advances. Inside a tool function, [`ctx.max_retries`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.max_retries) reflects that tool's enforcement limit and [`ctx.retry`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext.retry) is that tool's own counter. When a tool exhausts its counter, the run raises [`UnexpectedModelBehavior`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.UnexpectedModelBehavior) with message `'Tool {name!r} exceeded max retries count of {N}'`. User-provided toolsets inherit `Agent(tool_retries=...)` as their default when no per-toolset value is set.

### Tool Timeout

You can set a timeout for tool execution to prevent tools from running indefinitely. If a tool exceeds its timeout, it is treated as a failure and a retry prompt is sent to the model (counting towards the retry limit).

```python
import asyncio

from pydantic_ai import Agent

# Set a default timeout for all tools on the agent
agent = Agent('test', tool_timeout=30)


@agent.tool_plain
async def slow_tool() -> str:
    """This tool will use the agent's default timeout (30 seconds)."""
    await asyncio.sleep(10)
    return 'Done'


@agent.tool_plain(timeout=5)
async def fast_tool() -> str:
    """This tool has its own timeout (5 seconds) that overrides the agent default."""
    await asyncio.sleep(1)
    return 'Done'
```

-   **Agent-level timeout**: Set `tool_timeout` on the [`Agent`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent) to apply a default timeout to all tools.
-   **Per-tool timeout**: Set `timeout` on individual tools via [`@agent.tool`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool), [`@agent.tool_plain`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool_plain), or the [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool) dataclass. This overrides the agent-level default.

When a timeout occurs, the tool is considered to have failed and the model receives a retry prompt with the message `"Timed out after {timeout} seconds."`. This counts towards the tool's retry limit just like validation errors or explicit [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) exceptions.

### Custom Args Validator

The `args_validator` parameter lets you define custom validation that runs after Pydantic schema validation but before the tool executes. This is useful for business logic validation, cross-field validation, or validating arguments before requesting [human approval](/docs/ai/tools-toolsets/deferred-tools) for deferred tools.

The validator receives [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) as its first argument, followed by the same parameters as the tool function. Return `None` on success, or raise [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) on failure.

args\_validator\_approval.py

```python
from pydantic_ai import Agent, DeferredToolRequests, ModelRetry, RunContext

agent = Agent('test', deps_type=int, output_type=[str, DeferredToolRequests])


def validate_sum_limit(ctx: RunContext[int], x: int, y: int) -> None:
    """Validate that the sum doesn't exceed the limit from deps."""
    if x + y > ctx.deps:
        raise ModelRetry(f'Sum of x and y must not exceed {ctx.deps}')


# Validation runs *before* approval is requested, so the model can
# fix bad args without bothering the user.
@agent.tool(requires_approval=True, args_validator=validate_sum_limit)
def add_numbers(ctx: RunContext[int], x: int, y: int) -> int:
    """Add two numbers (sum must not exceed the configured limit)."""
    return x + y


result = agent.run_sync('add 5 and 3', deps=100)
assert isinstance(result.output, DeferredToolRequests)
# The validated args are ready for the user to approve
print(result.output.approvals[0].args)
#> {'x': 0, 'y': 0}
```

_(This example is complete, it can be run "as is")_

When validation fails, the error message is sent back to the LLM as a retry prompt. This respects the `retries` setting on the tool. For [deferred tools](/docs/ai/tools-toolsets/deferred-tools), validation runs at deferral time -- only tool calls with valid arguments are deferred, while failed validation triggers a retry just like regular tools.

The `args_validator` parameter is available on [`@agent.tool`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool), [`@agent.tool_plain`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool_plain), [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool), [`Tool.from_schema`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool.from_schema), and [`FunctionToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset). Validators can be sync or async functions.

The validation result is exposed via the `args_valid` field on [`FunctionToolCallEvent`](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.FunctionToolCallEvent). This reflects all validation -- both schema validation and custom `args_validator` validation (if configured): `True` means all validation passed, `False` means validation failed, and `None` means validation was not performed (e.g. tool calls skipped due to the `'early'` end strategy, or deferred tool calls resolved without execution).

### Parallel tool calls & concurrency

When a model returns multiple tool calls in one response, Pydantic AI schedules them concurrently using `asyncio.create_task`. If a tool requires sequential/serial execution, you can pass the [`sequential`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition.sequential) flag when registering the tool, or wrap the agent run in the [`with agent.parallel_tool_call_execution_mode('sequential')`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.parallel_tool_call_execution_mode) context manager.

Async functions are run on the event loop, while sync functions are offloaded to threads. To get the best performance, _always_ use an async function _unless_ you're doing blocking I/O (and there's no way to use a non-blocking library instead) or CPU-bound work (like `numpy` or `scikit-learn` operations), so that simple functions are not offloaded to threads unnecessarily.

#### Thread executor for long-running servers

By default, sync functions are offloaded to threads using `anyio.to_thread.run_sync`, which creates ephemeral threads on demand. In long-running servers (e.g. FastAPI), these threads can accumulate under sustained traffic, leading to memory growth.

To control thread lifecycle, provide a bounded [`ThreadPoolExecutor`](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor) using the [`ThreadExecutor`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.ThreadExecutor) capability (per-agent) or the [`Agent.using_thread_executor()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.using_thread_executor) context manager (global):

```python
from concurrent.futures import ThreadPoolExecutor
from contextlib import asynccontextmanager

from pydantic_ai import Agent
from pydantic_ai.capabilities import ThreadExecutor

# Per-agent: pass as a capability
executor = ThreadPoolExecutor(max_workers=16, thread_name_prefix='agent-worker')
agent = Agent('openai:gpt-5.2', capabilities=[ThreadExecutor(executor)])

# Global: wrap your server lifespan
@asynccontextmanager
async def lifespan(app):
    executor = ThreadPoolExecutor(max_workers=16)
    with Agent.using_thread_executor(executor):
        yield
    executor.shutdown(wait=True)
```

Limiting tool executions

You can cap tool executions within a run using [`UsageLimits(tool_calls_limit=...)`](/docs/ai/core-concepts/agent#usage-limits). The counter increments only after a successful tool invocation. Output tools (used for [structured output](/docs/ai/core-concepts/output)) are not counted in the `tool_calls` metric.

#### Output Tool Calls

When a model calls an [output tool](/docs/ai/core-concepts/output#tool-output) in parallel with other tools, the agent's [`end_strategy`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.end_strategy) parameter controls how these tool calls are executed. The `'graceful'` strategy ensures all function tools are executed even after a final result is found, while skipping remaining output tools. The `'exhaustive'` strategy goes further and also executes all output tools. Both are useful when tools have side effects (like logging, sending notifications, or updating metrics) that should always execute.

For more information on how `end_strategy` works with both function tools and output tools, see the [Output Tool](/docs/ai/core-concepts/output#parallel-output-tool-calls) docs.

## Tool Search

Agents with many tools (e.g. [MCP servers](/docs/ai/mcp/client) exposing dozens of endpoints) can suffer from context bloat and degraded tool selection. Marking tools for deferred loading hides them from the model's initial context; a `search_tools` tool is automatically injected so the model can discover hidden tools by keyword when it needs them.

This is inspired by Anthropic's [Tool Search Tool](https://platform.claude.com/docs/en/agents-and-tools/tool-use/tool-search-tool#limits-and-best-practices) for managing large tool collections. Tool search is implemented on the Pydantic AI side and works with any model. Native provider support is planned in [#4167](https://github.com/pydantic/pydantic-ai/issues/4167).

For individual tools, set `defer_loading=True` on [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool), [`@agent.tool`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool), or [`@agent.tool_plain`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.tool_plain). For entire toolsets (including [MCP servers](/docs/ai/mcp/client) and [`FastMCPToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.fastmcp.FastMCPToolset)), use the [`.defer_loading()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.defer_loading) method -- pass a list of tool names to hide only specific tools, or `None` to hide all.

tool\_search.py

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')


@agent.tool_plain(defer_loading=True)
def mortgage_calculator(principal: float, rate: float, years: int) -> str:
    """Calculate monthly mortgage payment for a home loan."""
    monthly_rate = rate / 100 / 12
    n_payments = years * 12
    payment = principal * (monthly_rate * (1 + monthly_rate) ** n_payments) / ((1 + monthly_rate) ** n_payments - 1)
    return f'${payment:.2f}/month'
```

For MCP servers, use [`.defer_loading()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.defer_loading) to hide all tools behind search:

tool\_search\_mcp.py

```python
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerHTTP

mcp = MCPServerHTTP('http://localhost:8000/mcp')
agent = Agent('openai:gpt-5.2', toolsets=[mcp.defer_loading()])
```

Tool discovery and message history

Discovered tools are tracked via metadata in the [message history](/docs/ai/core-concepts/message-history). If a [history processor](/docs/ai/core-concepts/message-history#processing-message-history) truncates messages containing discovery metadata, previously discovered tools will require re-discovery.

See [`ToolDefinition.defer_loading`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition.defer_loading) and [Deferred Loading](/docs/ai/tools-toolsets/toolsets#deferred-loading) for more details.

## See Also

-   [Function Tools](/docs/ai/tools-toolsets/tools) - Basic tool concepts and registration
-   [Toolsets](/docs/ai/tools-toolsets/toolsets) - Managing collections of tools
-   [Deferred Tools](/docs/ai/tools-toolsets/deferred-tools) - Tools requiring approval or external execution
-   [Third-Party Tools](/docs/ai/tools-toolsets/third-party-tools) - Integrations with external tool libraries

---

# [Toolsets](https://pydantic.dev/docs/ai/tools-toolsets/toolsets/)

# Toolsets

A toolset represents a collection of [tools](/docs/ai/tools-toolsets/tools) that can be registered with an agent in one go. They can be reused by different agents, swapped out at runtime or during testing, and composed in order to dynamically filter which tools are available, modify tool definitions, or change tool execution behavior. A toolset can contain locally defined functions, depend on an external service to provide them, or implement custom logic to list available tools and handle them being called. Toolsets can also be provided via [capabilities](/docs/ai/core-concepts/capabilities), which bundle tools with hooks, instructions, and model settings.

Toolsets are used (among many other things) to define [MCP servers](/docs/ai/mcp/client) available to an agent. Pydantic AI includes many kinds of toolsets which are described below, and you can define a [custom toolset](#building-a-custom-toolset) by inheriting from the [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset) class.

The toolsets that will be available during an agent run can be specified in four different ways:

-   at agent construction time, via the [`toolsets`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__) keyword argument to `Agent`, which takes toolset instances as well as functions that generate toolsets [dynamically](#dynamically-building-a-toolset) based on the agent [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)
-   at agent run time, via the `toolsets` keyword argument to [`agent.run()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run), [`agent.run_sync()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_sync), [`agent.run_stream()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.AbstractAgent.run_stream), or [`agent.iter()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter). These toolsets will be additional to those registered on the `Agent`
-   [dynamically](#dynamically-building-a-toolset), via the [`@agent.toolset`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.toolset) decorator which lets you build a toolset based on the agent [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)
-   as a contextual override, via the `toolsets` keyword argument to the [`agent.override()`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.iter) context manager. These toolsets will replace those provided at agent construction or run time during the life of the context manager

toolsets.py

```python
from pydantic_ai import Agent, FunctionToolset
from pydantic_ai.models.test import TestModel


def agent_tool():
  return "I'm registered directly on the agent"


def extra_tool():
  return "I'm passed as an extra tool for a specific run"


def override_tool():
  return 'I override all other tools'


agent_toolset = FunctionToolset(tools=[agent_tool]) # (1)
extra_toolset = FunctionToolset(tools=[extra_tool])
override_toolset = FunctionToolset(tools=[override_tool])

test_model = TestModel() # (2)
agent = Agent(test_model, toolsets=[agent_toolset])

result = agent.run_sync('What tools are available?')
print([t.name for t in test_model.last_model_request_parameters.function_tools])
#> ['agent_tool']

result = agent.run_sync('What tools are available?', toolsets=[extra_toolset])
print([t.name for t in test_model.last_model_request_parameters.function_tools])
#> ['agent_tool', 'extra_tool']

with agent.override(toolsets=[override_toolset]):
  result = agent.run_sync('What tools are available?', toolsets=[extra_toolset]) # (3)
  print([t.name for t in test_model.last_model_request_parameters.function_tools])
  #> ['override_tool']
```

The [`FunctionToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset) will be explained in detail in the next section.

We're using [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) here because it makes it easy to see which tools were available on each run.

This `extra_toolset` will be ignored because we're inside an override context.

_(This example is complete, it can be run "as is")_

## Function Toolset

As the name suggests, a [`FunctionToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset) makes locally defined functions available as tools.

Functions can be added as tools in four different ways:

-   via the [`@toolset.tool`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset.tool) decorator -- for tools that need access to the agent [context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)
-   via the [`@toolset.tool_plain`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset.tool_plain) decorator -- for tools that do not need access to the agent [context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)
-   via the [`tools`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset.__init__) keyword argument to the constructor which can take either plain functions, or instances of [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool)
-   via the [`toolset.add_function()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset.add_function) and [`toolset.add_tool()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset.add_tool) methods which can take a plain function or an instance of [`Tool`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.Tool) respectively

The `add_function()` and `add_tool()` methods can also be used from a tool function to dynamically register new tools during a run to be available in future run steps.

function\_toolset.py

```python
from datetime import datetime

from pydantic_ai import Agent, FunctionToolset, RunContext
from pydantic_ai.models.test import TestModel


def temperature_celsius(city: str) -> float:
  return 21.0


def temperature_fahrenheit(city: str) -> float:
  return 69.8


weather_toolset = FunctionToolset(tools=[temperature_celsius, temperature_fahrenheit])


@weather_toolset.tool
def conditions(ctx: RunContext, city: str) -> str:
  if ctx.run_step % 2 == 0:
      return "It's sunny"
  else:
      return "It's raining"


datetime_toolset = FunctionToolset()
datetime_toolset.add_function(lambda: datetime.now(), name='now')

test_model = TestModel()  # (1)
agent = Agent(test_model)

result = agent.run_sync('What tools are available?', toolsets=[weather_toolset])
print([t.name for t in test_model.last_model_request_parameters.function_tools])
#> ['temperature_celsius', 'temperature_fahrenheit', 'conditions']

result = agent.run_sync('What tools are available?', toolsets=[datetime_toolset])
print([t.name for t in test_model.last_model_request_parameters.function_tools])
#> ['now']
```

We're using [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) here because it makes it easy to see which tools were available on each run.

_(This example is complete, it can be run "as is")_

### Toolset Instructions

A [`FunctionToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset) can provide instructions that are automatically included in the model request. This lets each toolset carry its own usage guidance alongside its tools, so you don't need to duplicate instructions on every agent that uses the toolset.

Instructions can be provided as strings, functions (sync or async, with or without [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext)), or a mix of both:

toolset\_instructions.py

```python
from pydantic_ai import Agent, FunctionToolset
from pydantic_ai.models.test import TestModel

search_toolset = FunctionToolset(
    instructions='Always use the search tool before answering factual questions.',
)


@search_toolset.tool_plain
def search(query: str) -> str:
    """Search for information."""
    return f'Results for: {query}'


test_model = TestModel()
agent = Agent(test_model, toolsets=[search_toolset])
result = agent.run_sync('What is the capital of France?')
print(result.all_messages()[0].instructions)
#> Always use the search tool before answering factual questions.
```

_(This example is complete, it can be run "as is")_

You can also use the [`@toolset.instructions`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset.instructions) decorator to register dynamic instruction functions that can access the run context:

toolset\_instructions\_decorator.py

```python
from pydantic_ai import Agent, FunctionToolset, RunContext
from pydantic_ai.models.test import TestModel

math_toolset = FunctionToolset[str]()


@math_toolset.instructions
def math_instructions(ctx: RunContext[str]) -> str:
    return f'You are helping: {ctx.deps}. Always show your work when using the calculator.'


@math_toolset.tool_plain
def calculator(expression: str) -> str:
    """Evaluate a math expression."""
    return '4'


test_model = TestModel()
agent = Agent(test_model, toolsets=[math_toolset], deps_type=str)
result = agent.run_sync('What is 2+2?', deps='Alice')
print(result.all_messages()[0].instructions)
#> You are helping: Alice. Always show your work when using the calculator.
```

_(This example is complete, it can be run "as is")_

When a toolset with instructions is used alongside agent-level [`instructions`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.__init__), the toolset instructions are appended after the agent instructions:

toolset\_instructions\_combined.py

```python
from pydantic_ai import Agent, FunctionToolset
from pydantic_ai.models.test import TestModel

toolset = FunctionToolset(instructions='Use the greeting tool for all greetings.')


@toolset.tool_plain
def greeting(name: str) -> str:
    """Greet someone."""
    return f'Hello, {name}!'


test_model = TestModel()
agent = Agent(
    test_model,
    instructions='You are a friendly assistant.',
    toolsets=[toolset],
)
result = agent.run_sync('Hi there!')
print(result.all_messages()[0].instructions)
"""
You are a friendly assistant.

Use the greeting tool for all greetings.
"""
```

_(This example is complete, it can be run "as is")_

When multiple toolsets with instructions are registered on an agent, all their instructions are combined:

toolset\_instructions\_multiple.py

```python
from pydantic_ai import Agent, FunctionToolset
from pydantic_ai.models.test import TestModel

weather_toolset = FunctionToolset(instructions='Use weather tools for forecasts.')


@weather_toolset.tool_plain
def forecast(city: str) -> str:
    """Get weather forecast."""
    return 'Sunny'


calendar_toolset = FunctionToolset(instructions='Use calendar tools for scheduling.')


@calendar_toolset.tool_plain
def schedule(event: str) -> str:
    """Schedule an event."""
    return 'Scheduled'


test_model = TestModel()
agent = Agent(test_model, toolsets=[weather_toolset, calendar_toolset])
result = agent.run_sync('Plan my day')
print(result.all_messages()[0].instructions)
"""
Use weather tools for forecasts.

Use calendar tools for scheduling.
"""
```

_(This example is complete, it can be run "as is")_

## Toolset Composition

Toolsets can be composed to dynamically filter which tools are available, modify tool definitions, or change tool execution behavior. Multiple toolsets can also be combined into one.

### Combining Toolsets

[`CombinedToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.CombinedToolset) takes a list of toolsets and lets them be used as one.

combined\_toolset.py

```python
from pydantic_ai import Agent, CombinedToolset
from pydantic_ai.models.test import TestModel

from function_toolset import datetime_toolset, weather_toolset

combined_toolset = CombinedToolset([weather_toolset, datetime_toolset])

test_model = TestModel() # (1)
agent = Agent(test_model, toolsets=[combined_toolset])
result = agent.run_sync('What tools are available?')
print([t.name for t in test_model.last_model_request_parameters.function_tools])
#> ['temperature_celsius', 'temperature_fahrenheit', 'conditions', 'now']
```

We're using [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) here because it makes it easy to see which tools were available on each run.

_(This example is complete, it can be run "as is")_

### Filtering Tools

[`FilteredToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FilteredToolset) wraps a toolset and filters available tools ahead of each step of the run based on a user-defined function that is passed the agent [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and each tool's [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) and returns a boolean to indicate whether or not a given tool should be available.

To easily chain different modifications, you can also call [`filtered()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.filtered) on any toolset instead of directly constructing a `FilteredToolset`.

filtered\_toolset.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

from combined_toolset import combined_toolset

filtered_toolset = combined_toolset.filtered(lambda ctx, tool_def: 'fahrenheit' not in tool_def.name)

test_model = TestModel() # (1)
agent = Agent(test_model, toolsets=[filtered_toolset])
result = agent.run_sync('What tools are available?')
print([t.name for t in test_model.last_model_request_parameters.function_tools])
#> ['weather_temperature_celsius', 'weather_conditions', 'datetime_now']
```

We're using [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) here because it makes it easy to see which tools were available on each run.

_(This example is complete, it can be run "as is")_

### Prefixing Tool Names

[`PrefixedToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.PrefixedToolset) wraps a toolset and adds a prefix to each tool name to prevent tool name conflicts between different toolsets.

To easily chain different modifications, you can also call [`prefixed()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.prefixed) on any toolset instead of directly constructing a `PrefixedToolset`.

combined\_toolset.py

```python
from pydantic_ai import Agent, CombinedToolset
from pydantic_ai.models.test import TestModel

from function_toolset import datetime_toolset, weather_toolset

combined_toolset = CombinedToolset(
  [
      weather_toolset.prefixed('weather'),
      datetime_toolset.prefixed('datetime')
  ]
)

test_model = TestModel() # (1)
agent = Agent(test_model, toolsets=[combined_toolset])
result = agent.run_sync('What tools are available?')
print([t.name for t in test_model.last_model_request_parameters.function_tools])
"""
[
  'weather_temperature_celsius',
  'weather_temperature_fahrenheit',
  'weather_conditions',
  'datetime_now',
]
"""
```

We're using [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) here because it makes it easy to see which tools were available on each run.

_(This example is complete, it can be run "as is")_

### Renaming Tools

[`RenamedToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.RenamedToolset) wraps a toolset and lets you rename tools using a dictionary mapping new names to original names. This is useful when the names provided by a toolset are ambiguous or would conflict with tools defined by other toolsets, but [prefixing them](#prefixing-tool-names) creates a name that is unnecessarily long or could be confusing to the model.

To easily chain different modifications, you can also call [`renamed()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.renamed) on any toolset instead of directly constructing a `RenamedToolset`.

renamed\_toolset.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

from combined_toolset import combined_toolset

renamed_toolset = combined_toolset.renamed(
  {
      'current_time': 'datetime_now',
      'temperature_celsius': 'weather_temperature_celsius',
      'temperature_fahrenheit': 'weather_temperature_fahrenheit'
  }
)

test_model = TestModel() # (1)
agent = Agent(test_model, toolsets=[renamed_toolset])
result = agent.run_sync('What tools are available?')
print([t.name for t in test_model.last_model_request_parameters.function_tools])
"""
['temperature_celsius', 'temperature_fahrenheit', 'weather_conditions', 'current_time']
"""
```

We're using [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) here because it makes it easy to see which tools were available on each run.

_(This example is complete, it can be run "as is")_

### Dynamic Tool Definitions

[`PreparedToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.PreparedToolset) lets you modify the entire list of available tools ahead of each step of the agent run using a user-defined function that takes the agent [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and a list of [`ToolDefinition`s](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) and returns a list of modified `ToolDefinition`s.

This is the toolset-specific equivalent of the [`prepare_tools`](/docs/ai/tools-toolsets/tools-advanced#prepare-tools) argument to `Agent` that prepares all tool definitions registered on an agent across toolsets.

Note that it is not possible to add or rename tools using `PreparedToolset`. Instead, you can use [`FunctionToolset.add_function()`](#function-toolset) or [`RenamedToolset`](#renaming-tools).

To easily chain different modifications, you can also call [`prepared()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.prepared) on any toolset instead of directly constructing a `PreparedToolset`.

prepared\_toolset.py

```python
from dataclasses import replace

from pydantic_ai import Agent, RunContext, ToolDefinition
from pydantic_ai.models.test import TestModel

from renamed_toolset import renamed_toolset

descriptions = {
  'temperature_celsius': 'Get the temperature in degrees Celsius',
  'temperature_fahrenheit': 'Get the temperature in degrees Fahrenheit',
  'weather_conditions': 'Get the current weather conditions',
  'current_time': 'Get the current time',
}

async def add_descriptions(ctx: RunContext, tool_defs: list[ToolDefinition]) -> list[ToolDefinition] | None:
  return [
      replace(tool_def, description=description)
      if (description := descriptions.get(tool_def.name, None))
      else tool_def
      for tool_def
      in tool_defs
  ]

prepared_toolset = renamed_toolset.prepared(add_descriptions)

test_model = TestModel() # (1)
agent = Agent(test_model, toolsets=[prepared_toolset])
result = agent.run_sync('What tools are available?')
print(test_model.last_model_request_parameters.function_tools)
"""
[
  ToolDefinition(
      name='temperature_celsius',
      parameters_json_schema={
          'additionalProperties': False,
          'properties': {'city': {'type': 'string'}},
          'required': ['city'],
          'type': 'object',
      },
      description='Get the temperature in degrees Celsius',
  ),
  ToolDefinition(
      name='temperature_fahrenheit',
      parameters_json_schema={
          'additionalProperties': False,
          'properties': {'city': {'type': 'string'}},
          'required': ['city'],
          'type': 'object',
      },
      description='Get the temperature in degrees Fahrenheit',
  ),
  ToolDefinition(
      name='weather_conditions',
      parameters_json_schema={
          'additionalProperties': False,
          'properties': {'city': {'type': 'string'}},
          'required': ['city'],
          'type': 'object',
      },
      description='Get the current weather conditions',
  ),
  ToolDefinition(
      name='current_time',
      parameters_json_schema={
          'additionalProperties': False,
          'properties': {},
          'type': 'object',
      },
      description='Get the current time',
  ),
]
"""
```

We're using [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) here because it makes it easy to see which tools were available on each run.

### Requiring Tool Approval

[`ApprovalRequiredToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.ApprovalRequiredToolset) wraps a toolset and lets you dynamically [require approval](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) for a given tool call based on a user-defined function that is passed the agent [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext), the tool's [`ToolDefinition`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition), and the validated tool call arguments. If no function is provided, all tool calls will require approval.

To easily chain different modifications, you can also call [`approval_required()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.approval_required) on any toolset instead of directly constructing a `ApprovalRequiredToolset`.

See the [Human-in-the-Loop Tool Approval](/docs/ai/tools-toolsets/deferred-tools#human-in-the-loop-tool-approval) documentation for more information on how to handle agent runs that call tools that require approval and how to pass in the results.

approval\_required\_toolset.py

```python
from pydantic_ai import Agent, DeferredToolRequests, DeferredToolResults
from pydantic_ai.models.test import TestModel

from prepared_toolset import prepared_toolset

approval_required_toolset = prepared_toolset.approval_required(lambda ctx, tool_def, tool_args: tool_def.name.startswith('temperature'))

test_model = TestModel(call_tools=['temperature_celsius', 'temperature_fahrenheit']) # (1)
agent = Agent(
  test_model,
  toolsets=[approval_required_toolset],
  output_type=[str, DeferredToolRequests],
)
result = agent.run_sync('Call the temperature tools')
messages = result.all_messages()
print(result.output)
"""
DeferredToolRequests(
  calls=[],
  approvals=[
      ToolCallPart(
          tool_name='temperature_celsius',
          args={'city': 'a'},
          tool_call_id='pyd_ai_tool_call_id__temperature_celsius',
      ),
      ToolCallPart(
          tool_name='temperature_fahrenheit',
          args={'city': 'a'},
          tool_call_id='pyd_ai_tool_call_id__temperature_fahrenheit',
      ),
  ],
  metadata={},
)
"""

result = agent.run_sync(
  message_history=messages,
  deferred_tool_results=DeferredToolResults(
      approvals={
          'pyd_ai_tool_call_id__temperature_celsius': True,
          'pyd_ai_tool_call_id__temperature_fahrenheit': False,
      }
  )
)
print(result.output)
#> {"temperature_celsius":21.0,"temperature_fahrenheit":"The tool call was denied."}
```

We're using [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) here because it makes it easy to specify which tools to call.

_(This example is complete, it can be run "as is")_

### Deferred Loading

[`DeferredLoadingToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.DeferredLoadingToolset) wraps a toolset and marks its tools for deferred loading, hiding them from the model until discovered via [tool search](/docs/ai/tools-toolsets/tools-advanced#tool-search). This is useful for large toolsets (e.g. MCP servers with many endpoints) where loading all tool definitions into the model's context would be wasteful.

[`FunctionToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.FunctionToolset) also accepts `defer_loading=True` in its constructor to mark all tools for deferred loading. For other toolsets, call [`.defer_loading()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.defer_loading) -- pass a list of tool names to hide only specific tools, or `None` (the default) to hide all.

deferred\_loading\_toolset.py

```python
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerHTTP

mcp = MCPServerHTTP('http://localhost:8000/mcp')
agent = Agent('openai:gpt-5.2', toolsets=[mcp.defer_loading()])
```

### Including Return Schemas

[`IncludeReturnSchemasToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.IncludeReturnSchemasToolset) wraps a toolset and sets `include_return_schema=True` on all its tools, causing the model to receive return type information. For models that natively support return schemas (e.g. Google Gemini), the schema is passed as a structured API field. For other models, it is injected into the tool description as JSON text.

To easily chain different modifications, you can also call [`.include_return_schemas()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.include_return_schemas) on any toolset instead of directly constructing an `IncludeReturnSchemasToolset`.

include\_return\_schemas\_toolset.py

```python
from pydantic_ai import Agent, FunctionToolset
from pydantic_ai.models.test import TestModel


def get_temperature(city: str) -> float:
    """Get the temperature for a city."""
    return 21.0


toolset = FunctionToolset(tools=[get_temperature])

test_model = TestModel()
agent = Agent(test_model, toolsets=[toolset.include_return_schemas()])
result = agent.run_sync('What is the temperature?')
params = test_model.last_model_request_parameters
assert params is not None
assert params.function_tools[0].include_return_schema is True
```

_(This example is complete, it can be run "as is")_

This is the toolset-level equivalent of the [`IncludeToolReturnSchemas`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.IncludeToolReturnSchemas) capability, which applies across all toolsets or a selected subset.

### Setting Tool Metadata

[`SetMetadataToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.SetMetadataToolset) wraps a toolset and merges metadata key-value pairs onto all its tools. This is useful for tagging tools with configuration that other capabilities or custom logic can inspect.

To easily chain different modifications, you can also call [`.with_metadata()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.with_metadata) on any toolset instead of directly constructing a `SetMetadataToolset`.

set\_metadata\_toolset.py

```python
from pydantic_ai import Agent, FunctionToolset
from pydantic_ai.models.test import TestModel


def search(query: str) -> str:
    """Search for information."""
    return f'Results for: {query}'


toolset = FunctionToolset(tools=[search])

test_model = TestModel()
agent = Agent(test_model, toolsets=[toolset.with_metadata(sensitive=True)])
result = agent.run_sync('Search for something')
params = test_model.last_model_request_parameters
assert params is not None
assert params.function_tools[0].metadata is not None
assert params.function_tools[0].metadata['sensitive'] is True
```

_(This example is complete, it can be run "as is")_

This is the toolset-level equivalent of the [`SetToolMetadata`](/docs/ai/api/pydantic-ai/capabilities/#pydantic_ai.capabilities.SetToolMetadata) capability, which applies across all toolsets or a selected subset.

### Changing Tool Execution

[`WrapperToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.WrapperToolset) wraps another toolset and delegates all responsibility to it.

It is a no-op by default, but you can subclass `WrapperToolset` to change the wrapped toolset's tool execution behavior by overriding the [`call_tool()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.call_tool) method.

logging\_toolset.py

```python
import asyncio

from typing_extensions import Any

from pydantic_ai import Agent, RunContext, ToolsetTool, WrapperToolset
from pydantic_ai.models.test import TestModel

from prepared_toolset import prepared_toolset

LOG = []

class LoggingToolset(WrapperToolset):
  async def call_tool(self, name: str, tool_args: dict[str, Any], ctx: RunContext, tool: ToolsetTool) -> Any:
      LOG.append(f'Calling tool {name!r} with args: {tool_args!r}')
      try:
          await asyncio.sleep(0.1 * len(LOG)) # (1)

          result = await super().call_tool(name, tool_args, ctx, tool)
          LOG.append(f'Finished calling tool {name!r} with result: {result!r}')
      except Exception as e:
          LOG.append(f'Error calling tool {name!r}: {e}')
          raise e
      else:
          return result


logging_toolset = LoggingToolset(prepared_toolset)

agent = Agent(TestModel(), toolsets=[logging_toolset]) # (2)
result = agent.run_sync('Call all the tools')
print(LOG)
"""
[
  "Calling tool 'temperature_celsius' with args: {'city': 'a'}",
  "Calling tool 'temperature_fahrenheit' with args: {'city': 'a'}",
  "Calling tool 'weather_conditions' with args: {'city': 'a'}",
  "Calling tool 'current_time' with args: {}",
  "Finished calling tool 'temperature_celsius' with result: 21.0",
  "Finished calling tool 'temperature_fahrenheit' with result: 69.8",
  'Finished calling tool 'weather_conditions' with result: "It's raining"',
  "Finished calling tool 'current_time' with result: datetime.datetime(...)",
]
"""
```

All docs examples are tested in CI and their their output is verified, so we need `LOG` to always have the same order whenever this code is run. Since the tools could finish in any order, we sleep an increasing amount of time based on which number tool call we are to ensure that they finish (and log) in the same order they were called in.

We use [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) here as it will automatically call each tool.

_(This example is complete, it can be run "as is")_

## External Toolset

If your agent needs to be able to call [external tools](/docs/ai/tools-toolsets/deferred-tools#external-tool-execution) that are provided and executed by an upstream service or frontend, you can build an [`ExternalToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.ExternalToolset) from a list of [`ToolDefinition`s](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.ToolDefinition) containing the tool names, arguments JSON schemas, and descriptions.

When the model calls an external tool, the call is considered to be ["deferred"](/docs/ai/tools-toolsets/deferred-tools#deferred-tools), and the agent run will end with a [`DeferredToolRequests`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolRequests) output object with a `calls` list holding [`ToolCallPart`s](/docs/ai/api/pydantic-ai/messages/#pydantic_ai.messages.ToolCallPart) containing the tool name, validated arguments, and a unique tool call ID, which are expected to be passed to the upstream service or frontend that will produce the results.

When the tool call results are received from the upstream service or frontend, you can build a [`DeferredToolResults`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.DeferredToolResults) object with a `calls` dictionary that maps each tool call ID to an arbitrary value to be returned to the model, a [`ToolReturn`](/docs/ai/tools-toolsets/tools-advanced#advanced-tool-returns) object, or a [`ModelRetry`](/docs/ai/api/pydantic-ai/exceptions/#pydantic_ai.exceptions.ModelRetry) exception in case the tool call failed and the model should [try again](/docs/ai/tools-toolsets/tools-advanced#tool-retries). This `DeferredToolResults` object can then be provided to one of the agent run methods as `deferred_tool_results`, alongside the original run's [message history](/docs/ai/core-concepts/message-history).

Note that you need to add `DeferredToolRequests` to the `Agent`'s or `agent.run()`'s [`output_type`](/docs/ai/core-concepts/output#structured-output) so that the possible types of the agent run output are correctly inferred. For more information, see the [Deferred Tools](/docs/ai/tools-toolsets/deferred-tools#deferred-tools) documentation.

To demonstrate, let us first define a simple agent _without_ deferred tools:

deferred\_toolset\_agent.py

```python
from pydantic import BaseModel

from pydantic_ai import Agent, FunctionToolset

toolset = FunctionToolset()


@toolset.tool_plain
def get_default_language():
    return 'en-US'


@toolset.tool_plain
def get_user_name():
    return 'David'


class PersonalizedGreeting(BaseModel):
    greeting: str
    language_code: str


agent = Agent('openai:gpt-5.2', toolsets=[toolset], output_type=PersonalizedGreeting)

result = agent.run_sync('Greet the user in a personalized way')
print(repr(result.output))
#> PersonalizedGreeting(greeting='Hello, David!', language_code='en-US')
```

Next, let's define a function that represents a hypothetical "run agent" API endpoint that can be called by the frontend and takes a list of messages to send to the model, a list of frontend tool definitions, and optional deferred tool results. This is where `ExternalToolset`, `DeferredToolRequests`, and `DeferredToolResults` come in:

deferred\_toolset\_api.py

```python
from pydantic_ai import (
  DeferredToolRequests,
  DeferredToolResults,
  ExternalToolset,
  ModelMessage,
  ToolDefinition,
)

from deferred_toolset_agent import PersonalizedGreeting, agent


def run_agent(
  messages: list[ModelMessage] = [],
  frontend_tools: list[ToolDefinition] = {},
  deferred_tool_results: DeferredToolResults | None = None,
) -> tuple[PersonalizedGreeting | DeferredToolRequests, list[ModelMessage]]:
  deferred_toolset = ExternalToolset(frontend_tools)
  result = agent.run_sync(
      toolsets=[deferred_toolset], # (1)
      output_type=[agent.output_type, DeferredToolRequests], # (2)
      message_history=messages, # (3)
      deferred_tool_results=deferred_tool_results,
  )
  return result.output, result.new_messages()
```

As mentioned in the [Deferred Tools](/docs/ai/tools-toolsets/deferred-tools#deferred-tools) documentation, these `toolsets` are additional to those provided to the `Agent` constructor

As mentioned in the [Deferred Tools](/docs/ai/tools-toolsets/deferred-tools#deferred-tools) documentation, this `output_type` overrides the one provided to the `Agent` constructor, so we have to make sure to not lose it

We don't include an `user_prompt` keyword argument as we expect the frontend to provide it via `messages`

Now, imagine that the code below is implemented on the frontend, and `run_agent` stands in for an API call to the backend that runs the agent. This is where we actually execute the deferred tool calls and start a new run with the new result included:

deferred\_tools.py

```python
from pydantic_ai import (
  DeferredToolRequests,
  DeferredToolResults,
  ModelMessage,
  ModelRequest,
  ModelRetry,
  ToolDefinition,
  UserPromptPart,
)

from deferred_toolset_api import run_agent

frontend_tool_definitions = [
  ToolDefinition(
      name='get_preferred_language',
      parameters_json_schema={'type': 'object', 'properties': {'default_language': {'type': 'string'}}},
      description="Get the user's preferred language from their browser",
  )
]

def get_preferred_language(default_language: str) -> str:
  return 'es-MX' # (1)

frontend_tool_functions = {'get_preferred_language': get_preferred_language}

messages: list[ModelMessage] = [
  ModelRequest(
      parts=[
          UserPromptPart(content='Greet the user in a personalized way')
      ]
  )
]

deferred_tool_results: DeferredToolResults | None = None

final_output = None
while True:
  output, new_messages = run_agent(messages, frontend_tool_definitions, deferred_tool_results)
  messages += new_messages

  if not isinstance(output, DeferredToolRequests):
      final_output = output
      break

  print(output.calls)
  """
  [
      ToolCallPart(
          tool_name='get_preferred_language',
          args={'default_language': 'en-US'},
          tool_call_id='pyd_ai_tool_call_id',
      )
  ]
  """
  deferred_tool_results = DeferredToolResults()
  for tool_call in output.calls:
      if function := frontend_tool_functions.get(tool_call.tool_name):
          result = function(**tool_call.args_as_dict())
      else:
          result = ModelRetry(f'Unknown tool {tool_call.tool_name!r}')
      deferred_tool_results.calls[tool_call.tool_call_id] = result

print(repr(final_output))
"""
PersonalizedGreeting(greeting='Hola, David! Espero que tengas un gran día!', language_code='es-MX')
"""
```

Imagine that this returns the frontend [`navigator.language`](https://developer.mozilla.org/en-US/docs/Web/API/Navigator/language).

_(This example is complete, it can be run "as is")_

## Dynamically Building a Toolset

Toolsets can be built dynamically ahead of each agent run or run step using a function that takes the agent [run context](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) and returns a toolset or `None`. This is useful when a toolset (like an MCP server) depends on information specific to an agent run, like its [dependencies](/docs/ai/core-concepts/dependencies).

To register a dynamic toolset, you can pass a function that takes [`RunContext`](/docs/ai/api/pydantic-ai/tools/#pydantic_ai.tools.RunContext) to the `toolsets` argument of the `Agent` constructor, or you can wrap a compliant function in the [`@agent.toolset`](/docs/ai/api/pydantic-ai/agent/#pydantic_ai.agent.Agent.toolset) decorator.

By default, the function will be called again ahead of each agent run step. If you are using the decorator, you can optionally provide a `per_run_step=False` argument to indicate that the toolset only needs to be built once for the entire run.

dynamic\_toolset.py

```python
from dataclasses import dataclass
from typing import Literal

from pydantic_ai import Agent, RunContext
from pydantic_ai.models.test import TestModel

from function_toolset import datetime_toolset, weather_toolset


@dataclass
class ToggleableDeps:
  active: Literal['weather', 'datetime']

  def toggle(self):
      if self.active == 'weather':
          self.active = 'datetime'
      else:
          self.active = 'weather'

test_model = TestModel()  # (1)
agent = Agent(
  test_model,
  deps_type=ToggleableDeps  # (2)
)

@agent.toolset
def toggleable_toolset(ctx: RunContext[ToggleableDeps]):
  if ctx.deps.active == 'weather':
      return weather_toolset
  else:
      return datetime_toolset

@agent.tool
def toggle(ctx: RunContext[ToggleableDeps]):
  ctx.deps.toggle()

deps = ToggleableDeps('weather')

result = agent.run_sync('Toggle the toolset', deps=deps)
print([t.name for t in test_model.last_model_request_parameters.function_tools])  # (3)
#> ['toggle', 'now']

result = agent.run_sync('Toggle the toolset', deps=deps)
print([t.name for t in test_model.last_model_request_parameters.function_tools])
#> ['toggle', 'temperature_celsius', 'temperature_fahrenheit', 'conditions']
```

We're using [`TestModel`](/docs/ai/api/models/test/#pydantic_ai.models.test.TestModel) here because it makes it easy to see which tools were available on each run.

We're using the agent's dependencies to give the `toggle` tool access to the `active` via the `RunContext` argument.

This shows the available tools _after_ the `toggle` tool was executed, as the "last model request" was the one that returned the `toggle` tool result to the model.

_(This example is complete, it can be run "as is")_

## Building a Custom Toolset

To define a fully custom toolset with its own logic to list available tools and handle them being called, you can subclass [`AbstractToolset`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset) and implement the [`get_tools()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.get_tools) and [`call_tool()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.call_tool) methods.

You can also override the [`get_instructions()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.get_instructions) method to provide a description of how to use the toolset's tools. This will be injected into the agent's instructions and is useful for helping the model understand how to effectively use your toolset's tools.

Tip

If your toolset also needs to provide model settings or hooks, consider building a [custom capability](/docs/ai/core-concepts/capabilities#building-custom-capabilities) instead.

The toolset lifecycle provides hooks for managing state at different scopes:

-   [`for_run()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.for_run): Called once before each agent run. Return a fresh instance for per-run state isolation (e.g. resetting counters, creating a new session). The framework enters and exits the returned instance.
-   [`for_run_step()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.for_run_step): Called at the start of each run step. Return a modified instance for per-step state transitions. If managing inner toolset transitions (e.g. swapping one toolset for another), you are responsible for the inner lifecycle (exiting the old, entering the new).
-   [`__aenter__()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.__aenter__) and [`__aexit__()`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.__aexit__): Set up and tear down resources (e.g. network connections) that should live for the duration of the agent run.

### Per-run and per-step lifecycle

Toolsets support lifecycle hooks for per-run isolation and per-step state management:

-   [`for_run(ctx)`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.for_run) -- called once per agent run, before `__aenter__`. Return a fresh instance to isolate state between runs. Default: returns `self`.
-   [`for_run_step(ctx)`](/docs/ai/api/pydantic-ai/toolsets/#pydantic_ai.toolsets.AbstractToolset.for_run_step) -- called at the start of each run step. Manage internal transitions (e.g. refreshing tool availability) in-place. Default: returns `self`.

## Third-Party Toolsets

Third-party toolsets can also be wrapped as [capabilities](/docs/ai/core-concepts/capabilities), which bundle tools with hooks, instructions, and model settings. See [Extensibility](/docs/ai/guides/extensibility) for the full ecosystem.

### MCP Servers

Pydantic AI provides two toolsets that allow an agent to connect to and call tools on local and remote MCP Servers:

1.  `MCPServer`: the [MCP SDK-based Client](/docs/ai/mcp/client) which offers more direct control by leveraging the MCP SDK directly
2.  `FastMCPToolset`: the [FastMCP-based Client](/docs/ai/mcp/fastmcp-client) which offers additional capabilities like Tool Transformation, simpler OAuth configuration, and more.

### Agent Skills

Toolsets that implement [Agent Skills](https://agentskills.io) support so agents can efficiently discover and perform specific tasks:

-   [`pydantic-ai-skills`](https://github.com/DougTrajano/pydantic-ai-skills) - `SkillsToolset` implements Agent Skills support with progressive disclosure (load skills on-demand to reduce tokens). Supports filesystem and programmatic skills; compatible with [agentskills.io](https://agentskills.io).

### Task Management

Toolsets for task planning and progress tracking help agents organize complex work and provide visibility into agent progress:

-   [`pydantic-ai-todo`](https://github.com/vstorm-co/pydantic-ai-todo) - `TodoToolset` with `read_todos` and `write_todos` tools. Included in the third-party [`pydantic-deep`](https://github.com/vstorm-co/pydantic-deepagents) [deep agent](/docs/ai/guides/multi-agent-applications#deep-agents) framework.

### File Operations

Toolsets for file operations help agents read, write, and edit files:

-   [`pydantic-ai-filesystem-sandbox`](https://github.com/zby/pydantic-ai-filesystem-sandbox) - `FileSystemToolset` with a sandbox and LLM-friendly errors
-   [`pydantic-deep`](https://github.com/vstorm-co/pydantic-deepagents) -- Deep agent framework that includes a `FilesystemToolset` with multiple backends (in-memory, real filesystem, Docker sandbox).

### Code Execution

Toolsets for sandboxed code execution help agents run code in a sandboxed environment:

-   [`mcp-run-python`](https://github.com/pydantic/mcp-run-python) - MCP server by the Pydantic team that runs Python code in a sandboxed environment. Can be used as `MCPServerStdio('uv', args=['run', 'mcp-run-python', 'stdio'])`.

### LangChain Tools

If you'd like to use tools or a [toolkit](https://python.langchain.com/docs/concepts/tools/#toolkits) from LangChain's [community tool library](https://python.langchain.com/docs/integrations/tools/) with Pydantic AI, you can use the [`LangChainToolset`](/docs/ai/api/pydantic-ai/ext/#pydantic_ai.ext.langchain.LangChainToolset) which takes a list of LangChain tools. Note that Pydantic AI will not validate the arguments in this case -- it's up to the model to provide arguments matching the schema specified by the LangChain tool, and up to the LangChain tool to raise an error if the arguments are invalid.

You will need to install the `langchain-community` package and any others required by the tools in question.

```python
from langchain_community.agent_toolkits import SlackToolkit

from pydantic_ai import Agent
from pydantic_ai.ext.langchain import LangChainToolset

toolkit = SlackToolkit()
toolset = LangChainToolset(toolkit.get_tools())

agent = Agent('openai:gpt-5.2', toolsets=[toolset])
# ...
```

### ACI.dev Tools

If you'd like to use tools from the [ACI.dev tool library](https://www.aci.dev/tools) with Pydantic AI, you can use the [`ACIToolset`](/docs/ai/api/pydantic-ai/ext/#pydantic_ai.ext.aci.ACIToolset) [toolset](/docs/ai/tools-toolsets/toolsets) which takes a list of ACI tool names as well as the `linked_account_owner_id`. Note that Pydantic AI will not validate the arguments in this case -- it's up to the model to provide arguments matching the schema specified by the ACI tool, and up to the ACI tool to raise an error if the arguments are invalid.

You will need to install the `aci-sdk` package, set your ACI API key in the `ACI_API_KEY` environment variable, and pass your ACI "linked account owner ID" to the function.

```python
import os

from pydantic_ai import Agent
from pydantic_ai.ext.aci import ACIToolset

toolset = ACIToolset(
    [
        'OPEN_WEATHER_MAP__CURRENT_WEATHER',
        'OPEN_WEATHER_MAP__FORECAST',
    ],
    linked_account_owner_id=os.getenv('LINKED_ACCOUNT_OWNER_ID'),
)

agent = Agent('openai:gpt-5.2', toolsets=[toolset])
```